thundra.io is a monitoring tool for AWS Lambda that provides metrics and tracing for deployed Lambdas. It also has a keep-warm feature to limit the number of cold starts in a serverless web application. Thundra is not the only tool or framework that implements keep-warm: it is also available with serverless and Zappa. As described in this article, keep-warm is not intended to avoid every cold start but to minimize their impact on a serverless web application. Thundra's approach to keep-warm relies on tuning two parameters: the number of Lambdas kept warm in the pool, and the interval between the calls that keep those functions warm. But as said previously, you cannot precisely control the number of deployed instances of your Lambda. You can only try to keep several instances warm, without real control over how they are actually deployed.
By default, Thundra tries to create a pool of 8 Lambdas and sends an event every 5 minutes to keep the pool warm.
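To get a feel for what these defaults imply, here is a small back-of-the-envelope sketch. It assumes one warm-up call per pool slot per interval, which is a simplification and not necessarily how Thundra counts invocations; the function name is made up for this example.

```python
def warmup_invocations_per_day(pool_size=8, interval_minutes=5):
    """Estimate the number of keep-warm calls made in 24 hours.

    Assumption: each warm-up round sends exactly one call per pool slot.
    """
    rounds_per_day = (24 * 60) // interval_minutes  # 288 rounds with the defaults
    return pool_size * rounds_per_day


print(warmup_invocations_per_day())  # 2304
```

With the defaults, keeping the pool warm costs roughly 2,300 extra invocations per day, which is worth keeping in mind when tuning the two parameters.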
To create a pool of Lambdas and keep them warm, the Lambda in charge of keep-warm must call the target Lambdas. The code of the Lambdas in the pool must handle keep-warm requests differently from regular requests. In this case, the code triggers a sleep so that the Lambda appears busy and the next request is routed to another Lambda in the pool. By retrieving the id of each Lambda that answers, it is possible to check that all the Lambdas in the pool are kept warm.
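The mechanism described above can be sketched as follows. This is a minimal illustration and not Thundra's actual implementation: the `keep_warm` event key, the `sleep_seconds` field, the `handle_request` helper and the `warm_pool` function are all hypothetical names chosen for this example, and `invoke` stands for whatever you use to call the target Lambda synchronously.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

# Generated once per container, so it identifies the instance that answered.
INSTANCE_ID = str(uuid.uuid4())

WARMUP_KEY = "keep_warm"      # hypothetical marker for keep-warm events
WARMUP_SLEEP_SECONDS = 0.1    # how long an instance pretends to be busy


def handle_request(event):
    """Placeholder for the real business logic (assumption)."""
    return "hello"


def handler(event, context=None):
    """Target Lambda: short-circuit keep-warm events, serve everything else."""
    if isinstance(event, dict) and event.get(WARMUP_KEY):
        # Stay busy for a moment so that concurrent keep-warm calls
        # are routed to other instances of the pool.
        time.sleep(event.get("sleep_seconds", WARMUP_SLEEP_SECONDS))
        return {"warmed": True, "instance_id": INSTANCE_ID}
    return {"warmed": False, "instance_id": INSTANCE_ID,
            "body": handle_request(event)}


def warm_pool(invoke, pool_size=8):
    """Keep-warm Lambda: send pool_size concurrent calls, collect instance ids.

    `invoke` is any callable that takes a payload and returns the
    target Lambda's response as a dict.
    """
    payload = {WARMUP_KEY: True}
    with ThreadPoolExecutor(max_workers=pool_size) as executor:
        responses = list(executor.map(lambda _: invoke(payload), range(pool_size)))
    # The number of distinct ids tells how many instances were actually warmed.
    return {response["instance_id"] for response in responses}
```

Calling `warm_pool(handler)` locally always returns a single id, since everything runs in one process. Against a deployed function, the size of the returned set shows how many distinct instances answered, which can be compared with the requested pool size.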
With this configuration, for 5 concurrent calls of 5 requests, Thundra records 5 cold starts without keep-warm, while with keep-warm the Lambdas do not suffer any.
Here are the details of the invocations, with and without cold starts, using this technique and the Thundra tool.
Finally, cold starts have a significant impact on the users of serverless applications. They can be minimized with the keep-warm technique, but it still takes effort from developers to tune it. Tools increasingly automate the implementation of keep-warm.
At the time of writing, these solutions are more of a workaround than a real solution. Nothing is really mature around pools of Lambdas. On the other hand, the ecosystem is growing very fast, and new tools that help you work with Lambda appear almost every day. So we can expect more sustainable solutions in the very near future.
To conclude, and as a personal note, if you need a fixed number of Lambda instances as just described, it may be worth looking at other runtimes, especially if you already have many users. With Kubernetes for example (managed by a cloud provider or not), you can fine-tune the autoscaling mechanism to set a minimum and maximum number of containers running the same deployed code, configure your CPU and memory resources, and so on.