With serverless, you delegate the responsibility of running your infrastructure to a platform provider as much as possible. This frees your engineers to focus on building what your customers want from you — the features that differentiate your business from your competitors’. For this philosophy to work, however, the platform needs to not only give you the tools to build those features, but also deliver the performance that will keep engineers satisfied.
Yes, services such as AWS Lambda are powerful, but they have a number of well-documented limitations that make them unsuitable for some workloads. These include:
Enter Nuclio, an open source serverless platform that is built on top of Kubernetes and comes in both managed platform-as-a-service (PaaS) and self-hosted flavors. In this post, we will compare Nuclio with AWS Lambda, see how Nuclio addresses the above limitations and explore the new use cases Nuclio is unlocking, such as:
One of the biggest shifts from traditional application development to AWS Lambda is how concurrency is managed. An Express.js web application running in a container/VM can handle multiple requests concurrently. In other words, your application manages its own concurrency. You scale the application by increasing the number of concurrent requests it can handle (scaling up) and then increasing the number of containers (scaling out).
With AWS Lambda, concurrency is managed by the platform, and each concurrent execution processes only one request at a time, much like an actor in the actor model, which processes one message at a time. The application scales out by increasing the number of concurrent executions of the function.
Lambda’s concurrency model has its advantages:
Still, Lambda also has some disadvantages:
Nuclio, on the other hand, can scale up as well as out. A function processor (analogous to a container) can host multiple function workers. Each worker is then able to process one message at a time. This system supports concurrency within the same processor, and then autoscales the number of replicas of the processor based on load.
Figure 2: The architecture of a Nuclio function.
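To make the processor and worker relationship concrete, here is a minimal Python simulation. It is not Nuclio's implementation: the `Processor` class, its `num_workers` parameter and the `square` handler are hypothetical names chosen for illustration. It only shows the idea described above, in which several workers share one processor and each worker handles one event at a time.

```python
import queue
import threading


class Processor:
    """Toy stand-in for a function processor hosting several workers."""

    def __init__(self, handler, num_workers):
        self._handler = handler
        self._events = queue.Queue()
        self._results = []
        self._lock = threading.Lock()
        self._workers = [
            threading.Thread(target=self._work, daemon=True)
            for _ in range(num_workers)
        ]
        for worker in self._workers:
            worker.start()

    def _work(self):
        # Each worker pulls and handles exactly one event at a time.
        while True:
            event = self._events.get()
            if event is None:  # sentinel: shut this worker down
                self._events.task_done()
                return
            result = self._handler(event)
            with self._lock:
                self._results.append(result)
            self._events.task_done()

    def submit(self, event):
        self._events.put(event)

    def drain(self):
        """Wait for queued events, stop the workers, return sorted results."""
        for _ in self._workers:
            self._events.put(None)
        self._events.join()
        for worker in self._workers:
            worker.join()
        return sorted(self._results)


def square(event):
    # Hypothetical handler: the "function" each worker runs.
    return event * event
```

In this picture, scaling up means raising `num_workers` on one processor, and scaling out means running more `Processor` instances side by side.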
Another interesting difference between Lambda and Nuclio is that Nuclio lets you specify both the minimum and maximum number of replicas. And as an aside, Nuclio also supports GPUs.
Figure 3: Nuclio lets you configure both the minimum and maximum number of replicas.
With Lambda, you can configure a reserved concurrency for a function, which (counterintuitively) sets its max concurrent executions. But there is no way to tell the system to always keep a certain number of concurrent executions running at all times. The system always scales to zero when there is no traffic — behavior that is highly undesirable for systems that experience regular spikes in traffic.
For example, my employer, DAZN, is in the sports streaming business. We see huge spikes in traffic all the time, as millions of users flood in seconds before a sporting event starts.
Figure 4: Sports streaming platforms, such as DAZN, experience huge spikes in traffic just before a sporting event kicks off.
Without the ability to configure the minimum number of concurrent executions, these spikes result in large numbers of cold starts. Also, in these cases, Lambda has to scale out the number of concurrent executions quickly to meet the surge in demand. Here, we also run into the limit on how quickly Lambda is able to scale out: after the initial burst, it adds at most 500 concurrent executions per minute. As such, we are currently not able to use Lambda on the critical path of our system, which has to shoulder these spikes.
With Nuclio, you can programmatically update the minimum replica count prior to the event so that when the spikes come, you don’t have to worry about hitting scaling limits or enduring the performance hit from cold starts. You can also tell Nuclio to scale to zero immediately via API calls, which saves resources.
Figure 5: Adjusting the minimum number of replicas enables you to deal with predictable spikes in traffic.
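As a rough sketch of what "update the minimum replica count prior to the event" could look like, the Python below decides how many replicas to hold at a given time and builds the corresponding partial function spec. The function names, the `baseline`, `peak` and `warmup` parameters and their default values are all hypothetical. The `minReplicas` and `maxReplicas` keys are assumed to match the fields of a Nuclio function spec, so check them against the version you run. How the patch is sent to the platform is deliberately left out.

```python
from datetime import datetime, timedelta


def desired_min_replicas(now, event_start, event_end,
                         baseline=0, peak=50,
                         warmup=timedelta(minutes=30)):
    """Return the minimum replica count to hold at time `now`.

    Replicas are raised `warmup` before the event starts and dropped
    back to `baseline` once the event is over.
    """
    if event_start - warmup <= now <= event_end:
        return peak
    return baseline


def replica_patch(min_replicas, max_replicas):
    """Build a partial function spec carrying the replica bounds.

    The key names are an assumption about the function spec layout.
    """
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")
    return {"spec": {"minReplicas": min_replicas,
                     "maxReplicas": max_replicas}}
```

A scheduled job could call `desired_min_replicas` every few minutes and apply the resulting patch only when the value changes.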
Another major difference between Lambda and Nuclio is how timeouts are handled. With Lambda, an invocation can run for up to 15 minutes. While it’s possible to use recursive functions or AWS Step Functions to work around this limit, both approaches have their own problems. As such, Lambda functions are great for event-driven architectures and performing short, ephemeral tasks.
With Nuclio, there is no max execution time. The containers can keep on running, which in turn allows you to go beyond the constraints of the event-driven model. You can now turn a function into a long-running service, which unlocks some interesting use cases such as long-running ETL jobs and training machine learning (ML) models.
Both Lambda and Nuclio functions accept an event and context during an invocation.
Figure 6: Both Nuclio and Lambda functions accept an invocation event and context as arguments.
With Lambda, the context object is ephemeral and does not persist between invocations. If you want to persist data between invocations (e.g. static configurations, database connections), then you need to declare them as global variables outside of the function handler.
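The global-variable pattern looks roughly like this in a Python Lambda function. This is a minimal sketch: an in-memory `sqlite3` database stands in for a real database client, and `get_connection` and `_cold_starts` are hypothetical names used only to show that the connection is opened once and then reused.

```python
import sqlite3

# Module-level state is created once per execution environment and is
# reused by later invocations that land on the same environment.
_connection = None
_cold_starts = 0


def get_connection():
    global _connection, _cold_starts
    if _connection is None:
        # In-memory sqlite3 stands in for a real database client here.
        _connection = sqlite3.connect(":memory:")
        _cold_starts += 1
    return _connection


def handler(event, context):
    # Lambda's Python handler signature: event first, then context.
    conn = get_connection()
    (answer,) = conn.execute("SELECT ? + 1", (event["value"],)).fetchone()
    return {"answer": answer, "connections_opened": _cold_starts}
```

Invoking `handler` twice in the same process opens the connection only once, which is the behavior you rely on when a warm execution environment is reused.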
With Nuclio, the execution context itself is persisted between invocations and can be used to cache state. The context object also includes a built-in logger, which supports structured logging with JSON and four log levels: DEBUG, INFO, WARN and ERROR.
You can configure the default log level for a function and can even override this default log level per invocation — which is very useful for debugging. You also have the ability to export your logs to external services such as Elasticsearch.
Figure 7: With Nuclio, you can select the default log level for every function.
Figure 8: You can also override the default log level for an invocation.
In addition, there is a hook for context initialization, which is called before the first invocation of the function. This allows you to perform initialization logic before the container is put into active use and removes the dreaded cold start problem that Lambda suffers from.
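Here is a small sketch of how the initialization hook and the persisted context fit together in a Python function. It assumes the `init_context(context)` hook, the `handler(context, event)` signature and the `context.user_data` and `context.logger` attributes of the Nuclio Python runtime. The `make_fake_context` helper is a hypothetical stand-in so the code can be exercised locally without the platform, and the real `event.body` may arrive as bytes rather than a string.

```python
import logging
from types import SimpleNamespace


def init_context(context):
    # Runs once before the first event is handled: load anything
    # expensive here so invocations do not pay for it.
    context.user_data.config = {"greeting": "hello"}
    context.user_data.invocations = 0


def handler(context, event):
    # State stored on the context survives between invocations.
    context.user_data.invocations += 1
    context.logger.info("handling event")
    greeting = context.user_data.config["greeting"]
    count = context.user_data.invocations
    return f"{greeting}, {event.body} (call {count})"


def make_fake_context():
    """Local stand-in for the platform-provided context; testing only."""
    return SimpleNamespace(
        user_data=SimpleNamespace(),
        logger=logging.getLogger("fake-nuclio"),
    )
```

Calling `init_context` once and then `handler` repeatedly with the same fake context shows the invocation counter increasing while the config is loaded only once.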
With Lambda, functions only run when they are triggered by events. Furthermore, concurrent executions are garbage-collected when they have been idle for a few minutes. This behavior makes life difficult when you are working with relational database management systems (RDBMS) and other systems that require persistent connections. Indeed, a set of guidelines has been developed to help you avoid the many pitfalls of using Lambda with RDBMS. Furthermore, Lambda doesn’t allow you to use persistent data in standard file system mounts, which forces you to copy to/from object storage before working on large files such as images, logs, ML models, etc.
With Nuclio, however, you can use the context object to maintain persistent connections to databases. Since the context is maintained at the processor level, you get better reuse, as the connections are shared across invocations. You can also take care of other aspects of data-fetching with data bindings, including batching and caching, which helps improve the IO performance of the application. Additionally, you can mount volumes to functions, which is useful for working with ML models or TensorFlow, and you can mount Kubernetes secrets as volumes. Indeed, Nuclio supports all of the volume types that Kubernetes supports.
Both Lambda and Nuclio support popular languages such as Go, Node.js, Python, .NET Core, Java and Ruby. Also, both platforms offer a way for you to customize the execution runtime.
With Lambda, you can create a custom runtime and distribute it through AWS Lambda Layers. This lets you introduce additional language runtimes that are not natively supported by the Lambda platform. Several vendors have published runtimes for PHP, Rust, Erlang and Elixir, to name a few.
With Nuclio, you can run functions on your own Docker image (see Figure 9), which can come from a private image repository. This allows you to tailor the execution environment itself.
You can also support additional language runtimes through a Shell function, which allows you to handle invocation events with any executable binary. See this example for more details.
Figure 9: Nuclio lets you run functions on top of your own Docker image, which can come from a private image repository.
Lambda is supported by a wide range of event sources.
Nuclio ships with 13 triggers, including cron, HTTP, Kafka, Kinesis and RabbitMQ. Since Nuclio is open source, you can also write your own trigger for services that you want to integrate with and leverage other people’s contributions. The HTTP trigger gives you an easy way to integrate Nuclio with other event sources that support HTTP as a target, such as SNS. Also, since Nuclio supports the CNCF CloudEvents standard, it can arguably support many more event sources that are CloudEvents-compliant.
Nuclio triggers are all normalized to the same Event object. This removes the need to understand the specific event signature for each event source, which is often confusing when working with Lambda. It also makes it easy to switch a function between different triggers.
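The sketch below illustrates why a normalized event makes a handler portable across triggers. It assumes the event exposes `body` and `trigger.kind`, as the Nuclio Python SDK does in the versions I am aware of. The `order_id` payload and the `fake_event` helper are hypothetical and exist only so the example runs locally.

```python
import json
from types import SimpleNamespace


def handler(context, event):
    """Parse an order from the event body, whatever trigger delivered it."""
    body = event.body
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    if isinstance(body, str):
        body = json.loads(body)
    # The same code path serves HTTP, Kafka or any other trigger.
    return {"order_id": body["order_id"], "via": event.trigger.kind}


def fake_event(body, kind):
    """Local stand-in for a platform event; testing only."""
    return SimpleNamespace(body=body, trigger=SimpleNamespace(kind=kind))
```

The handler never inspects a trigger-specific envelope, so moving it from an HTTP trigger to a Kafka trigger needs no code change in this sketch.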
With Lambda, you need to use API Gateway to create an HTTP endpoint for your functions. API Gateway is a feature-rich service, but it’s also complicated and often costs more to run than the Lambda invocations themselves. It also adds another source of latency to your API: at a minimum around 5ms to 10ms, with spikes of more than 100ms. This cost and latency overhead makes it unsuitable for applications with high throughput or hard real-time requirements.
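A back-of-the-envelope calculation shows how API Gateway can end up costing more than the invocations it fronts. The prices in this sketch are assumptions based on the published us-east-1 list prices around the time of writing (REST API requests, Lambda requests and Lambda compute). They ignore free tiers, data transfer and volume discounts, so check current pricing before relying on the numbers.

```python
# Assumed list prices in USD; verify against current AWS pricing.
API_GATEWAY_PER_MILLION = 3.50        # REST API requests
LAMBDA_REQUESTS_PER_MILLION = 0.20    # Lambda request charge
LAMBDA_PER_GB_SECOND = 0.0000166667   # Lambda compute charge


def lambda_cost_per_million(memory_mb, duration_ms):
    """Cost of one million invocations at the given size and duration."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    compute = gb_seconds * LAMBDA_PER_GB_SECOND * 1_000_000
    return LAMBDA_REQUESTS_PER_MILLION + compute


def api_gateway_share(memory_mb, duration_ms):
    """Fraction of the combined bill that goes to API Gateway."""
    lambda_cost = lambda_cost_per_million(memory_mb, duration_ms)
    return API_GATEWAY_PER_MILLION / (API_GATEWAY_PER_MILLION + lambda_cost)
```

Under these assumptions, a 128 MB function that runs for 100 ms costs roughly $0.41 per million invocations, against $3.50 per million for API Gateway, so API Gateway accounts for close to 90% of the combined bill.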
With Nuclio, every function gets a private HTTP interface by default. To expose the endpoint publicly, you need to specify an HTTP trigger similar to API Gateway. However, unlike API Gateway, this HTTP trigger does not incur extra costs and has minimal latency overhead.
Figure 10: With Nuclio, every function gets a private HTTP interface by default.
As you can see from our comparison, Nuclio differs from Lambda in a number of important areas, such as its concurrency model and its lack of a maximum execution time. This opens the door to a whole range of use cases. Let’s take a moment to look at a few.
For a high-throughput API, the cost of both API Gateway and Lambda is drastically higher than an equivalent application running in containers. This cost discrepancy has led many to rewrite their applications.
Nuclio’s concurrency model makes more efficient use of available resources, which significantly reduces operational cost when running at scale. Functions have built-in HTTP interfaces so you also don’t need to pay for an expensive API Gateway service.
The ability to mount a volume to your function also lets you read and write data to/from a mounted volume at high speed and enables building stateful applications. Again, you can’t do this with Lambda today, as it doesn’t allow you to attach Amazon Elastic File System (Amazon EFS) volumes to functions.
These features make it economically feasible to run high-throughput and high-performance APIs on Nuclio functions. The same can also be said about high-throughput data processing pipelines that have to process multiple terabytes of data per hour.
Nuclio gives you the ability to configure a minimum number of replicas. This allows systems that experience predictable spikes in traffic to reserve sufficient resources ahead of time and avoid degrading the user experience when the spikes happen. Examples include food ordering services such as Just Eat or Deliveroo or sports streaming services such as DAZN.
At the same time, Nuclio still lets you scale to zero by setting the minimum number of replicas to zero. It gives you the control to optimize for resource usage or latency, depending on the situation.
Nuclio does not suffer from cold starts and is able to deliver consistent, sub-millisecond response times on invocations. This makes it a suitable solution for applications with hard real-time requirements, such as multiplayer games and real-time bots.
To serve ML models in real time, you need both low, predictable API latency and the ability to load and work with ML models that are often large (gigabytes in size).
Nuclio lets you attach Kubernetes volumes to your functions, and there are no size limits on these volumes. Combined with the performance characteristics, it’s possible to implement these demanding workloads with Nuclio functions.
Nuclio also has native integration with Jupyter, and lets you automatically deploy Jupyter notebooks and ML models as Nuclio functions.
Nuclio does not impose a max execution timeout on function invocations. This allows you to turn your functions into long-running services and perform long-running ETL jobs.
Nuclio’s init_context() hook lets you create long-running services, such as an application that constantly reads off a Twitter feed (e.g. it starts a TwythonStreamer from the context, which polls an external service rather than being triggered by an event).
As we discussed, Nuclio is one of the few open source solutions that are also business-viable. It has key architectural advantages for high-performance or data-driven applications, including:
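The shape of such a service can be sketched as a background poller started from `init_context()`. This is an illustration under stated assumptions rather than a recommended implementation: `fetch_next` is a hypothetical stand-in for reading from an external feed (it is not the Twython API), `POLL_INTERVAL_SECONDS` is an arbitrary value, and whether a background thread suits your runtime and worker configuration is something to verify.

```python
import itertools
import threading
from types import SimpleNamespace

POLL_INTERVAL_SECONDS = 0.01  # hypothetical; tune for the real feed


def fetch_next(counter):
    """Hypothetical stand-in for reading one item from an external feed."""
    return f"item-{next(counter)}"


def init_context(context):
    # Start a background poller that outlives any single invocation.
    data = context.user_data
    data.items = []
    data.stop = threading.Event()
    counter = itertools.count()

    def poll():
        while not data.stop.is_set():
            data.items.append(fetch_next(counter))
            # Sleep between polls, but wake early if asked to stop.
            data.stop.wait(POLL_INTERVAL_SECONDS)

    data.thread = threading.Thread(target=poll, daemon=True)
    data.thread.start()


def handler(context, event):
    # Invocations only report on what the poller has collected so far.
    return {"collected": len(context.user_data.items)}
```

The work here is driven by the poller rather than by incoming events. The handler becomes a status endpoint, and setting `stop` shuts the poller down cleanly.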
If you are looking for a serverless platform that is not tied to a specific cloud provider, Nuclio is worth a look.
Hi, my name is Yan Cui. I’m an AWS Serverless Hero and the author of Production-Ready Serverless. I have run production workloads at scale in AWS for nearly 10 years, and I have been an architect or principal engineer in a variety of industries, ranging from banking, e-commerce and sports streaming to mobile gaming. I currently work as an independent consultant focused on AWS and serverless.
You can contact me via Email, Twitter and LinkedIn.
Originally published at theburningmonk.com on April 4, 2019.