In this post we’ll walk through the AWS services and features that enable canary deployments of Lambda functions. If you just want to deploy your functions safely and are not interested in the details, you can go straight to the Canary Deployments Serverless Plugin.

Deployment in a Serverless application is an all-at-once process: when we release a new version of any of our functions, every single user hits the new version. We must be really confident about the new version, because if anything goes wrong and the function contains an error, all of our users will be affected. However, AWS recently introduced a new feature that can make our deployment process much more reliable and secure: traffic shifting using aliases.

How can alias traffic shifting help us?

Usually, deploying a Lambda function means that all the function invocations will execute the new code, either because we are updating the $LATEST version or because we are pointing an alias to a new version. That means that if anything goes wrong, 100% of the invocations will fail and we must quickly roll back to the previous version or, what is even worse, we might not notice the bug and leave our system in an inconsistent state.

However, with the introduction of alias traffic shifting, we can now specify version weights on an alias, so that only a certain share of the traffic is routed to the new release instead of all of it. This means that Lambda will automatically load balance requests between the two versions, allowing us to check how the new release behaves before completely replacing the previous one, minimizing the impact of a possible bug.

Once we see that the new version behaves correctly, we can gradually increase its weight and, with it, the load it receives. We can do that in the AWS Console, through the CLI or with some open source tools that handle weight updates automatically, but there’s a better way to handle Lambda function deployments.
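As a rough illustration of what a manual weight update involves, the sketch below builds the arguments for the Lambda UpdateAlias operation as exposed by boto3 (`update_alias` with a `RoutingConfig`). The function name `my-function`, the alias `live` and the version numbers are made-up values, and `build_alias_update` is a hypothetical helper written for this example, not part of any library.

```python
def build_alias_update(function_name, alias, stable_version, new_version, new_weight):
    """Build keyword arguments for boto3's Lambda `update_alias` call.

    The alias keeps pointing at `stable_version`; `new_weight` (0.0 to 1.0)
    is the share of invocations routed to `new_version` instead.
    """
    if not 0.0 <= new_weight <= 1.0:
        raise ValueError("new_weight must be between 0.0 and 1.0")
    return {
        "FunctionName": function_name,
        "Name": alias,
        "FunctionVersion": stable_version,
        "RoutingConfig": {"AdditionalVersionWeights": {new_version: new_weight}},
    }


# Hypothetical values: route 10% of the traffic on alias "live" to version 2.
params = build_alias_update("my-function", "live", "1", "2", 0.1)
print(params)

# Applying the update needs AWS credentials, so it is left commented out:
# import boto3
# boto3.client("lambda").update_alias(**params)
```

Repeating the call with a larger weight shifts more traffic to the new version, which is exactly the chore we would rather not do by hand.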
Automating the deployment process

Even though being able to do traffic shifting is a huge leap forward, let’s admit that having to update weights manually (or deploying our own management system) is not really convenient. Don’t worry, as usual AWS has us covered. With CodeDeploy we can just specify how we want traffic to be shifted over time and it automatically adjusts the weights. There are three different types of deployment preferences:

- Canary: we specify the percentage of traffic we want to shift and the time we want the deployment to last. So, if we pass 10% and 30 minutes, for example, 10% of the traffic will be routed to the new version for half an hour. When that time has passed, all the traffic will be shifted to the new version.
- Linear: the amount of traffic routed to the new version is incremented according to the provided percentage and interval. So, if we configure it to shift an additional 10% of the traffic every 5 minutes, CodeDeploy will update the alias weights, adding 10% to the new version every 5 minutes until all the traffic has been shifted.
- All-at-once: all the traffic is shifted to the new version straight away.

This way, we let CodeDeploy do all the heavy lifting in the deployment process and change the alias weights according to our preferences. Traffic will be shifted gradually and, in the meantime, we can check whether our new function is behaving correctly and cancel the deployment if we see anything weird. But how could we automate the rollback process, so that we don’t have to manually check how the system is performing? CodeDeploy has thought about that as well.

Rolling back to the previous Lambda Function version

CodeDeploy allows us to configure a pre and a post traffic shifting hook, which are in fact Lambda functions that are triggered before and after the traffic shifting process. They’re suited for performing tasks like running integration tests.
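To give an idea of what such a hook can look like, here is a minimal sketch of a pre-traffic hook in Python. It assumes the hook event carries `DeploymentId` and `LifecycleEventHookExecutionId` fields and that the result is reported through the CodeDeploy `put_lifecycle_event_hook_execution_status` operation in boto3. `run_integration_tests` and `report_hook_result` are hypothetical names, and the former is only a placeholder for whatever checks make sense for the function being deployed.

```python
def run_integration_tests():
    """Hypothetical placeholder: return True when the new version looks healthy."""
    return True


def report_hook_result(event, codedeploy, checks=run_integration_tests):
    """Run the checks and tell CodeDeploy whether the hook succeeded."""
    try:
        status = "Succeeded" if checks() else "Failed"
    except Exception:
        # A crashing check is reported as a failure rather than left to time out.
        status = "Failed"
    codedeploy.put_lifecycle_event_hook_execution_status(
        deploymentId=event["DeploymentId"],
        lifecycleEventHookExecutionId=event["LifecycleEventHookExecutionId"],
        status=status,
    )
    return status


def handler(event, context):
    """Lambda entry point for the hook."""
    import boto3  # imported here so the helpers above can be exercised offline

    return report_hook_result(event, boto3.client("codedeploy"))
```

Passing the client in as an argument keeps the reporting logic testable with a stub, without AWS credentials.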
CodeDeploy expects to be notified about the success or failure of the hooks within one hour; otherwise it’ll assume they failed. In either case, whether the hook failed by not calling CodeDeploy or by explicitly reporting a failure, the deployment will be aborted and all the traffic will be shifted back to the old function version.

Hooks are not the only way we can check that our function is behaving as expected, since we can provide CodeDeploy with a list of CloudWatch Alarms to monitor the deployment process. As the traffic shifting begins, CodeDeploy will track those alarms, cancelling the deployment and rolling back to the previous function version if any of them is triggered.

Figure: Deployment process with CodeDeploy and Lambda weighted alias

Hooks and alarms allow us to monitor the whole deployment process. We can perform some tests before routing any traffic to the new function version, track alarms during the traffic shifting process and run more tests right after all the traffic is hitting the new version. If CodeDeploy notices something wrong in any of those steps, it will automatically roll back to the old, stable function version.

CloudFormation all the things

All that sounds really good, but how do we set it up? Well, doing it manually doesn’t seem to be a reasonable option, but fortunately AWS has an awesome service for defining and provisioning infrastructure: CloudFormation. This tool gives us a way to model our system in YAML or JSON templates, which we can use to create a collection of related AWS resources. As you can guess, the syntax of a template in which we can define almost any imaginable resource in the AWS ecosystem, along with its configuration, is intricate. AWS even built a simplified syntax to define Serverless applications, the Serverless Application Model, although it only supports a tiny subset of the AWS resources.
The best way to deal with CloudFormation in a Serverless environment is, hands down, the Serverless framework, which provides a nice and easy DSL that is then turned into a CloudFormation template, so that we don’t have to deal with its complexity. However, when the DSL falls short, we can still include chunks of CloudFormation template syntax to create the resources we need. It turns out that the framework has not implemented the canary deployments feature, so we’ll have to specify the resources ourselves. So, we’ll need to include the following:

- A CodeDeploy::Application.
- An IAM::Role with AWSCodeDeployRoleForLambda and AWSLambdaFullAccess permissions for CodeDeploy.
- A CodeDeploy::DeploymentGroup for every function, where we’ll specify the deployment preference type and alarms.
- A Lambda::Alias for every function including CodeDeployLambdaAliasUpdate, where we specify the CodeDeploy Application and DeploymentGroup it belongs to, and the associated hooks.

The Serverless Framework always triggers the $LATEST Lambda function version upon any event, so we must replace any reference to the function with the newly created alias in the event sources.

If this sounds like a lot of hassle… it’s just because it is. Luckily, the Serverless Framework is really modular and there are tons of plugins to complement its features, so you can use the Serverless Plugin Canary Deployments to create all those resources in a much more convenient way (Note: I’m the author of the plugin; any contribution, comment or feature request is welcome).

Happy safe deployments!

Reference:

- Traffic Shifting Using Aliases - AWS Lambda (docs.aws.amazon.com)
- Implementing Canary Deployments of AWS Lambda Functions with Alias Traffic Shifting | Amazon Web… (aws.amazon.com)
- Gradual Code Deployment - AWS Lambda (docs.aws.amazon.com)