One of the benefits of serverless is the pay-per-use pricing model you get from the platform. That is, if your code doesn’t run, you don’t pay for it!
Combined with the simplified deployment flow (compared with applications running in containers or VMs), it has enabled many teams to make use of temporary CloudFormation stacks.
In this post, let’s talk about two ways you should use temporary CloudFormation stacks, and why.
Disclaimer: this shouldn’t be taken as a prescription. It’s a general approach that has pros and cons, which we will discuss along the way.
It’s common for teams to have multiple AWS accounts, one for each environment. While there doesn’t seem to be a consensus on how to use these environments, I tend to follow these conventions:
dev is shared by the team. This is where the latest development changes are deployed and tested end-to-end. This environment is unstable by nature, and shouldn’t be used by other teams.
test is where other teams can integrate with your team’s work. This environment should be fairly stable so as not to slow down other teams.
staging should closely resemble production, and would often contain dumps of production data. This is where you can stress test your release candidate in a production-like environment.
production
Having one account per environment is considered best practice. Some of my AWS friends would go even further and have one AWS account per developer.
Personally, I have never felt the need to have one account per developer. After all, there is some overhead for having an AWS account. Instead, I usually settle for one AWS account per team per environment.
However, for a particular project, I would often have a separate deployment CloudFormation stack per developer (but in the same dev account). This is especially useful when I’m working on feature branches.
When I start work on a new feature, I’m still feeling my way towards the best solution for the problem. The codebase is still unstable and many bugs haven’t been ironed out yet. Deploying my half-baked changes to the dev environment can be quite disruptive:
Some of these challenges can be mitigated with feature toggles. Different feature branches can be live at the same time and only enabled for the developers working on them. Services like LaunchDarkly are great for this, even if using them with Lambda requires extra work.
But what do you do when you need to deploy and test some unfinished changes? That is, the change is not complete and not yet ready to be reviewed and merged back into master. Feature toggles won’t help you in this case.
Instead, I can deploy the feature branch to a dedicated environment, e.g. dev-my-feature. Using the Serverless framework, that is as easy as running the command sls deploy -s dev-my-feature. This would deploy all the Lambda functions, API Gateway and any other related resources (DynamoDB, etc.) in their own CloudFormation stack. I would be able to test my work-in-progress feature in a live AWS environment.
Having these temporary CloudFormation stacks for each feature branch has negligible cost overhead. There is no traffic in the dev account since it’s only used by the team. When the developer is done with the feature, the temporary stack can be easily removed by running sls remove -s dev-my-feature.
However, these temporary stacks are an extension of your feature branch, so they exhibit the same problems as long-lived feature branches. Namely, they get out of sync with the other systems they need to integrate with, both in terms of the events coming into your function (such as the payloads from SQS, SNS or Kinesis) and the data your function depends on (such as the data that resides in DynamoDB).
While this isn’t a problem with serverless technologies per se, I find that teams move faster when they use serverless, which means the problems with long-lived feature branches become more prominent and noticeable as well.
Don’t leave feature branches hanging around for more than, say, a week. If the work is large and takes longer to implement, then break it up into smaller features. And when you’re working on a feature branch, integrate from master regularly (no less than once per day).
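The per-branch workflow described above (sls deploy -s dev-my-feature, then sls remove -s dev-my-feature) can be scripted so that the stage name always follows the current git branch. The following is a minimal sketch, not a definitive implementation: stage_name, deploy_branch_stack and remove_branch_stack are hypothetical helper names, and the dev- prefix and the way branch names are sanitised are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: one temporary CloudFormation stack per feature branch.
# All function names below are hypothetical helpers, not part of the Serverless framework.

# Turn a git branch name into a stage name, e.g. "my-feature" -> "dev-my-feature".
# Assumption: lowercase the name and collapse anything that is not a letter or
# digit into a single "-", so the result is safe to use in resource names.
stage_name() {
  local branch="$1"
  local slug
  slug=$(printf '%s' "$branch" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//')
  printf 'dev-%s\n' "$slug"
}

# Deploy the current branch to its own stage (and therefore its own stack).
deploy_branch_stack() {
  local branch
  branch=$(git rev-parse --abbrev-ref HEAD)
  sls deploy -s "$(stage_name "$branch")"
}

# Remove the temporary stack once the feature is done.
remove_branch_stack() {
  local branch
  branch=$(git rev-parse --abbrev-ref HEAD)
  sls remove -s "$(stage_name "$branch")"
}
```

With these functions loaded in a shell, running deploy_branch_stack from a feature branch deploys to the derived stage, and remove_branch_stack tears the stack down again when the branch is merged.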
Instead of spending lots of time getting tools such as localstack to work, I find it much more productive to deploy a temporary CloudFormation stack in AWS and run against the real thing.
The main downsides are that you need internet access and that the feedback loop is slower.
The internet access argument is only relevant for a handful of people who spend most of their time on the road. I travel more than most and do a lot of work while I’m at airports, and internet access is rarely a problem for me.
As for the slower feedback loop, it probably feels worse than it actually is. Most of my deployments take less than 30 seconds, but they do feel like an eternity when I’m staring at the screen and waiting for them to finish. To compensate for the slower feedback loop, I also use tests as well as sls invoke local to run my functions locally while talking to the real AWS services.
Speaking of testing, another common use of temporary CloudFormation stacks is for running end-to-end tests.
One of the common problems with these tests is that you need to insert test data into a live, shared AWS environment. As a rule of thumb I always:
These help keep my tests robust and self-contained as they don’t implicitly rely on data to exist. They also help reduce the amount of junk that is floating around in the shared dev environment.
However, despite our best intentions, mistakes happen and sometimes we deliberately cut corners to gain agility in the short term. Over time, these shared environments fill up with test data, which at times interferes with normal operations.
As a countermeasure, many teams employ cron jobs to wipe these environments from time to time.
An emerging practice to combat these challenges is to create a temporary CloudFormation stack during the CI/CD pipeline instead. The temporary stack is used to execute the end-to-end tests and destroyed afterwards.
This way, there is no need to clean up test data, either as part of your test fixture or with cron jobs.
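As a rough illustration of the pipeline step described above, the sketch below deploys a throwaway stack, runs the end-to-end tests against it, and removes the stack whether or not the tests pass. It makes several assumptions: ci_stage_name and run_e2e_on_temp_stack are hypothetical helpers, the ci- prefix and build ID argument are illustrative, and npm run test:e2e with a TEST_STAGE environment variable stands in for whatever command and configuration your project uses to run its tests.

```shell
#!/usr/bin/env bash
# Sketch: temporary stack for end-to-end tests in a CI/CD pipeline.
# Function names, the "ci-" prefix, TEST_STAGE and "npm run test:e2e" are assumptions.

# Build a unique stage name from the CI build ID, e.g. "1234" -> "ci-1234".
ci_stage_name() {
  printf 'ci-%s\n' "$1"
}

# Deploy, test, and always clean up. Returns the status of the failing step
# (deploy or tests), or 0 if both succeeded.
run_e2e_on_temp_stack() {
  local stage status=0
  stage=$(ci_stage_name "$1")

  if sls deploy -s "$stage"; then
    # Tell the (assumed) test suite which stage to run against.
    TEST_STAGE="$stage" npm run test:e2e || status=$?
  else
    status=$?
  fi

  # Remove the stack even when the deploy or the tests failed,
  # so no test data or half-created resources are left behind.
  sls remove -s "$stage" || echo "warning: failed to remove stage $stage" >&2

  return "$status"
}
```

A pipeline step could then call run_e2e_on_temp_stack with the build ID supplied by the CI system and fail the build on a non-zero exit status.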
You should weigh the benefits of this approach against the delay it adds to your CI/CD pipeline and decide if it’s right for a project. Personally, I think it’s a great approach and I’d encourage more teams to adopt it. However, the more data you have to clean up in external systems (that is, systems that are not provisioned as part of the temporary stack) the less useful it becomes.
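In summary, here are two ways you can use temporary CloudFormation stacks to improve your development flow for serverless applications: deploy one for each feature branch while the work is in progress, and create one during the CI/CD pipeline to run your end-to-end tests.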
These approaches should not be considered a prescription. You need to consider their pros and cons and see whether they fit with your constraints and how your teams work.
Hi, my name is Yan Cui. I’m an AWS Serverless Hero and the author of Production-Ready Serverless. I have run production workload at scale in AWS for nearly 10 years and I have been an architect or principal engineer with a variety of industries ranging from banking, e-commerce, sports streaming to mobile gaming. I currently work as an independent consultant focused on AWS and serverless.
Originally published at https://theburningmonk.com on September 12, 2019.