Daily DynamoDB Backups With Serverless
During re:Invent 2017, Amazon announced a cool feature: on-demand backups of DynamoDB tables. In this post, I will go over how to set up regular automated backups, using the serverless framework.
TL;DR: Install the serverless package for scheduled DynamoDB backups.
Previously, backing up DynamoDB tables was resource- and time-consuming. Depending on the solution, it involved running custom programs on EC2 infrastructure, or even spawning Elastic MapReduce x-large instances.
Now, with the click of a button, you can get an almost instantaneous backup of petabytes of data. Amazon is able to do this by taking snapshots of the tables (if you are curious, you can read about snapshot technologies).
On-Demand Backups are encrypted with an Amazon-managed key. They include:
- the table data;
- the provisioned capacity settings;
- and the settings for Local and Global Secondary Indexes.
However, they do not include:
- Auto Scaling settings;
- TTL settings;
- IAM policies;
- and CloudWatch metrics and alarms.
Be aware that restoring a backup takes longer: anywhere from 30 minutes to several hours for very large tables.
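To see how simple on-demand backups are programmatically, here is a minimal sketch using boto3's `create_backup` call. The function name and the timestamp-based naming scheme are my own illustration, not part of the package described below; the `client` parameter exists so the function can be exercised without real AWS credentials.

```python
from datetime import datetime, timezone

def backup_table(table_name, client=None):
    """Create an on-demand backup named <table>-<UTC timestamp>."""
    if client is None:
        import boto3  # lazy import: assumes boto3 and AWS credentials are configured
        client = boto3.client("dynamodb")
    backup_name = f"{table_name}-{datetime.now(timezone.utc):%Y%m%d%H%M%S}"
    resp = client.create_backup(TableName=table_name, BackupName=backup_name)
    return resp["BackupDetails"]["BackupArn"]
```

The backup completes almost instantly regardless of table size, since it is snapshot-based rather than a scan.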
The Serverless Framework
Setting up Lambdas, IAM permissions, and CloudWatch settings can be a pain. Fortunately, the Serverless framework makes it a breeze!
Focus on your application, not your infrastructure.
Let’s go ahead and install the serverless framework:
npm install -g serverless
Depending on your installation, you might need to run this with sudo.
Then install the code:
The trickiest part is setting up the AWS permissions. First, you need to create an IAM policy with the right permissions for serverless. Edit the code from the file
aws-profile.json and replace
<account> with your actual account number. The easiest way to find your account number is to open the Support Center from the top right of the console;
your account number will show up at the top of the page.
Go to the IAM console, and create a new policy with this JSON content. Name it as you wish (like
serverless-admin) and attach this policy to your user.
If you don’t have your access key and secret set up, generate them through IAM and install them on your machine by creating a new profile:
aws configure --profile <account_name>
The dynamodb-backup-scheduler package is easy to configure. Copy the file
env.yml and edit it; the file is self-explanatory. You specify the list of tables you want to back up, and how long to keep the backups.
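The retention setting implies a pruning step: on each run, backups older than the configured window are deleted. The package's actual implementation may differ; here is a sketch of the idea, where `backups` follows the shape of DynamoDB's `ListBackups` summaries and the function and parameter names are illustrative.

```python
from datetime import datetime, timedelta, timezone

def backups_to_delete(backups, name_prefix, retention_days, now=None):
    """Return the backups whose name matches `name_prefix` and whose
    creation time falls outside the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [
        b for b in backups
        if b["BackupName"].startswith(name_prefix)
        and b["BackupCreationDateTime"] < cutoff
    ]
```

Filtering on the name prefix matters: it keeps the scheduler from deleting manual backups you created yourself in the console.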
We’re all set to get going:
serverless deploy -v
Watch the output as serverless creates the Lambda function, its IAM role, and the CloudWatch event for you.
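Under the hood, the daily trigger is just a CloudWatch scheduled event attached to the function. In serverless.yml it looks roughly like this (the handler name and schedule below are illustrative; check the package's own serverless.yml for the real values):

```yaml
functions:
  backupDynamoDBTables:
    handler: handler.backupDynamoDBTables
    events:
      - schedule: cron(0 4 * * ? *)   # every day at 04:00 UTC
```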
That’s it! Your scheduled backups are set up. If you’re like me, though, you will want to test the Lambda function now rather than wait for the daily event to fire.
Testing the Lambda
From the Lambda console, click on the function that was just created. It will have a name like
dynamodb-backup-scheduler-production-backupDynamoDBTables. Near the top, create a new test event:
Looking at the source file
serverless.yml, we see that the Lambda is sent three inputs. We need those in our test event:
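The exact input keys come from the package's serverless.yml, so check your copy for the real names; the test event below is only an illustration with hypothetical keys, matching the shape described here (a backup name, a flag, and the list of tables last):

```json
{
  "backupName": "backupTest",
  "backupRemovalEnabled": false,
  "tableNames": ["myTable1", "myTable2"]
}
```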
Make sure to give real table names in the last parameter. Once the test event is created and we’re back on the Lambda function page, click the Test button.
The execution results will show in the green section. If all is good, you can go to the DynamoDB console, and verify that the backups named
backupTest were indeed created:
Monitoring through SNS alerts
To make the procedure complete and production-grade, one thing remains: adding an alert when backups fail. This is done with the serverless-plugin-aws-alerts plugin. The package lets you enable it on the production stage through a variable in
env.yml that defines a subscription to an SNS topic created by the plugin:
- protocol: email
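The subscription fragment above sits inside the plugin's configuration structure, which looks roughly like this (per the serverless-plugin-aws-alerts documentation; the stage list, topic name, and email endpoint are placeholders you should adapt):

```yaml
custom:
  alerts:
    stages:
      - production
    topics:
      alarm:
        topic: ${self:service}-${opt:stage}-alerts
        notifications:
          - protocol: email
            endpoint: ops@example.com
```

With this in place, any failed invocation of the backup Lambda raises a CloudWatch alarm that lands in your inbox.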