Update: with the introduction of Point-in-time recovery backups, this solution is now obsolete.
TL;DR: install the serverless package for scheduled DynamoDB backups.
Remember photocopiers?
Previously, backing up DynamoDB tables was resource- and time-consuming. Depending on the solution, it involved running custom programs on EC2 infrastructure, or even spawning Elastic MapReduce x-large instances.
Now, with the click of a button, you can get an almost instantaneous backup of petabytes of data. Amazon is able to do this by taking snapshots of the tables (if you are curious, you can read about snapshot technologies).
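The same capability is also exposed through the AWS CLI. A minimal sketch for a one-off backup, assuming a hypothetical table named myTable and credentials already configured:

```shell
# Hypothetical table name; assumes the AWS CLI is installed and configured.
TABLE_NAME="myTable"
# Date-stamped backup name, e.g. myTable-manual-2018-01-31
BACKUP_NAME="${TABLE_NAME}-manual-$(date +%Y-%m-%d)"

# create-backup returns immediately; the snapshot itself is taken asynchronously.
if command -v aws >/dev/null 2>&1; then
  aws dynamodb create-backup \
    --table-name "$TABLE_NAME" \
    --backup-name "$BACKUP_NAME" || true  # ignore failure when no credentials are present
fi
echo "$BACKUP_NAME"
```

The date suffix keeps successive manual backups from colliding on the same name.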
On-Demand Backups are encrypted with an Amazon-managed key. They include the table data, global and local secondary indexes, streams, and the provisioned read and write capacity.
However, they do not include auto scaling policies, IAM policies, CloudWatch metrics and alarms, tags, stream settings, or TTL settings; those must be reconfigured manually on a restored table.
Be aware that restoring a backup takes more time, anywhere from 30 minutes to several hours for huge tables.
Setting up Lambdas, IAM permissions, and CloudWatch settings can be a pain. Fortunately, the Serverless framework makes it a breeze!
Focus on your application, not your infrastructure.
Let’s go ahead and install the serverless framework:
npm install -g serverless
Depending on your installation, you might need to run it with sudo.
Then fetch the code:
git clone https://github.com/unitoio/dynamodb-backup-scheduler.git
The trickiest part is setting up the AWS permissions. First, you need to create an IAM profile with the right permissions for serverless. Edit the file aws-profile.json and replace <region> and <account> with their actual values. The easiest way to find your account number is to open the Support Center from the top right of the console; your account ID shows up at the top of the page.
Go to the IAM console, create a new policy with this JSON content, name it as you wish (for example serverless-admin), and attach it to your user.
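The exact policy JSON was provided alongside the original post. Purely as an illustration, a broad policy of the kind commonly used for serverless deployments looks like the following; the action list is an assumption, not the package's official policy, and should be tightened for production:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudformation:*",
        "lambda:*",
        "iam:*",
        "logs:*",
        "events:*",
        "s3:*",
        "dynamodb:*"
      ],
      "Resource": "*"
    }
  ]
}
```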
If you don’t have your access key and secret set up, generate them through IAM and install them on your machine by creating a new profile:
aws configure --profile <account_name>
The dynamodb-backup-scheduler package is easy to configure. Copy the file env-example.yml to env.yml. The file is self-explanatory: you specify the list of tables you want to back up, and how long to keep the backups.
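For orientation, the configuration looks roughly like this; the key names below are illustrative assumptions, so check env-example.yml in the repository for the actual schema:

```yaml
# Hypothetical example; see env-example.yml in the repo for the real keys.
production:
  tablesToBackup:          # DynamoDB tables to back up daily
    - users
    - orders
  backupRetentionDays: 30  # backups older than this are deleted
```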
We’re all set to get going:
serverless deploy -v
Watch the output as serverless creates the Lambda function, its IAM role, and the CloudWatch event for you.
That’s it! Your scheduled backups are set up! Although, if you’re like me, you will want to test the Lambda function now, and not wait for the daily event to fire.
From the Lambda console, click the function that was just created; it will have a name like dynamodb-backup-scheduler-production-backupDynamoDBTables. Near the top, create a new test event.
Looking into the source file serverless.yml, we see that the Lambda is sent three inputs, which we need in our test event. Make sure to give real table names in the last parameter. Once the test event is created, and we’re back on the Lambda function page, click the Test button.
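A test event along these lines should work; the key names here are assumptions based on the description above, so match them against what serverless.yml actually passes to the function:

```json
{
  "backupName": "backupTest",
  "backupRemovalEnabled": false,
  "tablesToBackup": ["users", "orders"]
}
```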
The execution results will show in the green section. If all is good, go to the DynamoDB console and verify that the backups named backupTest were indeed created.
To make the procedure complete and production-grade, one thing is left: adding an alert when the backups fail. This is done through the serverless-plugin-aws-alerts plugin. The package lets you enable this on the production stage by specifying in env.yml a variable with the subscription details for an SNS topic that the plugin will create:
notifications:
  - protocol: email
    endpoint: [email protected]
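For context, the plugin side of this configuration lives in serverless.yml. A sketch of what it typically looks like, assuming the env.yml notifications variable above; verify the key names against the serverless-plugin-aws-alerts README:

```yaml
# Sketch; check key names against the serverless-plugin-aws-alerts README.
plugins:
  - serverless-plugin-aws-alerts

custom:
  alerts:
    stages:
      - production
    topics:
      alarm:
        topic: ${self:service}-alerts
        notifications: ${file(env.yml):notifications}
    alarms:
      - functionErrors   # predefined alarm on the Lambda Errors metric
```

With this in place, any failed backup run raises a CloudWatch alarm that notifies the subscribed email address.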