You should use SSM Parameter Store over Lambda env variablesby@theburningmonk
60,092 reads
60,092 reads

You should use SSM Parameter Store over Lambda env variables

by Yan CuiAugust 13th, 2017
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

<strong><em>updated</em></strong><em>: this post has been updated on 15/09/2017 following the release of Serverless framework 1.22.0 which introduced support for SSM parameter store out of the box. Go to the end of post to see how it compares with other approaches.</em>

People Mentioned

Mention Thumbnail

Companies Mentioned

Mention Thumbnail
Mention Thumbnail
featured image - You should use SSM Parameter Store over Lambda env variables
Yan Cui HackerNoon profile picture

AWS Lambda introduced native support for environment variables (with KMS encryption support). But, using AWS Lambda environment variables makes it hard to share config values across functions and to implement fine-grained access to sensitive data (eg. credentials, API keys, etc.)

updated: this post has been updated on 15/09/2017 following the release of Serverless framework 1.22.0 which introduced support for SSM parameter store out of the box. Go to the end of post to see how it compares with other approaches.

AWS Lambda announced native support for environment variables at the end of 2016. But even before that, the Serverless framework had supported environment variables and I was using them happily as me and my team at the time migrated our monolithic Node.js backend to serverless.

However, as our architecture expanded we found several drawbacks with managing configurations with environment variables.

Hard to share configs across projects

The biggest problem for us was the inability to share configurations across projects since environment variables are function specific at runtime.

The Serverless framework has the notion of services, which is just a way of grouping related functions together. You can specify service-wide environment variables as well as function-specific ones.

A sample serverless.yml that specifies both service-wide as well as function-specific environment variables.

However, we often found that configurations need to be shared across multiple services. When these configurations change we had to update and redeploy all functions that depend on them — which in itself was becoming a challenge to track these dependencies across many Github repos that are maintained by different members of the team.

For example, as we were migrating from a monolithic system piece by piece whilst delivering new features, we weren’t able to move away from the monolithic MongoDB database in one go. It meant that lots of functions shared MongoDB connection strings. When one of these connection strings changed — and it did several times — pain and suffering followed.

Another configurable value we often share are the root URL of intermediate services. Being a social network, many of our user-initiated operations depend on relationship data, so many of our microservices depend on the Relationship API. Instead of hardcoding the URL to the Relationship API in every service (one of the deadly microservice anti-patterns), it should be stored in a central configuration service.

Hard to implement fine-grained access control

When you need to configure sensitive data such as credentials, API keys or DB connection strings, the rule of thumb are:

  1. data should be encrypted at rest (includes not checking them into source control in plain text)
  2. data should be encrypted in-transit
  3. apply the principle of least privilege to function’s and personnel’s access to data

If you’re operating in a heavily regulated environment then point 3. might be more than a good practice but a regulatory requirement. I know of many fintech companies and financial juggernauts where access to production credentials are tightly controlled and available only to a handful of people in the company.

Whilst efforts such as the serverless-secrets-plugin delivers on point 1. it couples one’s ability to deploy Lambda functions with one’s access to sensitive data — ie. he who deploys the function must have access to the sensitive data too. This might be OK for many startups, as everyone has access to everything, ideally your process for managing access to data can evolve with the company’s needs as it grows up.

SSM Parameter Store

My team outgrew environment variables, and I started looking at other popular solutions in this space — etcd, consul, etc. But I really didn’t fancy these solutions because:

  • they’re costly to run: you need to run several EC2 instances in multi-AZ setting for HA
  • you have to manage these servers
  • they each have a learning curve with regards to both configuring the service as well as the CLI tools
  • we needed a fraction of the features they offer

This was 5 months before Amazon announced SSM Parameter Store at re:invent 2016, so at the time we built our own Configuration API with API Gateway and Lambda.

Nowadays, you should just use the SSM Parameter Store because:

  • it’s a fully managed service
  • sharing configurations is easy, as it’s a centralised service
  • it integrates with KMS out-of-the-box
  • it offers fine-grained control via IAM
  • it records a history of changes
  • you can use it via the console, AWS CLI as well as via its HTTPS API

In short, it ticks all our boxes.

You can create secure strings that are encrypted by KMS.

You can see a history of all the changes to these parameters.

You have fine-grained control over what parameters a function is allowed to access.

There are couple of service limits to be aware of:

  • max 10,000 parameters per account
  • max length of parameter value is 4096 characters
  • max 100 past values for a parameter

Client library

Having a centralised place to store parameters is just one side of the coin. You should still invest effort into making a robust client library that is easy to use, and supports:

  • caching & cache expiration
  • hot-swapping configurations when source config value has changed

Here is one such client library that I put together for a demo:

To use it, you can create config objects with the loadConfigs function. These objects will expose properties that return the config values as Promise (hence the yield, which is the magic power we get with co).

You can have different config values with different cache expiration too.

If you want to play around with using SSM Parameter Store from Lambda (or to see this cache client in action), then check out this repo and deploy it to your AWS environment. I haven’t included any HTTP events, so you’d have to invoke the functions from the console.

Update 15/09/2017: the Serverless framework release 1.22.0 which introduced support for SSM parameters out of the box.

With this latest version of the Serverless framework, you can specify the value of environment variables to come from SSM parameter store directly.

Compared to many of the existing approaches, it has some benefits:

  • avoid checking in sensitive data in plain text in source control
  • avoid duplicating the same config values in multiple services

However, it still falls short on many fronts (based on my own requirements):

  • since it’s fetching the SSM parameter values at deployment time, it still couples your ability to deploy your function with access to sensitive configuration data
  • the configuration values are stored in plain text as Lambda environment variables, which means you don’t need the KMS permissions to access them, you can see it the Lambda console in plain sight
  • further to the above, if the function is compromised by an attacker (who would then have access to process.env) then they’ll be able to easily find the decrypted values during the initial probe (go to 13:05 mark on this video where I gave a demo of how easily this can be done)
  • because the values are baked at deployment time, it doesn’t allow you to easily propagate config value changes. To make a config value change, you will need to a) identify all dependent functions; and b) re-deploying all these functions

Of course, your requirement might be very different from mine, and I certainly think it’s an improvement over many of the approaches I have seen. But, personally I still think you should:

  1. fetch SSM parameter values at runtime
  2. cache these values, and hot-swap when source values change

Hi, my name is Yan Cui. I’m an AWS Serverless Hero and the author of Production-Ready Serverless. I have run production workload at scale in AWS for nearly 10 years and I have been an architect or principal engineer with a variety of industries ranging from banking, e-commerce, sports streaming to mobile gaming. I currently work as an independent consultant focused on AWS and serverless.

You can contact me via Email, Twitter and LinkedIn.

Check out my new course, Complete Guide to AWS Step Functions.

In this course, we’ll cover everything you need to know to use AWS Step Functions service effectively. Including basic concepts, HTTP and event triggers, activities, design patterns and best practices.

Get your copy here.

Come learn about operational BEST PRACTICES for AWS Lambda: CI/CD, testing & debugging functions locally, logging, monitoring, distributed tracing, canary deployments, config management, authentication & authorization, VPC, security, error handling, and more.

You can also get 40% off the face price with the code ytcui.

Get your copy here.