In the previous part of this series, we began the tale of the serverless microservice we created at ironSource atom to help us manage the configuration of our customers’ big-data pipelines. We discussed the huge benefits we get from working with AWS Lambda and API Gateway in terms of availability and general peace of mind about our production environment. But we also reviewed some pain points in the development process. These pain points set us on a course to build an automated flow for deploying a serverless microservice.
In the first part of this trilogy (you are now reading “The Empire Strikes Back”) we discussed setting up proper unit tests for our Lambda functions in Node.js, so we can have a good time developing on our own machines.
In this part, we will describe some tools we adopted and practices we’ve developed for actually interacting with AWS Lambda, since working directly with Amazon’s UI can be cumbersome and error-prone.
First of all, several tools exist in the market today to help streamline the process of working with Lambda, ranging from small utility libraries to full-blown solutions like “Serverless” (now in beta). When evaluating the different options we chose Apex: its minimal approach appealed to us, it was well documented, and it seemed to have a decent-sized community of developers (currently over 3.5k stars on GitHub).
Apex is simple to use and well-documented, so I won’t go into detail here on how to set it up or get started with it. Instead, I’d rather just go over how we use it in our team. Apex relies on some pretty straightforward JSON files to describe configuration, and on a pre-defined structure for your project. In this post, we’ll be working on a hypothetical microservice which allows others to search for repositories on GitHub. This service will expose a single endpoint, “GET /repos”, and receive input in the query string, like “/repos?q=rotemtam”. Working with Apex, our sample project structure would look something like this:
Apex sample project structure (view the source here)
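A minimal layout might look like the sketch below (directory and file names are illustrative, not the exact contents of our repo):

```
github-search/
├── project.json          # project-wide Apex configuration
├── webpack.config.js     # shared build configuration
├── lib/
│   └── repository.js     # shared library code
└── functions/
    └── searchRepos/
        ├── function.json # per-function Apex configuration
        └── index.js      # the Lambda handler
```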
When we run “apex deploy”, Apex will create new versions of our functions in our AWS account and update their configuration.
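For example, assuming the function name from the layout above, we can deploy the whole project or just a single function:

```sh
# deploy all functions in the project
apex deploy

# deploy only the searchRepos function
apex deploy searchRepos
```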
TJ Holowaychuk (one of Apex’s creators) recently published an article on Medium with some “Do’s and Don’ts of AWS Lambda”, one of which is very relevant to our topic: “Don’t substitute FaaS with writing good libraries”, which basically means that your actual Lambda functions should be very small. You should take advantage of the package/module system in your language to encapsulate whatever logic you need in an ordinary module and use it. A general rule of thumb would be: handle input and high-level flow control in the Lambda function, and leave all the rest to your libraries. This way you can easily re-use them in other contexts.
Remember our hypothetical microservice for searching GitHub? Well, assume it contained a Lambda function which takes a string q as input and uses it to search GitHub for repositories with that string in their name. There is no reason to include the call to the GitHub API in our Lambda function itself. Instead, we could do something like this:
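Here is a minimal sketch of such a handler (the file paths and module names are assumptions matching the layout above, not the exact code from our repo):

```js
// functions/searchRepos/index.js -- a minimal sketch
var repository = require('../../lib/repository');

// The handler only validates input and controls the flow;
// the actual GitHub call lives in the library module.
exports.handle = function(event, context, callback) {
  if (!event.q) {
    return callback(new Error('Missing required parameter: q'));
  }
  repository.search(event.q)
    .then(function(repos) { callback(null, repos); })
    .catch(callback);
};
```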
It would rely on a “Repository” module like this:
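A sketch of that module, calling the public GitHub search API (the request-promise dependency and the wrapper shape are assumptions):

```js
// lib/repository.js -- a sketch of the library module
var request = require('request-promise');

module.exports = {
  // Search GitHub for repositories matching the given query string
  search: function(q) {
    return request({
      uri: 'https://api.github.com/search/repositories',
      qs: { q: q },
      headers: { 'User-Agent': 'github-search-sample' }, // required by the GitHub API
      json: true
    }).then(function(body) {
      return body.items;
    });
  }
};
```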
By default, Apex will take whatever is in the function directory, put it in a zip file and upload it to Lambda. So basically, if you want to include any modules you wrote, or third-party packages from npm, you need to include them there. That works fine for a project with only one function, but what if you have several functions that share modules? Would you keep a package.json file for each function and manage the dependencies per function? Copy and paste your custom modules into each folder?
Fortunately, Apex lets you customize your build process by specifying life-cycle hooks, namely custom “build” and “clean” commands. We take advantage of this to use Webpack to create our deployable artifacts. At build time, Webpack will intelligently create a single JavaScript file, which Apex will zip up and deploy. Here’s an example Apex project.json file taken from our example repo:
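Something along these lines (the values, the role ARN placeholder and the hook commands are illustrative; the hooks run inside each function directory, hence the relative config path):

```json
{
  "name": "github-search",
  "description": "Sample microservice for searching GitHub repositories",
  "runtime": "nodejs4.3",
  "memory": 128,
  "timeout": 10,
  "role": "arn:aws:iam::<ACCOUNT_ID>:role/<lambda_role>",
  "handler": "main.handle",
  "hooks": {
    "build": "webpack --config ../../webpack.config.js",
    "clean": "rm -f main.js"
  }
}
```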
The only caveat with this method is that the official AWS JavaScript SDK doesn’t play nice with Webpack. Check out this discussion to learn more about it. Luckily for us, it turns out that the SDK comes pre-bundled in the Lambda runtime (super thanks to Victor Delgado for pointing this out), so we can basically tell Webpack to ignore it in our webpack config:
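A sketch of such a config (entry and output names match the illustrative layout above):

```js
// webpack.config.js -- a sketch; paths match the layout sketched earlier
module.exports = {
  entry: './index.js',      // resolved in each function directory by the build hook
  target: 'node',           // bundle for the Node.js runtime, not the browser
  // aws-sdk is pre-installed in the Lambda runtime, so keep it out of the bundle
  externals: { 'aws-sdk': 'commonjs aws-sdk' },
  output: {
    path: process.cwd(),    // emit main.js into the function directory being built
    filename: 'main.js',
    libraryTarget: 'commonjs2' // so exports.handle is visible to Lambda
  }
};
```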
The next part of our process is to hook this up to a CI/CD service. We currently use Jenkins and Codefresh for our builds, but this could easily be done with any other service. Our requirements for a build job for our microservice boil down to:
- Run our unit tests, and stop if they fail
- Build the deployable artifacts with Webpack
- Deploy the functions to AWS with Apex
- Do all of the above in a consistent, reproducible environment
No truly hip deployment process would be complete without using Docker somewhere, right? Seriously though, there are many advantages to using Docker in your build process.
People usually think of Docker containers for running production workloads, and they really are a great way to have complete control over the environment your code runs in: with Docker you can have an exact replica of your production environment. But the same benefit holds for any other type of task. By running CI/CD tasks in Docker, you have complete control over the way things happen, and you can use whatever tools you like. I usually find that it’s just easier (and faster) for me to run my builds inside Docker containers. Here’s a Dockerfile which I’d use for a task like the one we described above:
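A sketch of such an image (the base image tag is an assumption; the Apex install script location is the one from Apex’s README):

```dockerfile
# Build image for testing and deploying our Lambda functions
FROM node:4

# Install the Apex CLI via its official install script
RUN curl -sSL https://raw.githubusercontent.com/apex/apex/master/install.sh | sh

# Webpack is invoked by our Apex build hook
RUN npm install -g webpack

WORKDIR /app
COPY . /app
RUN npm install

CMD ["./build.sh"]
```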
Our build script might look like this:
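For instance, a minimal sketch (credentials are expected as environment variables; the exact steps are illustrative):

```bash
#!/bin/bash
# Fail the build as soon as any step fails
set -e

# 1. Run the unit tests we set up in the previous part of this series
npm test

# 2. Deploy: Apex runs the Webpack build hook for each function,
#    zips the resulting artifact and uploads it to AWS.
#    AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION)
#    must be present in the environment.
apex deploy
```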
Let’s break it down:
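- “set -e” aborts the script the moment any command fails, so a broken test run never gets deployed
- “npm test” runs the unit tests we set up in the previous part of this series
- “apex deploy” triggers our Webpack build hook, zips each bundle and ships new versions of the functions to AWS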
We take this build script and plug it into whatever system we’re using to build our code, and BAM! We have an automated CI/CD flow for working with AWS Lambda!
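In practice, plugging it in usually amounts to two Docker commands (the image name and credential wiring are illustrative):

```bash
# build the image that contains Node.js, Apex and Webpack
docker build -t lambda-builder .

# run tests and deploy, passing AWS credentials through to the container
docker run --rm \
  -e AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY \
  -e AWS_REGION \
  lambda-builder
```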
See you soon!