In this article, I will explain how I built COINDATAX, a cryptocurrency analytics platform that helps investors analyze the market, and why we chose to go serverless. I will also cover our biggest pain points with AWS Lambda and how Dashbird helped us with Lambda performance monitoring.
When my co-founder and I decided to build a cryptocurrency web application, we immediately thought about using AWS Lambda for our integrations. After all, since we were a small team trying to create a new product, we didn't want to spend too much time managing AWS servers. Selecting the ideal instance type, configuring auto-scaling policies, and creating a deployment pipeline take a significant amount of work, which we simply could not afford.
In addition, since we wanted to add as many integrations as possible to our dashboards, we needed our application to scale linearly with product usage. With traditional server-based architectures, your environment scales more like a "step" function: if, say, each instance can handle 100 clients, when you reach 101 clients you need to spawn a new machine that will be idle most of the time. With a serverless architecture, your application scales up or down with each request.
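To put numbers on the "step" behaviour, here is a minimal sketch of the arithmetic. The capacity of 100 clients per instance is the figure from the example above, and the function names are made up for illustration.

```javascript
// Illustrative only: 100 clients per instance is the figure used in the
// text, not a measured value.
const CLIENTS_PER_INSTANCE = 100;

// Server-based: capacity is bought in whole instances, so it grows in steps.
function instancesNeeded(clients) {
  return Math.ceil(clients / CLIENTS_PER_INSTANCE);
}

// Share of the provisioned capacity that sits unused at a given load.
function idleCapacity(clients) {
  const provisioned = instancesNeeded(clients) * CLIENTS_PER_INSTANCE;
  return provisioned === 0 ? 0 : (provisioned - clients) / provisioned;
}

console.log(instancesNeeded(100)); // 1
console.log(instancesNeeded(101)); // 2
console.log(idleCapacity(101));    // 0.495
```

Going from 100 to 101 clients doubles the number of machines while almost half of the provisioned capacity goes unused, which is the jump a per-request model avoids.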
Another benefit of Lambda, or more specifically of the Serverless Framework, is infrastructure as code. With Serverless, you know exactly what is going on with your infrastructure simply by looking at a configuration file. You can do the same for regular server-based architectures with tools such as Terraform, but there it usually feels like something extra on top of your infrastructure design, while most AWS applications built on top of Lambda use Serverless from the beginning.
Serverless has its downsides, though. First of all, AWS Lambda has some inherent limitations: you can't run a long-lived or resource-hungry process. At the time of writing, if a function runs longer than 5 minutes or needs more memory than Lambda's per-function maximum (a few GB at most), you need to rework your architecture in order to use serverless. It might be necessary to use Step Functions or to break your lambdas into smaller functions so that you can pass the processed output from one to the other.
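One way to split work across invocations is to stop before the timeout and hand back whatever is left. The sketch below is only an illustration of that idea: `processBatch`, `processItem` and the 10-second safety margin are hypothetical, while `context.getRemainingTimeInMillis()` is the method the Lambda Node.js runtime exposes on the context object.

```javascript
// Assumed safety margin: stop when less than 10 seconds are left.
const SAFETY_MARGIN_MS = 10000;

// Processes items one by one until the list is exhausted or the invocation
// is close to its timeout. Resolves with the results gathered so far and the
// items that still need processing by a follow-up invocation.
function processBatch(items, context, processItem) {
  const done = [];

  function step(index) {
    const outOfTime = context.getRemainingTimeInMillis() <= SAFETY_MARGIN_MS;
    if (index >= items.length || outOfTime) {
      return Promise.resolve({ done, remaining: items.slice(index) });
    }
    // processItem may return a plain value or a promise.
    return Promise.resolve(processItem(items[index])).then(result => {
      done.push(result);
      return step(index + 1);
    });
  }

  return step(0);
}
```

The `remaining` array can then be passed to the next function in the chain, or to the next state of a Step Functions workflow.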
Secondly, Lambda functions are indisputably harder to debug. Since you don't have direct access to the instance where your code is running, and since AWS CloudWatch is somewhat limited in its monitoring capabilities, sometimes things break and you have no idea what happened. This is where Dashbird comes into play, but I'll talk more about that in a moment.
Taking all that into consideration, we decided to move forward with serverless, as we believed these limitations could be managed.
The architecture behind COINDATAX is pretty straightforward. We pull data from a number of external APIs, clean and process it to make sure that all data is consistent and reliable, and finally store the events in our databases. These data extractors are triggered via CloudWatch Events at fixed intervals so that we always have up-to-date crypto information in our system.
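As a rough illustration of the cleaning step, here is a minimal sketch. It is not our production code: the field names (`symbol`, `price_usd`, `last_updated`) and the assumption that the API returns numbers as strings and timestamps as Unix seconds are hypothetical and should be adapted to the API you consume.

```javascript
// Converts a raw field to a number, treating missing values as invalid.
function toNumber(value) {
  if (value === null || value === undefined || value === '') return NaN;
  return Number(value);
}

// Hypothetical cleaning step: field names are assumed for illustration.
function normalizeTicker(raw) {
  const price = toNumber(raw.price_usd);
  const timestamp = toNumber(raw.last_updated);
  // Drop records whose required fields are missing or unparsable.
  if (!raw.symbol || !Number.isFinite(price) || !Number.isFinite(timestamp)) {
    return null;
  }
  return {
    symbol: String(raw.symbol).toUpperCase(),
    priceUsd: price,
    // Assumes the API reports Unix time in seconds.
    updatedAt: new Date(timestamp * 1000).toISOString()
  };
}

// Keeps only the records that survived normalization.
function cleanTickers(rawList) {
  return rawList.map(normalizeTicker).filter(record => record !== null);
}
```

Rejecting incomplete records at this stage means everything that reaches the database already has the same shape and types.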
In this article, I will explain how to create a monitoring platform that implements the first step of our product: gathering data from an external API. More specifically, we will go through the following:
Install serverless
$ npm i -g serverless
Configure the serverless.yml file. In this example we are using Node.js 6 and we have only one function, which is triggered every 5 minutes.
service: coindatax-dashbird-demo
provider:
  name: aws
  runtime: nodejs6.10
functions:
  coinmarketcap:
    handler: functions/coinmarketcap.handler
    events:
      - schedule: rate(5 minutes)
Our Lambda function is also simple, as the first step of a monitoring platform is to extract data from external APIs. Here we connect our function to CoinMarketCap's API and get the ticker information of the top coins for that period.
// functions/coinmarketcap.js
const request = require('request-promise');
const uri = 'https://api.coinmarketcap.com/v1/ticker/';
function handler(event, context, callback) {
  const options = {
    uri,
    json: true
  };
  request(options)
    .then(data => callback(null, data)) // insert into database
    .catch(err => callback(err));
}
module.exports = {
  handler
};
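Because the handler only talks to the outside world through a promise and a callback, it can be exercised locally before deploying. The sketch below is one possible way to do that, not part of the code above: `createHandler` and the stubbed fetchers are hypothetical names that stand in for the real HTTP call.

```javascript
// Builds a Lambda-style handler around any function that returns a promise.
// `createHandler` and `fetchTicker` are hypothetical names for illustration.
function createHandler(fetchTicker) {
  return function handler(event, context, callback) {
    fetchTicker()
      .then(data => callback(null, data)) // success: error slot is null
      .catch(err => callback(err));       // failure: pass the error through
  };
}

// Local check with stubbed fetchers: no AWS account or network access needed.
const okHandler = createHandler(() => Promise.resolve([{ symbol: 'BTC' }]));
okHandler({}, {}, (err, data) => console.log('ok:', err, data));

const failingHandler = createHandler(() => Promise.reject(new Error('API down')));
failingHandler({}, {}, err => console.log('failed:', err.message));
```

Swapping the stub for the real request call gives back the same behaviour as the handler shown earlier.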
To deploy a serverless application, simply run:
$ serverless deploy -v
Then check the output logs:
Serverless: Packaging service...
Serverless: Creating Stack...
Serverless: Checking Stack create progress...
CloudFormation - CREATE_IN_PROGRESS - AWS::CloudFormation::Stack - CoindataxDashDashbirdDashDemo-dev
CloudFormation - CREATE_IN_PROGRESS - AWS::S3::Bucket - ServerlessDeploymentBucket
CloudFormation - CREATE_IN_PROGRESS - AWS::S3::Bucket - ServerlessDeploymentBucket
CloudFormation - CREATE_COMPLETE - AWS::S3::Bucket - ServerlessDeploymentBucket
CloudFormation - CREATE_COMPLETE - AWS::CloudFormation::Stack - CoindataxDashDashbirdDashDemo-dev
Serverless: Stack create finished...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service .zip file to S3 (17.23 MB)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress...
CloudFormation - UPDATE_IN_PROGRESS - AWS::CloudFormation::Stack - CoindataxDashDashbirdDashDemo-dev
CloudFormation - CREATE_IN_PROGRESS - AWS::IAM::Role - IamRoleLambdaExecution
CloudFormation - CREATE_IN_PROGRESS - AWS::Logs::LogGroup - CoinmarketcapLogGroup
CloudFormation - CREATE_IN_PROGRESS - AWS::IAM::Role - IamRoleLambdaExecution
CloudFormation - CREATE_IN_PROGRESS - AWS::Logs::LogGroup - CoinmarketcapLogGroup
CloudFormation - CREATE_COMPLETE - AWS::Logs::LogGroup - CoinmarketcapLogGroup
CloudFormation - CREATE_COMPLETE - AWS::IAM::Role - IamRoleLambdaExecution
CloudFormation - CREATE_IN_PROGRESS - AWS::Lambda::Function - CoinmarketcapLambdaFunction
CloudFormation - CREATE_IN_PROGRESS - AWS::Lambda::Function - CoinmarketcapLambdaFunction
CloudFormation - CREATE_COMPLETE - AWS::Lambda::Function - CoinmarketcapLambdaFunction
CloudFormation - UPDATE_COMPLETE_CLEANUP_IN_PROGRESS - AWS::CloudFormation::Stack - CoindataxDashDashbirdDashDemo-dev
CloudFormation - UPDATE_COMPLETE - AWS::CloudFormation::Stack - CoindataxDashDashbirdDashDemo-dev
Serverless: Stack update finished...
Service Information
service: CoindataxDashDashbirdDashDemo
stage: dev
region: us-east-1
api keys:
  None
endpoints:
  None
functions:
  coinmarketcap: CoindataxDashDashbirdDashDemo-dev-coinmarketcap
Stack Outputs
CoinmarketcapLambdaFunctionQualifiedArn: arn:aws:lambda:us-east-1:123456789012:function:CoindataxDashDashbirdDashDemo-dev-coinmarketcap:1
ServerlessDeploymentBucketName: coindataxdashdashbirddashdemo-dev-serverlessdeploymentbucket-abcdefgh1234
After you have successfully deployed your Lambda function, you quickly realize that AWS CloudWatch does not offer enough monitoring features to keep you in full control of your application. Dashbird tries to fill that gap with a dashboard that groups all your lambdas in a single place, live tailing of your application logs, and more.
What I love the most about Dashbird is that it was super easy to set up and, at the same time, provided very useful insights to our team. I literally spent less than 5 minutes configuring it, and we immediately had a much better understanding of our architecture.
When it comes to Lambda functions, Dashbird provided us with aggregated performance and resource usage metrics for easy analysis.
Because of Dashbird, we noticed that all our lambdas were using only about a third of their allocated memory and that we could confidently lower the allocation in order to reduce costs. The change was very simple to implement: all we did was update the default memorySize parameter of serverless:
provider:
  name: aws
  runtime: nodejs6.10
  memorySize: 512 # default is 1024
With one line of code, we reduced our bill by 50%, and we keep monitoring our application to see whether we can reduce it even further.
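The 50% figure follows from how Lambda charges for compute: cost is proportional to allocated memory multiplied by billed execution time. Here is a minimal sketch of that arithmetic. The price per GB-second and the 2-second duration are assumptions for illustration (check the current AWS pricing page), and the sketch ignores the per-request charge and assumes the function does not run slower with less memory.

```javascript
// Assumed price for illustration only; check current AWS Lambda pricing.
const PRICE_PER_GB_SECOND = 0.00001667;

// Compute cost = memory (GB) x billed time (s) x invocations x unit price.
function computeCost(memoryMb, billedMs, invocations) {
  const gbSeconds = (memoryMb / 1024) * (billedMs / 1000) * invocations;
  return gbSeconds * PRICE_PER_GB_SECOND;
}

// One run every 5 minutes for 30 days; the 2-second duration is hypothetical.
const invocations = (60 / 5) * 24 * 30; // 8640
const before = computeCost(1024, 2000, invocations);
const after = computeCost(512, 2000, invocations);

console.log(after / before); // 0.5
```

If a function becomes slower at the lower memory setting, its billed time grows and the saving shrinks, which is why it is worth watching the duration metrics after making the change.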
Serverless architecture undoubtedly has many advantages for both small and big companies. Nevertheless, you should always take into account the unexpected debugging and monitoring time that you wouldn't have with traditional server-based systems. With Dashbird, that hurdle can be reduced and maybe even eliminated, so that you end up with only the benefits of serverless and AWS Lambda.