Serverless App: AWS CloudTrail Log Analytics using Amazon Elasticsearch Service

by Kuldeep Singh, February 9th, 2018

In this article, I’ll talk about how you can build a Serverless application using the AWS Serverless Application Model (SAM) to perform log analytics on AWS CloudTrail data using Amazon Elasticsearch Service. The application creates a CloudTrail trail, sets up log delivery to an S3 bucket that it creates, and configures SNS delivery whenever a CloudTrail log file is written to S3. The app also creates an Amazon Elasticsearch domain and an AWS Lambda function that is triggered by the SNS message: it gets the S3 file location, reads the contents of the S3 file, and writes the data to Elasticsearch for analytics.

Let’s first learn what AWS CloudTrail, Elasticsearch, Amazon Elasticsearch Service, AWS Lambda, and AWS SAM are.

What is AWS CloudTrail?

AWS CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. With CloudTrail, you can log, continuously monitor, and retain account activity related to actions across your AWS infrastructure. CloudTrail provides event history of your AWS account activity, including actions taken through the AWS Management Console, AWS SDKs, command line tools, and other AWS services. This event history simplifies security analysis, resource change tracking, and troubleshooting.
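If you want a quick feel for this event history, the CloudTrail API exposes it directly. Here is a minimal boto3 sketch that pulls recent console sign-in events; lookup_events and its lookup attributes are part of the public boto3 API, while the credentials and region are assumed to come from your environment:

from __future__ import print_function
import boto3

# Assumes AWS credentials and a default region are configured locally
cloudtrail = boto3.client('cloudtrail')

# Look up recent events by event name (here: console sign-ins)
response = cloudtrail.lookup_events(
    LookupAttributes=[
        {'AttributeKey': 'EventName', 'AttributeValue': 'ConsoleLogin'}
    ],
    MaxResults=10
)

for event in response['Events']:
    print(event['EventTime'], event['EventName'], event.get('Username'))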


AWS CloudTrail — https://aws.amazon.com/cloudtrail/

What is Elasticsearch?

Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.
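At the API level, Elasticsearch is JSON documents over HTTP. The sketch below, using the official Python client against a hypothetical local node, shows the two operations the rest of this article relies on: indexing a document and searching it back. The index name and document are made up for illustration:

from elasticsearch import Elasticsearch

# Assumes a local node for illustration; the Serverless app later in
# this article talks to a managed AWS endpoint instead
es = Elasticsearch(hosts=['http://localhost:9200'])

# Store a JSON document; every field becomes searchable
es.index(index='events', doc_type='record', id='1',
         body={'eventName': 'ConsoleLogin', 'user': 'alice'})

# Run a full-text query over the indexed data
result = es.search(index='events',
                   body={'query': {'match': {'eventName': 'ConsoleLogin'}}})
print(result['hits']['total'])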


Elasticsearch: RESTful, Distributed Search & Analytics — https://www.elastic.co

What is Amazon Elasticsearch Service?

Amazon Elasticsearch Service makes it easy to deploy, secure, operate, and scale Elasticsearch for log analytics, full text search, application monitoring, and more. Amazon Elasticsearch Service is a fully managed service that delivers Elasticsearch’s easy-to-use APIs and real-time analytics capabilities alongside the availability, scalability, and security that production workloads require.


Amazon Elasticsearch Service — https://aws.amazon.com/elasticsearch-service/

What is AWS Lambda?

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume — there is no charge when your code is not running. With Lambda, you can run code for virtually any type of application or backend service — all with zero administration. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.
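The programming model is small: Lambda calls a handler function you name, passing the triggering event as a parsed object. A minimal Python handler, purely for illustration, looks like this:

import json

def handler(event, context):
    # 'event' carries the trigger payload (an SNS notification, an S3
    # event, an API call, ...); 'context' exposes runtime metadata
    print('Received event: ' + json.dumps(event))
    return 'done'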


AWS Lambda — Serverless Compute — https://aws.amazon.com/lambda/

What is AWS Serverless Application Model?

AWS Serverless Application Model (AWS SAM) prescribes rules for expressing Serverless applications on AWS. The goal of AWS SAM is to define a standard application model for Serverless applications.


awslabs/serverless-application-model — https://github.com/awslabs/serverless-application-model

Now let’s look at how we can build a Serverless App to perform Log Analytics on AWS CloudTrail data using Amazon Elasticsearch Service.

This is the architecture of the CloudTrail Log Analytics Serverless Application:

Architecture for Serverless Application: CloudTrail Log Analytics using Elasticsearch

An AWS SAM template is an extension of an AWS CloudFormation template. Before we look at the SAM template itself, let’s package our AWS Lambda function.

On your workstation, create a working folder for building the Serverless Application.

Create a file called index.py for the AWS Lambda function:


""" This module reads the SNS message to get the S3 file location for cloudtraillog and stores into Elasticsearch. """









from __future__ import print_functionimport jsonimport boto3import loggingimport datetimeimport gzipimport urllibimport osimport traceback


from StringIO import StringIOfrom exceptions import *



# from awses.connection import AWSConnectionfrom elasticsearch import Elasticsearch, RequestsHttpConnectionfrom requests_aws4auth import AWS4Auth


logger = logging.getLogger()logger.setLevel(logging.INFO)

s3 = boto3.client('s3', region_name=os.environ['AWS_REGION'])








awsauth = AWS4Auth(os.environ['AWS_ACCESS_KEY_ID'], os.environ['AWS_SECRET_ACCESS_KEY'], os.environ['AWS_REGION'], 'es', session_token=os.environ['AWS_SESSION_TOKEN'])es = Elasticsearch(hosts=[{'host': os.environ['es_host'], 'port': 443}],http_auth=awsauth,use_ssl=True,verify_certs=True,connection_class=RequestsHttpConnection)


def handler(event, context):logger.info('Event: ' + json.dumps(event, indent=2))

s3Bucket = json.loads(event\['Records'\]\[0\]\['Sns'\]\['Message'\])\['s3Bucket'\].encode('utf8')  
s3ObjectKey = urllib.unquote\_plus(json.loads(event\['Records'\]\[0\]\['Sns'\]\['Message'\])\['s3ObjectKey'\]\[0\].encode('utf8'))

logger.info('S3 Bucket: ' + s3Bucket)  
logger.info('S3 Object Key: ' + s3ObjectKey)

try:  
    response = s3.get\_object(Bucket=s3Bucket, Key=s3ObjectKey)  
    content = gzip.GzipFile(fileobj=StringIO(response\['Body'\].read())).read()

    for record in json.loads(content)\['Records'\]:  
        recordJson = json.dumps(record)  
        logger.info(recordJson)  
        indexName = 'ct-' + datetime.datetime.now().strftime("%Y-%m-%d")  
        res = es.index(index=indexName, doc\_type='record', id=record\['eventID'\], body=recordJson)  
        logger.info(res)  
    return True  
except Exception as e:  
    logger.error('Something went wrong: ' + str(e))  
    traceback.print\_exc()  
    return False
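For reference, CloudTrail’s SNS notification wraps the S3 location as a JSON string inside the Message field, which is why the handler above calls json.loads on it. A trimmed stand-in for the event the handler expects (the bucket and key names are placeholders) would look like:

import json

sample_event = {
    'Records': [{
        'Sns': {
            'Message': json.dumps({
                's3Bucket': 'example-cloudtrail-bucket',
                's3ObjectKey': ['AWSLogs/123456789012/CloudTrail/'
                                'us-east-1/2018/02/09/example.json.gz']
            })
        }
    }]
}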

Create a file called requirements.txt listing the Python packages that are needed:


elasticsearch>=5.0.0,<6.0.0
requests-aws4auth

With the above requirements.txt created in your workspace, run the command below to install the required packages into the working folder:

python -m pip install -r requirements.txt -t ./

Create a file called template.yaml that will hold the AWS SAM template:



AWSTemplateFormatVersion: '2010-09-09'
Transform: 'AWS::Serverless-2016-10-31'
Description: >
  This SAM example creates the following resources:

  S3 Bucket: S3 Bucket to hold the CloudTrail Logs
  CloudTrail: Create a CloudTrail trail for all regions and configure it to deliver logs to the above S3 Bucket
  SNS Topic: Configure an SNS topic to receive notifications when a CloudTrail log file is created in S3
  Elasticsearch Domain: Create an Elasticsearch Domain to hold the CloudTrail logs for advanced analytics
  IAM Role: Create an IAM Role for Lambda execution and assign read-only S3 permissions
  Lambda Function: Create a Function which gets triggered when SNS receives a notification, reads the contents from S3 and stores them in the Elasticsearch Domain

Outputs:
  S3Bucket:
    Description: "S3 Bucket Name where CloudTrail Logs are delivered"
    Value: !Ref S3Bucket
  LambdaFunction:
    Description: "Lambda Function that reads CloudTrail logs and stores them into the Elasticsearch Domain"
    Value: !GetAtt Function.Arn
  ElasticsearchUrl:
    Description: "Elasticsearch Domain Endpoint that you can use to access and analyze the CloudTrail logs"
    Value: !GetAtt ElasticsearchDomain.DomainEndpoint

Resources:
  SNSTopic:
    Type: AWS::SNS::Topic
  SNSTopicPolicy:
    Type: "AWS::SNS::TopicPolicy"
    Properties:
      Topics:
        - Ref: "SNSTopic"
      PolicyDocument:
        Version: "2008-10-17"
        Statement:
          - Sid: "AWSCloudTrailSNSPolicy"
            Effect: "Allow"
            Principal:
              Service: "cloudtrail.amazonaws.com"
            Resource: "*"
            Action: "SNS:Publish"
  S3Bucket:
    Type: AWS::S3::Bucket
  S3BucketPolicy:
    Type: "AWS::S3::BucketPolicy"
    Properties:
      Bucket:
        Ref: S3Bucket
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: "AWSCloudTrailAclCheck"
            Effect: "Allow"
            Principal:
              Service: "cloudtrail.amazonaws.com"
            Action: "s3:GetBucketAcl"
            Resource:
              !Sub |-
                arn:aws:s3:::${S3Bucket}
          - Sid: "AWSCloudTrailWrite"
            Effect: "Allow"
            Principal:
              Service: "cloudtrail.amazonaws.com"
            Action: "s3:PutObject"
            Resource:
              !Sub |-
                arn:aws:s3:::${S3Bucket}/AWSLogs/${AWS::AccountId}/*
            Condition:
              StringEquals:
                s3:x-amz-acl: "bucket-owner-full-control"
  CloudTrail:
    Type: AWS::CloudTrail::Trail
    DependsOn:
      - SNSTopicPolicy
      - S3BucketPolicy
    Properties:
      S3BucketName:
        Ref: S3Bucket
      SnsTopicName:
        Fn::GetAtt:
          - SNSTopic
          - TopicName
      IsLogging: true
      EnableLogFileValidation: true
      IncludeGlobalServiceEvents: true
      IsMultiRegionTrail: true
  FunctionIAMRole:
    Type: "AWS::IAM::Role"
    Properties:
      Path: "/"
      ManagedPolicyArns:
        - "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
        - "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: "AllowLambdaServiceToAssumeRole"
            Effect: "Allow"
            Action:
              - "sts:AssumeRole"
            Principal:
              Service:
                - "lambda.amazonaws.com"
  ElasticsearchDomain:
    Type: AWS::Elasticsearch::Domain
    DependsOn:
      - FunctionIAMRole
    Properties:
      DomainName: "cloudtrail-log-analytics"
      ElasticsearchClusterConfig:
        InstanceCount: "2"
      EBSOptions:
        EBSEnabled: true
        Iops: 0
        VolumeSize: 20
        VolumeType: "gp2"
      AccessPolicies:
        Version: "2012-10-17"
        Statement:
          - Sid: "AllowFunctionIAMRoleESHTTPFullAccess"
            Effect: "Allow"
            Principal:
              AWS: !GetAtt FunctionIAMRole.Arn
            Action: "es:ESHttp*"
            Resource:
              !Sub |-
                arn:aws:es:${AWS::Region}:${AWS::AccountId}:domain/cloudtrail-log-analytics/*
          - Sid: "AllowFullAccesstoKibanaForEveryone"
            Effect: "Allow"
            Principal:
              AWS: "*"
            Action: "es:*"
            Resource:
              !Sub |-
                arn:aws:es:${AWS::Region}:${AWS::AccountId}:domain/cloudtrail-log-analytics/_plugin/kibana
      ElasticsearchVersion: "5.5"
  Function:
    Type: 'AWS::Serverless::Function'
    DependsOn:
      - ElasticsearchDomain
      - FunctionIAMRole
    Properties:
      Handler: index.handler
      Runtime: python2.7
      CodeUri: ./
      Role: !GetAtt FunctionIAMRole.Arn
      Events:
        SNSEvent:
          Type: SNS
          Properties:
            Topic: !Ref SNSTopic
      Environment:
        Variables:
          es_host:
            Fn::GetAtt:
              - ElasticsearchDomain
              - DomainEndpoint

Packaging artifacts and uploading them to S3:

Run the following command to upload your artifacts to S3 and output a packaged template that can be readily deployed to CloudFormation:




aws cloudformation package \
    --template-file template.yaml \
    --s3-bucket bucket-name \
    --output-template-file serverless-output.yaml

Deploying the AWS SAM template to AWS CloudFormation:

You can use the aws cloudformation deploy CLI command to deploy the SAM template. Under the hood, it creates and executes a changeset and waits until the deployment completes. It also prints debugging hints when the deployment fails. Run the following command to deploy the packaged template to a stack called cloudtrail-log-analytics:




aws cloudformation deploy \
    --template-file serverless-output.yaml \
    --stack-name cloudtrail-log-analytics \
    --capabilities CAPABILITY_IAM

Refer to the documentation for more details.

I recommend reading about Amazon Elasticsearch Service access policies in the documentation and modifying the access policy of the Elasticsearch domain to further fine-tune who can reach it.
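For instance, assuming you want Kibana reachable only from a known network rather than from everywhere, a sketch along these lines could apply a tighter policy after deployment. The account ID and CIDR range below are placeholders, and update_elasticsearch_domain_config is the boto3 call for changing a domain's access policy:

import json
import boto3

es_client = boto3.client('es')

# Illustrative policy: allow access to the domain only from one CIDR range.
# The account ID and source IP are placeholders. When doing this for real,
# keep the statement that grants the Lambda role es:ESHttp* access.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": "es:*",
        "Resource": "arn:aws:es:us-east-1:123456789012:domain/cloudtrail-log-analytics/*",
        "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}}
    }]
}

es_client.update_elasticsearch_domain_config(
    DomainName='cloudtrail-log-analytics',
    AccessPolicies=json.dumps(policy)
)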

Once the Serverless application is deployed in your AWS account, it will automatically store the AWS CloudTrail data in Amazon Elasticsearch Service as soon as each log file is delivered to S3. With the data in Elasticsearch, you can use Kibana to visualize it and build the dashboards you need on top of the AWS CloudTrail data.
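You can also query the daily ct-YYYY-MM-DD indices programmatically. The sketch below reuses the signed-request setup from index.py; the credentials, region, and domain endpoint are placeholders you would take from the stack's ElasticsearchUrl output:

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

# Placeholders: use your own credentials, region, and domain endpoint
awsauth = AWS4Auth('ACCESS_KEY', 'SECRET_KEY', 'us-east-1', 'es')
es = Elasticsearch(hosts=[{'host': 'your-domain-endpoint', 'port': 443}],
                   http_auth=awsauth, use_ssl=True, verify_certs=True,
                   connection_class=RequestsHttpConnection)

# Count CloudTrail console sign-ins across all daily indices
result = es.search(index='ct-*',
                   body={'query': {'match': {'eventName': 'ConsoleLogin'}}})
print('Matching events: %d' % result['hits']['total'])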

The above Serverless Application Model app is available in the GitHub repo below:


ExpediaDotCom/cloudtrail-log-analytics — CloudTrail Log Analytics using Amazon Elasticsearch Service (AWS Serverless Application) — https://github.com/ExpediaDotCom/cloudtrail-log-analytics