Building a Serverless Intrusion Detection System on AWS

Written by bharathreddyjanumpally | Published 2025/09/24
Tech Story Tags: machinelearning | artificial-intelligence | aws | aws-sagemaker | dynamodb | event-driven-architecture | anomaly-detection | cybersecurity

TL;DR: By leveraging AWS’s serverless stack and a dash of machine learning, you can create a lightweight intrusion detection system.

If you have ever dealt with traditional intrusion detection systems (IDS), you know the struggle: heavy appliances, constant rule tuning, and false positives you can never quite stamp out. And then the bills keep coming; it stays expensive even when traffic drops. I’ve been through this before, and it was frustrating. The good news? We don’t have to run IDS that way anymore. By leveraging AWS’s serverless stack and a dash of machine learning, you can create a lightweight intrusion detection system that scales up when you need it, costs next to nothing when idle, and actually learns when to flag suspicious behavior.

Does this sound too good to be true? Let’s break down how to approach this step by step.


Why Go Serverless for Intrusion Detection?

Before we get to the high-level architecture, let’s answer a simple question: why bother going serverless at all?

  • Elastic by design – No servers to patch, no EC2 instances to resize. When events come in, AWS Lambda simply scales with the traffic.
  • Pay-per-use – You only pay when an event is processed. Idle time costs basically nothing.
  • AI-powered – ML models can identify anomalies that static IDS signatures may miss.
  • Easy to integrate – AWS provides many sources of event data, such as Cognito, IAM, and CloudTrail, or log data from an app you developed.

Think about it: instead of maintaining a bulky IDS hardware appliance, you let AWS take care of the scaling while you focus your effort on developing and refining the intelligence.


Architecture Overview

The following is the high-level flow of a serverless IDS (Intrusion Detection System).

  • Event Source – Authentication logs (either from Cognito, IAM, or custom apps) are sent to EventBridge or dropped into an S3 bucket.
  • Lambda Function – The function picks up events, extracts the useful features (login time, geo, device fingerprint), and calls a SageMaker ML endpoint.
  • SageMaker Model – The model scores the event “legit” vs. “suspicious” based on the features that it sees.
  • DynamoDB Table – Holds the scored events with the risk detail.
  • Alerting Layer – If an event is deemed suspicious, it triggers an SNS notification or a finding in AWS Security Hub.

ASCII flow for the visual thinkers:

Auth Event → EventBridge → Lambda → SageMaker (score) → DynamoDB (log + risk) → SNS / Security Hub (alert)

It looks simple, but what happens inside these components is where the magic occurs.
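For example, a custom app (even one running outside AWS) could push an authentication event onto the default event bus with a few lines of boto3. This is just a sketch: the source and detail-type names are placeholders, and an EventBridge rule matching them would forward the event to the Lambda.

import json
import boto3

events = boto3.client("events")

# Hypothetical auth event from a custom app; the field names should match what the Lambda extracts
auth_event = {
    "eventId": "evt-12345",
    "userId": "user-789",
    "hour": 3,
    "geoHash": 412,
    "deviceHash": 907,
}

events.put_events(
    Entries=[
        {
            "Source": "myapp.auth",        # placeholder source name
            "DetailType": "LoginAttempt",  # placeholder detail-type
            "Detail": json.dumps(auth_event),
            "EventBusName": "default",
        }
    ]
)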


Developing the Anomaly Detection Model

The core of this IDS is the ML model. A natural starting point is Random Cut Forest (RCF), an algorithm built specifically for anomaly detection and available as a SageMaker built-in.

For example, RCF performs well at spotting unusual login patterns, such as a 3 AM login from an unfamiliar device in a foreign country.

This is how you could set up your model in SageMaker:

import sagemaker
from sagemaker import RandomCutForest

session = sagemaker.Session()
# Your SageMaker execution role ARN (needs permission to read training data and run training jobs)
role = "arn:aws:iam::<account-id>:role/<sagemaker-execution-role>"

rcf = RandomCutForest(
    role=role,
    sagemaker_session=session,
    instance_count=1,
    instance_type="ml.m5.large",
    num_samples_per_tree=512,
    num_trees=50,
)

Training data is plain numeric CSV: one row per login event, with your features [hour, geo, user_hash, device_hash] encoded as numbers. The built-in RCF estimator expects a RecordSet, so load the matrix and let record_set() handle the S3 upload and format conversion:

import numpy as np

# Load the prepared feature matrix (download it from s3://my-bucket/auth-train.csv first if it lives there)
train_data = np.loadtxt("auth-train.csv", delimiter=",")

# record_set() uploads the data in the RecordIO-protobuf format RCF trains on
rcf.fit(rcf.record_set(train_data))

predictor = rcf.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="ids-anomaly-detector",
)

The model returns an anomaly score: the higher the score, the "weirder" the login looks.
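How high is "too high"? One common heuristic, used in AWS's own RCF examples, is to flag anything more than three standard deviations above the mean score of known-good traffic. Here is a rough sketch of calibrating such a threshold, assuming you have a held-out CSV of normal logins (auth-validation.csv is a placeholder name):

import numpy as np
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import JSONDeserializer

# Send CSV rows to the endpoint and read JSON scores back
predictor.serializer = CSVSerializer()
predictor.deserializer = JSONDeserializer()

# Hypothetical held-out file of known-good logins, same feature layout as training
normal = np.loadtxt("auth-validation.csv", delimiter=",")
results = predictor.predict(normal)
scores = np.array([r["score"] for r in results["scores"]])

# Flag anything more than three standard deviations above the mean
threshold = scores.mean() + 3 * scores.std()
print(f"alert threshold ~ {threshold:.2f}")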


Handling Events Using Lambda

Think of the Lambda function as the glue for the whole pipeline: it takes the raw event, extracts the relevant pieces of information, sends them to your ML model for scoring, records the score in DynamoDB, and fires an alert if the score is high enough.

This is a simplified version of a handler that I would start with:

import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";
import { SageMakerRuntimeClient, InvokeEndpointCommand } from "@aws-sdk/client-sagemaker-runtime";

const ddb = new DynamoDBClient();
const sm = new SageMakerRuntimeClient();
const TABLE = process.env.TABLE;
const ENDPOINT = process.env.ENDPOINT;

export const handler = async (event) => {
  // Depending on the source of the log, it might come wrapped in `detail`
  const record = event.detail || event;

  // Build the feature vector for the model: hour of login, geo hash, and device hash
  const features = [record.hour, record.geoHash, record.deviceHash];

  // Score the event with SageMaker (the built-in RCF endpoint expects the `instances` wrapper)
  const resp = await sm.send(
    new InvokeEndpointCommand({
      EndpointName: ENDPOINT,
      Body: JSON.stringify({ instances: [{ features }] }),
      ContentType: "application/json",
    })
  );

  // The built-in RCF endpoint responds with { "scores": [{ "score": ... }] }
  const { scores } = JSON.parse(new TextDecoder().decode(resp.Body));
  const score = scores[0].score;
  const risk = score > 3 ? "suspicious" : "legit";

  // Write the result to DynamoDB
  await ddb.send(
    new PutItemCommand({
      TableName: TABLE,
      Item: {
        eventId: { S: record.eventId },
        userId: { S: record.userId },
        score: { N: score.toString() },
        risk: { S: risk },
        ts: { S: new Date().toISOString() },
      },
    })
  );

  // If something looks suspicious, log it (or send it to SNS / Security Hub)
  if (risk === "suspicious") {
    console.warn("⚠️ Suspicious activity detected:", record);
  }

  return { status: "ok", risk };
};


Look at the steps, nice and clean, right? Extract → score → store → alert!


DynamoDB Schema Design

You don’t need to get fancy with your DynamoDB table. A minimal schema is all that’s needed:

  • PK: eventId
  • Attributes: userId, score, risk, timestamp
  • Optional: GSI on userId if you need to access login history quickly.

Sometimes simple is all you need.
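For concreteness, here is a rough sketch of creating that table with on-demand billing and the optional GSI; the table and index names are placeholders:

import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="ids-events",            # placeholder table name
    BillingMode="PAY_PER_REQUEST",     # on-demand keeps idle cost near zero
    AttributeDefinitions=[
        {"AttributeName": "eventId", "AttributeType": "S"},
        {"AttributeName": "userId", "AttributeType": "S"},
    ],
    KeySchema=[{"AttributeName": "eventId", "KeyType": "HASH"}],
    GlobalSecondaryIndexes=[
        {
            "IndexName": "byUser",     # optional index for pulling a user's login history
            "KeySchema": [{"AttributeName": "userId", "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
)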


Lessons Learned (The Hard Way)

Cold Starts vs Latency

  • Lambda cold starts are perfectly fine for most use cases.
  • If you need ultra-low latency, consider endpoint autoscaling and/or deploying compiled models (with SageMaker Neo) inside Lambda itself.

Feature Engineering is King

  • Simple features like login hour, geo delta, or device fingerprint give incredible accuracy boosts.
  • Do NOT just throw raw logs into the model; curate features that fit your use case (see the sketch below).
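To make that concrete, here is a rough sketch of what feature curation might look like for a raw auth log record; the input field names (timestamp, country, userAgent) are placeholders for whatever your logs actually contain:

import hashlib
from datetime import datetime, timezone

def extract_features(event: dict) -> list:
    """Turn a raw auth log record into the numeric vector the model expects."""
    ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
    hour = ts.astimezone(timezone.utc).hour

    # Stable numeric encodings for categorical fields: hash, then bucket
    geo_hash = int(hashlib.sha256(event["country"].encode()).hexdigest(), 16) % 1000
    device_hash = int(hashlib.sha256(event["userAgent"].encode()).hexdigest(), 16) % 1000

    return [float(hour), float(geo_hash), float(device_hash)]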

False Positives Hurt Trust

  • Set your threshold carefully.
  • Do NOT auto-block suspicious events. Report them to an analyst instead.

Cost Efficiency

  • DynamoDB on-demand + Lambda is ridiculously cheap.
  • If you have sudden bursts of traffic, SageMaker asynchronous inference can save a ton of money compared to keeping a real-time endpoint scaled for peak.

Security Hygiene

  • Lock down the Lambda’s IAM role strictly (allow-list it): grant it only the ability to invoke the SageMaker endpoint and write to the DynamoDB table.

No wildcards in IAM policies, ever.
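As a sketch, the execution role’s inline policy could be scoped down to exactly the two actions the handler needs; the account ID, region, and resource names below are placeholders (the usual CloudWatch Logs permissions still come from the basic execution role):

import json
import boto3

iam = boto3.client("iam")

# Least-privilege inline policy for the Lambda's execution role (ARNs are placeholders)
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sagemaker:InvokeEndpoint",
            "Resource": "arn:aws:sagemaker:us-east-1:123456789012:endpoint/ids-anomaly-detector",
        },
        {
            "Effect": "Allow",
            "Action": "dynamodb:PutItem",
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/ids-events",
        },
    ],
}

iam.put_role_policy(
    RoleName="ids-lambda-role",        # placeholder role name
    PolicyName="ids-least-privilege",
    PolicyDocument=json.dumps(policy),
)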


Frequently Asked Questions

  • Question: Can I implement this without using SageMaker?
    • Yes. If your model is not large, it can be packaged with Lambda. However, if your model is large, SageMaker provides the flexibility to train and auto-scale the model to match your traffic needs.
  • Question: If my application is not in AWS, can I still log the events to EventBridge?
    • Yes, you can send logs into EventBridge via the API or drop them into S3; the pipeline does not have to be 100% AWS-native.
  • Question: What is the cost of this approach?
    • With moderate traffic:

      • Lambda—pennies per million invokes
      • DynamoDB—on-demand can be a few dollars
      • SageMaker endpoint—approximately $0.05/hr for very small instances

      That is very affordable compared to traditional IDS solutions.

  • Question: Won’t anomaly detection create too many false positives for normal traffic?
    • It certainly can. This is the reason why thresholds and feedback loops are important. Add a regular analyst review of the alerts, and you can tune your model iteratively until appropriate thresholds are set.

Summary

The takeaway: you can build an intrusion detection system with AWS Lambda, DynamoDB, and SageMaker that is serverless, scalable, and intelligent.

  • No server to babysit.
  • Costs are tied to usage.
  • The ML models can learn and adapt to new attack patterns.
  • Alerts plug natively into the rest of your AWS security ecosystem.

In all honesty, it feels good to depart from the old static IDS world, where we fought against false positives and spent our weekends applying patches to physical boxes.

Now, a question for you to consider: if you had this setup in place, how would you extend it? Would you add automated blocking to the flow, or monitor for a while before taking that active step?

That is the nice part about building it yourself: you choose the balance between a fully automated process and one that keeps a human apprised of what is happening.


Written by bharathreddyjanumpally | I craft scalable systems and solutions, turning complex challenges into seamless innovations that drive impact.
Published by HackerNoon on 2025/09/24