WTF is AWS Traffic Mirroring?

by Elankumaran Srinivasan, March 16th, 2020

Too Long; Didn't Read

This article explains the different ways to replicate traffic in AWS, along with the pros and cons of each approach. In the author's experience, AWS VPC Traffic Mirroring is the best approach for replicating traffic for applications deployed on AWS. The benefits of replication go beyond monitoring and troubleshooting: the replicated data can also be used for data analysis, for analyzing traffic patterns, and for detecting security vulnerabilities and attacks on the system. A third-party alternative is GoReplay, which can be installed on an EC2 instance.

What Is Traffic Mirroring (aka Replication)?

Traffic replication is the process of making copies of incoming traffic. The copy can cover all traffic coming into a particular host, or only the traffic arriving on one or more specific ports on that host.

Fig 1: Traffic Replication

Why Replicate Traffic?

For any web application, it's now the norm to have separate environments for Development, Testing/Staging/Pre-Production, and Production. The utilization of the application in each environment varies, and the traffic pattern on each is different.

Fig 2: Application Environments

Fig 3: Environment Utilization

Even after all the performance and automated tests, it's still possible for new bugs to pop up in the production environment.

Common causes of these bugs:

  1. Use cases that were not accounted for during development.
  2. Invalid or new data in the incoming request.
  3. A certain traffic pattern or sequence of flows causing an issue.

Now, what if we had an environment (say, pre-production) to which the replicated production traffic could be sent? This environment would receive a wide variety of requests and follow a traffic pattern similar to production. That makes it easier to validate new code before pushing it to production, and to troubleshoot issues that pop up only in the production environment and are hard to reproduce in non-prod environments.

Fig 4: Replicate Production Traffic

How To Mirror (Replicate) Traffic In AWS?

In this section I will explain the different ways to replicate traffic in AWS along with the pros & cons of each approach.

Approach #1 - Traffic Replication Using AWS VPC Mirroring

AWS has a VPC Traffic Mirroring service which allows one to replicate traffic arriving at an ENI (elastic network interface) to another ENI or to a Network Load Balancer.

Fig 5: Architecture: VPC Traffic Mirroring

The source for traffic mirroring is an ENI. Load balancers are themselves backed by multiple ENIs (the number depends on the scale of incoming traffic), so it's possible to replicate traffic directly from the load balancers instead of hooking into the ENIs of the target hosts.

AWS takes the source traffic packet and encapsulates it in a VXLAN packet, with the original packet placed in the VXLAN payload. The target receiving the replicated packet has to unwrap the VXLAN packet and extract the original packet for processing.
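To make the unwrap step concrete, here is a minimal sketch in Go that uses only the standard library. It assumes the standard VXLAN layout from RFC 7348 (an 8-byte header in front of the inner Ethernet frame) and that the mirrored packets arrive as UDP datagrams on port 4789. The function name `decodeVXLAN` and the sample bytes are made up for illustration; this is not taken from the author's script.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// vxlanHeaderLen is the fixed size of a VXLAN header (RFC 7348).
const vxlanHeaderLen = 8

// decodeVXLAN splits a UDP payload into the VXLAN Network Identifier (VNI)
// and the encapsulated inner frame. It returns an error when the payload is
// too short or the "VNI present" flag is not set.
func decodeVXLAN(udpPayload []byte) (vni uint32, inner []byte, err error) {
	if len(udpPayload) < vxlanHeaderLen {
		return 0, nil, errors.New("payload shorter than VXLAN header")
	}
	// Byte 0 carries the flags; bit 0x08 (the I flag) must be set.
	if udpPayload[0]&0x08 == 0 {
		return 0, nil, errors.New("VXLAN I flag not set")
	}
	// Bytes 4-6 hold the 24-bit VNI; byte 7 is reserved, so shift it away.
	vni = binary.BigEndian.Uint32(udpPayload[4:8]) >> 8
	return vni, udpPayload[vxlanHeaderLen:], nil
}

func main() {
	// Hypothetical sample: flags=0x08, VNI=0x0004D2, then a fake inner frame.
	packet := []byte{0x08, 0, 0, 0, 0x00, 0x04, 0xD2, 0x00}
	packet = append(packet, []byte("inner-frame")...)

	vni, inner, err := decodeVXLAN(packet)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("VNI=%d inner=%q\n", vni, inner)
}
```

In a real receiver the inner bytes are an Ethernet frame, so getting back to the original HTTP request still requires parsing the Ethernet, IP, and TCP layers, which this sketch leaves out.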

Fig 6: VXLAN unwrap & Replay

Go Script to Process Replicated Traffic:
Download script: https://github.com/elang2/AWS-Traffic-Replication/blob/master/aws-traffic-replication.go

Fig 7: GoLang Script - Part 1

Fig 8: GoLang Script - Part 2

Fig 9: GoLang Script - Part 3

Pros:

  • Non-intrusive way of replicating traffic in an AWS environment. No need to install additional tools or packages on any host/EC2 instance.
  • Scalable
  • Live, low latency traffic replication

Approach #2 - Traffic Replication Via A Third-Party Tool

GoReplay is an excellent tool to replicate traffic. The tool can be installed on an EC2 instance and can be configured to replicate traffic to any target.

Pros:

  • Simple, easy, and quick to set up and configure.
  • Provides the ability to filter traffic, rewrite requests, and replay them as needed.

Cons:

  • Intrusive approach. GoReplay needs to be installed and configured on the EC2 instances that are the source of replication.
  • GoReplay will compete for CPU and memory on the host machines, which may indirectly affect application performance.

Conclusion

Traffic replication is an effective way of maintaining a near-production-like environment with minimal effort. The benefits of replication go beyond monitoring and troubleshooting issues. The replicated information can also be used for data analysis, for analyzing traffic patterns, and for detecting security vulnerabilities and attacks on the system.

In my opinion, and based on personal experience, AWS VPC Traffic Mirroring is the best approach for replicating traffic for applications deployed on AWS.