In this article we will be building the stack of resources needed to run the application we prepared and containerized in the first part:
Our backend (app.py) is a Flask application that simulates an expensive computation and returns it in a formatted…hackernoon.com
The application source and CloudFormation stack file can be found here:
Independently Scalable Multi-Container Microservices Architecture on AWS Fargate - docwhite/ecsfsgithub.com
The ideas and requirements for this application, ecsfs, are:
- (1) Backend and frontend services are not to be public-facing.
- (2) Nginx sits at the front, publicly accessible.
- (3) Nginx proxies requests to the frontend.
- (4) The frontend requests data from the backend.
- (5) The backend auto-scales (since it performs expensive computations).
In order to address (1) and (2) we will require private and public subnets. To forward all traffic to the nginx server (2) we will set up an Application Load Balancer (ALB). For the communication between components (3) (4) we will make use of service discovery.
As we will be using ECS (Elastic Container Service), all the Fargate services will need to pull images from Docker Hub. That means outgoing traffic from the containers in the private network (1) must be routed through a NAT, otherwise the Internet is not reachable.
Let us have a look at the complete diagram:
I broke down this diagram and explained each piece separately following this structure: VPC and subnets, networking and routes, security groups, how to configure the load balancer, defining our services using ECS Fargate, setting up the auto-scaling and finally stressing our application to see the scaling happen.
You can skip sections if you are already familiar with them.
How to deploy CloudFormation stacks
Either using command line or from the web console.
Using Command Line
First you will need to install and configure the AWS CLI. You can follow the docs:
TL;DR Basically you will need to install it with pip:
pip install awscli --upgrade --user
And configure it (specify your default region too so you don’t have to type it on each subsequent command):
aws configure
To actually deploy the stack you have two choices (a) and (b)…
a) If you have no hosted zones set up (associated with a Route 53 domain):
aws cloudformation create-stack \
--stack-name ecsfs \
--template-body file://$(pwd)/stack.yaml \
--capabilities CAPABILITY_NAMED_IAM
b) If you have a hosted zone, you can pass it in and the application will be available under the subdomain ecsfs.<your-hosted-zone-name>, e.g. ecsfs.example.com. Simply pass the parameter flag instead of leaving it empty:
aws cloudformation create-stack \
--stack-name ecsfs \
--template-body file://$(pwd)/stack.yaml \
--capabilities CAPABILITY_NAMED_IAM \
--parameters ParameterKey=HostedZoneName,ParameterValue=example.com.
(!) the trailing dot is needed ^
Using AWS Web Console
Log in into your account and then visit the CloudFormation section. Then:
- Click the Create Stack button.
- Click on Choose File and upload the stack.yaml file.
- Give the Stack a name: ecsfs.
- In the parameters section you will see HostedZoneName. It is up to you whether you want to use one of your hosted zones (domains), for instance foo.com. (do not forget the trailing dot), so that the application gets configured to run on a subdomain of it, like ecsfs.foo.com. You can also leave it empty.
- Click Next.
- Click Next one more time.
- On the Capabilities section check the box I acknowledge that…
Deleting all the resources that have been created
Either from the web console or from the CLI. To delete it from the web console, go to the CloudFormation section, select the stack and delete it there. The command line equivalent is:
aws cloudformation delete-stack --stack-name ecsfs
Stack Parameters, Conditions and Outputs
In the stack file apart from resources we define parameters, conditions and outputs.
Parameters are variables the user can provide when deploying the stack that can be accessed within the resource definitions.
Conditions allow us to define boolean variables that we can use to conditionally build some resources. You set a condition on a resource by adding the Condition property to its definition.
In our case, outputs give us a quick way to expose attributes or any other information from the stack, so we can for instance get a handle to some of the resources we constructed (such as the DNS name of our application load balancer).
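As a minimal sketch of how these three sections fit together (all names except HostedZoneName are illustrative; the actual stack.yaml may differ):

```yaml
Parameters:
  HostedZoneName:
    Type: String
    Default: ''
    Description: (Optional) Hosted zone to create the ecsfs subdomain in.

Conditions:
  # True only when the user supplied a hosted zone name.
  HasHostedZoneName: !Not [!Equals [!Ref HostedZoneName, '']]

Outputs:
  LoadBalancerDNSName:
    Description: Public DNS name of the application load balancer.
    Value: !GetAtt PublicLoadBalancer.DNSName
```

A resource opts into a condition by declaring Condition: HasHostedZoneName at its top level.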
Virtual Private Cloud (VPC)
A VPC is simply a logically isolated chunk of the AWS Cloud.
Our VPC has two public subnetworks since it’s a requirement for an Application Load Balancer. The nginx container will use them too.
Then we will isolate backend and frontend to a private subnet so they can’t be reached directly from the Internet.
You will see the word CIDR in various places; it is used for subnet masking.
CIDR blocks describe the Network ID and the IP ranges to assign in our subnets. They basically tell what part of the address is reserved for IPs and what part is for the network ID.
E.g. 10.0.0.0/24 would mean that the first 3 octets (3 x 8 = 24) exclusively define the Network ID, so all the IPs that are given out start with 10.0.0.x.
This video explains it very well: IPv4 Addressing: Network IDs and Subnet Masks
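For illustration, the VPC and its subnets could be declared as follows (CIDR blocks and resource names are illustrative; check stack.yaml for the real values):

```yaml
Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16          # room for 65k addresses across subnets

  PublicSubnetOne:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.0.0/24          # IPs handed out look like 10.0.0.x
      AvailabilityZone: !Select [0, !GetAZs '']
      MapPublicIpOnLaunch: true       # instances here get public IPs

  PrivateSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.2.0/24          # IPs handed out look like 10.0.2.x
      AvailabilityZone: !Select [0, !GetAZs '']
```

An ALB requires at least two public subnets in different availability zones, so the real stack declares a second public subnet analogous to the first.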
Networking Setup: Routing and Subnetting
Let’s revisit the main elements that make up our network and how we are going to use them in our application.
Internet Gateway
Allows communication between the containers and the Internet. All the outbound traffic goes through it. In AWS it must be attached to a VPC.
All requests from instances running on the public subnet must be routed to the internet gateway. This is done by defining routes laid down in route tables.
Network Address Translation (NAT) Gateway
When an application is running on a private subnet it cannot talk to the outside world. The NAT Gateway remaps the IP address of the packets sent from the private instance, assigning them a public IP. When the service the instance wants to talk to replies, the NAT can receive the information (since the NAT itself is public-facing and reachable from the Internet) and hand it back to the private instance.
An Elastic IP needs to be associated with each NAT Gateway we create.
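A sketch of the NAT gateway with its Elastic IP (resource names are illustrative):

```yaml
  NatGatewayEIP:
    Type: AWS::EC2::EIP
    Properties:
      Domain: vpc

  NatGateway:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NatGatewayEIP.AllocationId
      SubnetId: !Ref PublicSubnetOne   # the NAT itself lives in a public subnet
```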
The reason we route the private tasks’ traffic through a NAT is so the tasks can pull images from Docker Hub while staying protected: connections cannot be initiated from the Internet, only outbound traffic is allowed through the NAT.
Routes and Route Tables
Route tables gather together a set of routes. A route describes where packets need to go based on rules. You can for instance send any packets with a destination address outside the VPC (0.0.0.0/0) to a NAT gateway or an internet gateway, while traffic between subnets stays local. Both inbound and outbound routes can be described.
The way we associate a route table with a subnet is by using Subnet Route Table Association resources, pretty descriptive.
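Putting it together, the private subnet’s routing might look like this (a sketch assuming resources named VPC, NatGateway and PrivateSubnet exist in the template; the actual names in stack.yaml may differ):

```yaml
  PrivateRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC

  PrivateDefaultRoute:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable
      DestinationCidrBlock: 0.0.0.0/0   # anything not local...
      NatGatewayId: !Ref NatGateway     # ...goes out through the NAT

  PrivateSubnetRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTable
      SubnetId: !Ref PrivateSubnet
```

The public subnets get an analogous route table whose default route points at the internet gateway instead.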
Security Groups
Security groups act as firewalls for the inbound and outbound communications of the instances we run.
We need to create one security group shared by all containers running on Fargate and another one for allowing traffic between the load balancer and the containers.
The stack has one security group with two ingress (inbound traffic) rules:
- To allow traffic coming from the Application Load Balancer (PublicLoadBalancerSecurityGroup)
- To allow traffic between running containers (FargateContainerSecurityGroup)
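As a sketch, the container security group and its two ingress rules could be declared like this (PublicLoadBalancerSecurityGroup is the name used above; the other names are illustrative):

```yaml
  FargateContainerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Access to the Fargate containers
      VpcId: !Ref VPC

  IngressFromPublicALB:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref FargateContainerSecurityGroup
      IpProtocol: -1                    # all protocols and ports
      SourceSecurityGroupId: !Ref PublicLoadBalancerSecurityGroup

  IngressFromSelf:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref FargateContainerSecurityGroup
      IpProtocol: -1
      SourceSecurityGroupId: !Ref FargateContainerSecurityGroup
```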
Application Load Balancer (ALB)
The Application Load Balancer (ALB) is the single point of contact for clients (users). Its function is to relay each request to the right running task (think of a task as an instance for now).
In our case all requests on port 80 are forwarded to the nginx task.
To configure a load balancer we need to specify a listener and a target group. The listener is described through rules, where you can specify different targets to route to based on port or URL. The target group is the set of resources that would receive the routed requests from the ALB.
This target group is managed by Fargate: every time a new nginx task spins up it gets registered automatically in the group, so we don’t have to worry about adding instances to the target group at all.
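A sketch of the load balancer, its target group and the listener tying them together (resource names are illustrative):

```yaml
  PublicLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Scheme: internet-facing
      Subnets: [!Ref PublicSubnetOne, !Ref PublicSubnetTwo]
      SecurityGroups: [!Ref PublicLoadBalancerSecurityGroup]

  NginxTargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      VpcId: !Ref VPC
      Port: 80
      Protocol: HTTP
      TargetType: ip        # required for Fargate tasks (awsvpc networking)

  PublicListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
      LoadBalancerArn: !Ref PublicLoadBalancer
      Port: 80
      Protocol: HTTP
      DefaultActions:
        - Type: forward
          TargetGroupArn: !Ref NginxTargetGroup
```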
Automatically distribute incoming traffic across multiple targets using an Application Load Balancer.docs.aws.amazon.com
If a hosted zone got specified when running this stack, we create a subdomain on that zone and route it to the load balancer. For instance, say example.com. is specified as HostedZoneName, then all the traffic going to ecsfs.example.com would go to the load balancer.
Elastic Container Service (ECS)
ECS is a container management system. It basically removes the headache of having to set up and provision separate management infrastructure such as Kubernetes or similar.
You define your application in ECS through task definitions. They act as blueprints which describe what containers to use, which ports to open, what launch type to use (EC2 instances or Fargate), and what memory and CPU requirements are needed.
Then there is the service, which is responsible for taking those task definitions and generating and managing running processes from them in a cluster. Those running processes instantiated by the service are called tasks.
— A cluster is a grouping of resources: services, task definitions, etc…
— On a task definition…
- … you can describe one or more containers for it to run.
- … you specify desired CPU and memory needed to run that process.
— A service takes a task definition and instantiates it into running tasks.
— Task definitions and services are configured per-cluster.
— Tasks run in a cluster.
— Auto-scaling is configured on the service-level.
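The bullets above can be sketched as a task definition plus a service. The image name and container port below are hypothetical, and ECSTaskExecutionRole stands for the execution role discussed later; the actual stack.yaml may differ:

```yaml
  BackendTaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      RequiresCompatibilities: [FARGATE]
      NetworkMode: awsvpc             # each task gets its own network interface
      Cpu: 256
      Memory: 512
      ExecutionRoleArn: !Ref ECSTaskExecutionRole
      ContainerDefinitions:
        - Name: backend
          Image: docwhite/ecsfs-backend   # hypothetical Docker Hub image
          PortMappings:
            - ContainerPort: 5000         # hypothetical Flask port

  BackendService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref Cluster
      LaunchType: FARGATE
      DesiredCount: 1
      TaskDefinition: !Ref BackendTaskDefinition
      NetworkConfiguration:
        AwsvpcConfiguration:
          Subnets: [!Ref PrivateSubnet]
          SecurityGroups: [!Ref FargateContainerSecurityGroup]
```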
Amazon Elastic Container Service (Amazon ECS) is a highly scalable, fast, container management service that makes it…docs.aws.amazon.com
Logging
Tasks within our cluster write out their logs under the same log group. There is one log stream per running task. An aggregated view is available in the web console on the page for the service the task is part of.
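A sketch of the log group, together with the container-level configuration that points at it (group name and prefix are illustrative):

```yaml
  LogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: ecsfs
      RetentionInDays: 7

  # Inside each container definition of a task definition, the awslogs
  # driver sends that container's output to the group above:
  #     LogConfiguration:
  #       LogDriver: awslogs
  #       Options:
  #         awslogs-group: ecsfs
  #         awslogs-region: !Ref AWS::Region
  #         awslogs-stream-prefix: ecsfs
```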
IAM Roles
We need to allow Fargate to perform specific actions on our behalf.
- ECS Task Execution Role: This role enables AWS Fargate to pull container images from Amazon ECR and to forward logs to Amazon CloudWatch Logs.
- ECS Auto Scaling Role: Role needed to perform the scaling operations on our behalf, that is, to change the desired count of running tasks on the services.
With IAM roles for Amazon ECS tasks, you can specify an IAM role that can be used by the containers in a task…docs.aws.amazon.com
If you are still confused by the various roles used on ECS Fargate, I recommend reading the top-voted answer here:
I am trying to set up a ECS but so far I have encountered a few permission issue for which I have created some…serverfault.com
Service Discovery
In our application we want the backend to be reachable at ecsfs-backend.local, the frontend at ecsfs-frontend.local, etc. You can see the names are suffixed with .local. In AWS we can create a Private DNS Namespace resource and register services in it, which produces the aforementioned names.
By creating various DNS names under the same namespace, the services that get assigned those names can talk to each other, i.e. the frontend talking to the backend, or nginx to the frontend.
The IP addresses for each service task are dynamic, they change, and sometimes more than one task might be running for the same service… so… how do we associate the DNS name with the right task? 🤔 Well, we don’t! Fargate does it all for us.
There is a whole section on the documentation explaining it in detail:
Your Amazon ECS service can optionally be configured to use Amazon ECS Service Discovery. Service discovery uses AWS…docs.aws.amazon.com
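A sketch of the namespace and one discovery service registered in it (names are illustrative):

```yaml
  PrivateNamespace:
    Type: AWS::ServiceDiscovery::PrivateDnsNamespace
    Properties:
      Name: local                 # names resolve as <service>.local
      Vpc: !Ref VPC

  BackendDiscoveryService:
    Type: AWS::ServiceDiscovery::Service
    Properties:
      Name: ecsfs-backend         # resolvable as ecsfs-backend.local
      NamespaceId: !Ref PrivateNamespace
      DnsConfig:
        DnsRecords:
          - Type: A
            TTL: 60
```

The ECS service then links itself to the discovery service through its ServiceRegistries property, and Fargate keeps the DNS records pointed at the right task IPs.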
The application load balancer routes the requests to the nginx service, therefore we need to wait for the ALB to be initialized before we can actually spin up the nginx service (DependsOn property).
Auto-Scaling
We are only interested in scaling the backend. To scale a service you need to define a Scalable Target, where you specify which service you want to scale, and a Scaling Policy, where you describe how and when you want to scale it.
There are two modes when scaling a service. We use Target Tracking Scaling, in which you specify a target value for a metric (say for instance 75% CPU usage) and Fargate spins up more instances whenever the average across all the tasks running that service exceeds the threshold.
In our case we will scale the backend between 1 and 3 instances and we will specify a target CPU usage percentage of 50%.
Usually each service task emits metrics every minute. You can see these metrics on the CloudWatch page in the AWS web console. Use them to inspect how Fargate reacts when you stress the application.
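The scalable target and policy could be sketched as follows (AutoScalingRole stands for the auto-scaling role discussed earlier; other names are illustrative):

```yaml
  BackendScalableTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    Properties:
      ServiceNamespace: ecs
      ScalableDimension: ecs:service:DesiredCount
      ResourceId: !Sub service/${Cluster}/${BackendService.Name}
      MinCapacity: 1                 # scale between 1...
      MaxCapacity: 3                 # ...and 3 running tasks
      RoleARN: !GetAtt AutoScalingRole.Arn

  BackendScalingPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: BackendCPUTargetTracking
      PolicyType: TargetTrackingScaling
      ScalingTargetId: !Ref BackendScalableTarget
      TargetTrackingScalingPolicyConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ECSServiceAverageCPUUtilization
        TargetValue: 50              # aim for 50% average CPU usage
```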
Stressing Our Application
You can use the ab Unix command (Apache Benchmark) to send many requests to your application load balancer and watch Fargate scale up the backend service.
First go to the web console under the EC2 page and look for the Load Balancers category.
In there look for the DNS name. You can also click the Outputs tab from the CloudFormation stack to see that URL. It should look like:
Then run the following command to stress the application. It will perform 10,000 requests, one at a time, printing all the responses.
ab -n 10000 -c 1 -v 3 http://<application_load_balancer_dns_name>/
I noticed that CloudWatch waits until it has 3 consecutive measurements (metric points) exceeding the specified target value (50%). It is then that it sets off an alarm and Fargate reacts by adding extra running tasks until the metrics stabilize.
If the CPU then decreases and not as many running tasks are needed anymore, it waits some minutes (around 15 metric points, that is, 15 minutes) before starting to scale down.
What I hope readers take away from this guide is not the way to deploy services on Fargate, but one more way to deploy them.
I also wrote this article for myself, to come back to if I ever have to reproduce a similar structure, so I use it as a sort of skeleton or boilerplate. I wanted to share it with you in case it serves the same purpose for you.
I can think of three questions you might have after seeing this architecture, so let me answer them beforehand just in case 🙂.
Is the NAT really necessary?
For this particular case it is not: if you stored your Docker images in Amazon ECR instead of Docker Hub, you would not need to provide outbound internet access to the private tasks.
But if, say, the frontend performed requests to external APIs out there (like polling tweets from Twitter), then you would need the NAT.
The NAT provides outbound access for your tasks while retaining their isolation from the outside world, so no one out there can initiate connections with them.
Why would you place the frontend on a private network?
It would make sense to have it on the public side, yes. I guess I placed it in a private subnet for two reasons:
- For security. So nobody messes with them directly.
- IPs are a limited resource. If in the future I scaled the frontend tasks to very high counts, each running task would consume a public IP if it sat on the public side.
Why bother with nginx, wouldn’t the load balancer do the job?
I guess in that case, yes. I did it because it is common in real architectures and because nginx offers more features than just forwarding requests.
I would like to link to Nathan Peck’s articles and specifically point you to the one that helped me get started as it’s very well written and informative. I built my project following his skeleton and expanded it for my particular case.