In this article we will build the stack of resources needed to run the application we prepared and containerized in the first part:

> *Independently Scalable Multi-Container Microservices Architecture on AWS Fargate (I)* (hackernoon.com): "Our backend (app.py) is a Flask application that simulates an expensive computation and returns it in a formatted…"

The application source and CloudFormation stack file can be found here:

> *docwhite/ecsfs* (github.com): "Independently Scalable Multi-Container Microservices Architecture on AWS Fargate"

## Overview

The ideas and requirements for this application, `ecsfs`, are:

1. Backend and frontend services are not to be public-facing.
2. Nginx sits at the front, publicly accessible, and proxies requests to the frontend.
3. The frontend requests data from the backend.
4. The backend is auto-scalable (since it performs expensive computations).

To address (1) and (2) we will need private and public subnets. To forward all traffic to the nginx server (2) we will set up an Application Load Balancer (ALB). For the communication between components (3) (4) we will make use of service discovery.

Since we will be using ECS (Elastic Container Service), all the Fargate services will need to pull images from Docker Hub. That means outgoing traffic from the containers in the private network (1) must be routed through a NAT; otherwise the Internet is not reachable.

Let us have a look at the complete diagram:

I broke the diagram down and explain each piece separately, following this structure: VPC and subnets, networking and routes, security groups, how to configure the load balancer, defining our services using ECS Fargate, setting up the auto-scaling, and finally stressing our application to see the scaling happen. You can skip the sections you are already familiar with.

## How to Deploy CloudFormation Stacks

You can deploy either from the command line or from the web console.

### Using the Command Line

First you will need to install and configure the AWS CLI.
You can follow the docs:

- https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html
- https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html

TL;DR: install it with pip:

```shell
pip install awscli --upgrade --user
```

And configure it (specify your default region too, so you don't have to type it on each subsequent command):

```shell
aws configure
```

To actually deploy the stack you have two choices, (a) and (b):

a) If you have no hosted zone set up (associated with a Route 53 domain):

```shell
aws cloudformation create-stack \
  --stack-name ecsfs \
  --template-body file://$(pwd)/stack.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameters ParameterKey=HostedZoneName,ParameterValue=
```

b) If you have a hosted zone, you can pass it in and the application will be available under the subdomain `ecsfs.<your-hosted-zone-name>`, e.g. `ecsfs.example.com`. Simply fill in the parameter flag instead of leaving it empty:

```shell
--parameters ParameterKey=HostedZoneName,ParameterValue=foo.com.
```

(!) Note: the trailing dot is needed.

### Using the AWS Web Console

Log in to your account and visit the CloudFormation section. Then:

1. Click the Create Stack button.
2. Click Choose File and upload the `stack.yaml` file.
3. Give the stack a name: `ecsfs`.
4. In the parameters section you will see `HostedZoneName`. It is up to you whether to use one of your hosted zones (domains), for instance `foo.com.` (do not forget the trailing dot); the application would then be configured to run on a subdomain of it, like `ecsfs.foo.com`. You can also leave it empty.
5. Click Next.
6. Click Next one more time.
7. In the Capabilities section, check the box that reads "I acknowledge that…".

## Deleting All the Resources That Have Been Created

You can delete everything either from the web console or from the CLI. From the web console, go to the CloudFormation section, select the stack and delete it there.
The command line equivalent is:

```shell
aws cloudformation delete-stack --stack-name ecsfs
```

## Stack Parameters, Conditions and Outputs

In the stack file, apart from resources, we define parameters, conditions and outputs.

**Parameters** are variables the user can provide when deploying the stack; they can be accessed within the resource definitions.

**Conditions** let us define boolean variables that we can use to conditionally build some resources. You make a resource conditional by adding the `Condition` property under that resource in the YAML.

**Outputs** give us a quick way to expose attributes or any other information from the stack, so we can for instance get handles to some resources we constructed (such as the DNS name of our application load balancer).

## Virtual Private Cloud (VPC)

A VPC is simply a logically isolated chunk of the AWS cloud. Our VPC has two public subnets, since that is a requirement for an Application Load Balancer. The nginx container will use them too. The backend and frontend are isolated in a private subnet so they cannot be reached directly from the Internet.

You will see the word CIDR in various places; it is used for subnet masking. A CIDR block describes the network ID and the range of IPs to hand out in a subnet: it tells which part of the address identifies the network and which part is available for host IPs. E.g. 10.0.0.0/24 means the first 3 octets (3 × 8 = 24 bits) define the network ID, so every IP given out starts with 10.0.0.x. This video explains it very well:

> IPv4 Addressing: Network IDs and Subnet Masks

## Networking Setup: Routing and Subnetting

Let's go over the main elements that make up a subnet and how we are going to use them in our application.

### Internet Gateway

An internet gateway allows communication between the containers and the Internet. All the outbound traffic goes through it. In AWS it must be attached to a VPC.
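As a rough sketch of how these three sections fit together in a CloudFormation template (the logical names used here, such as `HasHostedZone` and `PublicLoadBalancer`, are illustrative and may not match the actual `stack.yaml`):

```yaml
Parameters:
  HostedZoneName:
    Type: String
    Default: ""
    Description: (Optional) Route 53 hosted zone name, with trailing dot.

Conditions:
  # True only when the user supplied a hosted zone name.
  HasHostedZone: !Not [!Equals [!Ref HostedZoneName, ""]]

Outputs:
  LoadBalancerDNS:
    Description: Public DNS name of the application load balancer.
    Value: !GetAtt PublicLoadBalancer.DNSName
```

A resource carrying `Condition: HasHostedZone` (such as the Route 53 record set) is only created when the parameter is non-empty.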
All requests from instances running in the public subnets must be routed to the internet gateway. This is done by defining routes, laid down in route tables.

### Network Address Translation (NAT) Gateway

An application running in a private subnet cannot talk to the outside world on its own. The NAT gateway remaps the IP address of the packets sent from a private instance, assigning them a public IP, so that when the service the instance is talking to replies, the NAT can receive the response (the NAT itself is public-facing and reachable from the Internet) and hand it back to the private instance. An Elastic IP needs to be associated with each NAT gateway we create.

The reason we route the private tasks' traffic through a NAT is so the tasks can pull images from Docker Hub while remaining protected: connections cannot be initiated from the Internet towards them, only outbound traffic is allowed through the NAT.

### Routes and Route Tables

Route tables gather together a set of routes. A route describes where packets need to go based on rules: you can, for instance, send packets whose destination address starts with 10.0.4.x to a NAT, while packets addressed to 10.0.5.x go to another NAT or to an internet gateway. You can describe both inbound and outbound routes. A route table is associated with a subnet through a Subnet Route Table Association resource, which is pretty descriptive.

## Security

Security groups act as firewalls for the inbound and outbound communications of the instances we run. We need one security group shared by all containers running on Fargate and another one to allow traffic between the load balancer and the containers.
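The NAT gateway, its Elastic IP, and the private routing described above might be wired together like this (logical names such as `PublicSubnetOne` and `PrivateSubnet` are illustrative, not necessarily those in the actual stack):

```yaml
  NatGatewayEIP:
    Type: AWS::EC2::EIP
    Properties:
      Domain: vpc

  NatGateway:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NatGatewayEIP.AllocationId
      SubnetId: !Ref PublicSubnetOne   # the NAT itself lives in a public subnet

  PrivateRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC

  # Send all Internet-bound traffic from the private subnet through the NAT.
  PrivateDefaultRoute:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGateway

  PrivateSubnetRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      RouteTableId: !Ref PrivateRouteTable
      SubnetId: !Ref PrivateSubnet
```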
The stack has one security group with two ingress (inbound traffic) rules:

1. To allow traffic coming from the Application Load Balancer (`PublicLoadBalancerSecurityGroup`).
2. To allow traffic between running containers (`FargateContainerSecurityGroup`).

## Load Balancer

The Application Load Balancer (ALB) is the single point of contact for clients (users). Its function is to relay each request to the right running task (think of a task as an instance for now). In our case, all requests on port 80 are forwarded to the nginx task.

To configure a load balancer we need to specify a listener and a target group. The listener is described through rules, where you can specify different targets to route to based on port or URL. The target group is the set of resources that receive the requests routed by the ALB.

This target group will be managed by Fargate: every time a new instance of nginx spins up, it is registered automatically in the group, so we don't have to worry about adding instances to the target group at all.

> *What Is an Application Load Balancer?* (Elastic Load Balancing, docs.aws.amazon.com): "Automatically distribute incoming traffic across multiple targets using an Application Load Balancer."

If a hosted zone was specified when deploying this stack, we create a subdomain on that zone and route it to the load balancer. For instance, say `example.com` is specified as `HostedZoneName`; then all the traffic going to `ecsfs.example.com` would go to the load balancer.

## Elastic Container Service (ECS)

ECS is a container management system. It removes the headache of having to set up and provision a separate management infrastructure such as Kubernetes.

You define your application in ECS through *task definitions*. They act as blueprints which describe what containers to use, what ports to open, what launch type to use (EC2 instances or Fargate), and what memory and CPU requirements are needed.
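A Fargate task definition for the nginx container might look roughly like this (the image name and the logical names like `ECSTaskExecutionRole` and `LogGroup` are illustrative, not the actual stack's values):

```yaml
  NginxTaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: ecsfs-nginx
      RequiresCompatibilities: [FARGATE]
      NetworkMode: awsvpc              # required for Fargate tasks
      Cpu: "256"
      Memory: "512"
      ExecutionRoleArn: !Ref ECSTaskExecutionRole
      ContainerDefinitions:
        - Name: nginx
          Image: someuser/ecsfs-nginx  # hypothetical Docker Hub image
          PortMappings:
            - ContainerPort: 80
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: !Ref LogGroup
              awslogs-region: !Ref AWS::Region
              awslogs-stream-prefix: nginx
```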
Then you have the *service*, which is responsible for taking those task definitions and instantiating and managing running processes from them in a *cluster*. The running processes instantiated by the service are called *tasks*.

Summarised:

- A cluster is a grouping of resources: services, task definitions, etc.
- In a task definition you can describe one or more containers to run, and you specify the CPU and memory needed to run them.
- A service takes a task definition and instantiates it into running tasks.
- Task definitions and services are configured per cluster.
- Tasks run in a cluster.
- Auto-scaling is configured at the service level.

> *What is Amazon Elastic Container Service?* (docs.aws.amazon.com): "Amazon Elastic Container Service (Amazon ECS) is a highly scalable, fast, container management service that makes it…"

## Cluster Logging

We write out the logs from all tasks in our cluster under the same log group. There is one log stream per running task. An aggregated view can be seen from the web console on the page for the service the task is part of.

## IAM Roles

We need to allow Fargate to perform specific actions on our behalf:

- **ECS Task Execution Role**: enables AWS Fargate to pull container images from Amazon ECR and to forward logs to Amazon CloudWatch Logs.
- **ECS Auto Scaling Role**: needed to perform the scaling operations on our behalf, that is, to change the desired count of running tasks on the services.
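Putting the pieces together, an ECS service running the nginx task definition on Fargate could be sketched as follows (again, logical names such as `Cluster`, `NginxTaskDefinition` and `NginxTargetGroup` are illustrative):

```yaml
  NginxService:
    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref Cluster
      LaunchType: FARGATE
      TaskDefinition: !Ref NginxTaskDefinition
      DesiredCount: 1
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED     # nginx runs in the public subnets
          SecurityGroups: [!Ref FargateContainerSecurityGroup]
          Subnets: [!Ref PublicSubnetOne, !Ref PublicSubnetTwo]
      LoadBalancers:
        - ContainerName: nginx
          ContainerPort: 80
          TargetGroupArn: !Ref NginxTargetGroup
```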
> *IAM Roles for Tasks* (Amazon Elastic Container Service, docs.aws.amazon.com): "With IAM roles for Amazon ECS tasks, you can specify an IAM role that can be used by the containers in a task…"

If you are still confused by the various roles used on ECS Fargate, I recommend reading the top-voted answer here:

> *Confused by the role requirement of ECS Task Definitions* (serverfault.com): "I am trying to set up ECS but so far I have encountered a few permission issues for which I have created some…"

## Service Discovery

In our application we want the backend to be reachable at `ecsfs-backend.local`, the frontend at `ecsfs-frontend.local`, and so on. You can see the names are suffixed with `.local`. In AWS we can create a private DNS namespace resource and add services to it, which produces the aforementioned names, that is, `<service_name>.<private_dns_namespace>`.

By creating various DNS names under the same namespace, the services that get assigned those names can talk to each other, i.e. the frontend talking to a backend, or nginx to the frontend. The IP addresses of each service's tasks are dynamic: they change, and sometimes more than one task may be running for the same service… so… how do we associate the DNS name with the right task? 🤔 Well, we don't! Fargate does it all for us.

There is a whole section in the documentation explaining it in detail:

> *Service Discovery* (Amazon Elastic Container Service, docs.aws.amazon.com): "Your Amazon ECS service can optionally be configured to use Amazon ECS Service Discovery. Service discovery uses AWS…"

## Services

The application load balancer routes requests to the nginx service, so we need to wait for the ALB to be initialized before we can spin up the nginx service (the `DependsOn` property).

## Auto-Scaling

We are only interested in scaling the backend. To scale a service you need to define a Scalable Target, where you specify *what* service you want to scale, and a Scaling Policy, where you describe *how* and *when* you want to scale it.
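These two resources might look like this in CloudFormation (references such as `Cluster`, `BackendService` and `AutoScalingRole` are illustrative placeholders):

```yaml
  BackendScalableTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    Properties:
      ServiceNamespace: ecs
      ScalableDimension: ecs:service:DesiredCount
      ResourceId: !Sub service/${Cluster}/${BackendService.Name}
      MinCapacity: 1
      MaxCapacity: 3
      RoleARN: !GetAtt AutoScalingRole.Arn

  BackendScalingPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: ecsfs-backend-cpu50
      PolicyType: TargetTrackingScaling
      ScalingTargetId: !Ref BackendScalableTarget
      TargetTrackingScalingPolicyConfiguration:
        PredefinedMetricSpecification:
          PredefinedMetricType: ECSServiceAverageCPUUtilization
        TargetValue: 50.0
```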
There are two modes for scaling a service; we use Target Tracking Scaling, in which you specify a target value for a metric (say, 75% of CPU usage) and Fargate spins up more instances when the average across all the tasks running that service exceeds the threshold.

In our case we will scale the backend between 1 and 3 instances and specify a target CPU usage of 50%.

Each service task usually emits metrics every minute. You can see these metrics on the CloudWatch page of the AWS web console. Use them to inspect how Fargate reacts when you stress the application.

> *Target Tracking Scaling Policies* (Amazon Elastic Container Service, docs.aws.amazon.com): "With target tracking scaling policies, you select a metric and set a target value. Amazon ECS creates and manages the…"

## Stressing Our Application

You can use the Unix command `ab` (Apache Benchmark) to send many requests to your application load balancer and watch Fargate scale up the backend service.

First go to the web console, open the EC2 page and look for the Load Balancers category. In there, look for the DNS name. You can also check the Outputs tab of the CloudFormation stack to see that URL. It should look like:

```
http://ecsfs-loadb-1g27mx21p6h8d-1015414055.us-west-2.elb.amazonaws.com/
```

Then run the following command to stress the application. It performs 10,000 requests, one at a time (concurrency 1), printing all the responses:

```shell
ab -n 10000 -c 1 -v 3 http://<application_load_balancer_dns_name>/
```

*Fargate started 2 more backend tasks when I started stressing the application.*

I noticed that CloudWatch waits until it has 3 consecutive measurements (metric points) exceeding the specified target value (50%). It then raises an alarm, and Fargate reacts by adding extra running tasks until the metrics stabilize.

*Backend CPU utilisation (%) over time. Started stressing at 19:05.*
*Fargate started 2 more instances at 19:12.*

If the CPU then decreases and that many tasks are no longer needed, it waits some minutes (around 15 metric points, that is, 15 minutes) before starting to scale down.

## Wrapping Up

What I expect readers to take away from this guide is not *how to deploy services on Fargate* but *one more way to deploy them*.

I also wrote this article for myself, to come back to if I ever have to reproduce a similar structure, so I can use it as some sort of skeleton or boilerplate. I wanted to share it with you in case it serves you the same purpose.

I can think of three questions you might have after seeing this architecture, so let me answer them beforehand 🙂.

**Is the NAT really necessary?** For this particular case it is not: if you stored your Docker images in Amazon ECR instead of Docker Hub, you would not need to provide outbound Internet access to the private tasks. But if, say, the frontend performed requests to external APIs (like polling tweets from Twitter), then you would need the NAT. A NAT provides outbound traffic for your tasks while retaining their isolation from the outside world, so no one out there can initiate connections to them.

**Why would you place the frontend on a private network?** It would also make sense to have it on the public side, yes. I placed it in a private subnet for two reasons:

1. Security, so nobody messes with it directly.
2. IPs are limited resources. If in the future I scaled the frontend tasks to very high counts, each running task would need its own IP if they sat on the public side.

**Why bother with nginx, wouldn't the load balancer do the job?** In this case, yes. I did it because it is common in real architectures and because nginx offers more features than just forwarding requests.

## Similar Resources

I would like to link to similar write-ups, and specifically point you to the one that helped me get started, as it is very well written and informative.
I built my project following his skeleton and expanded it for my particular case.

**Nathan Peck's articles**

> *Building a Socket.io chat app and deploying it using AWS Fargate* (medium.com): "This article walks through the process of building a chat application, containerizing it, and deploying it using AWS…"