Amazon Elastic Container Service (ECS) Anywhere: A New ECS Function

Written by gokulchandrapr | Published 2021/06/03

TLDR ECS Anywhere and EKS Anywhere are designed to let users run their services on premises as well as in the cloud. Amazon ECS is the cloud-based, fully managed, highly scalable container orchestration service and control plane customers already use in AWS today. ECS Anywhere works with any compute unit as long as it runs the ECS agent and an AWS Systems Manager Agent (SSM Agent). The ECS agent is deployed as a Docker container and uses the managed instance's IAM role to connect to the Amazon ECS control plane in an AWS Region. This keeps application processing closer to the data to maintain higher bandwidth and lower latencies.

ECS Anywhere and EKS Anywhere, both designed to let users run their services on premises as well as in the cloud, were announced at AWS re:Invent 2020.
ECS Anywhere is now generally available and EKS Anywhere will debut sometime this year. The Anywhere feature allows ECS and EKS customers to extend into those non-AWS environments using the same cloud-native ways of working, tooling, and managed services that they are comfortable with. As AWS and other cloud service providers extend their reach into on-premises IT environments, the line between public and private cloud computing environments will continue to blur.
Amazon ECS Anywhere provides customers the ability to run Amazon ECS on any infrastructure using the same cloud-based, fully managed, highly scalable container orchestration service and control plane they use in AWS today.
AWS recently announced full support for Amazon ECS on AWS Outposts (a service that extends AWS infrastructure, services, APIs, and tools to customers' premises using AWS-owned and fully managed hardware), AWS Wavelength, and AWS Local Zones (which accommodate specific latency and network connectivity requirements).
Apart from these, there are several reasons why users might want to use ECS as a control plane for container-based applications without running the actual applications on AWS infrastructure.
These include: latency requirements (running applications on the same network as other entities to achieve the required low latency), regulatory requirements (legal, security, or privacy needs that can only be satisfied by running applications in customers' own data centers), operational requirements (avoiding the burden of managing complex container clusters), and the need to fully utilize existing capital investments in their data centers.
ECS Anywhere delivers the same operational model on-premises and in the cloud. This keeps application processing closer to the data to maintain higher bandwidth and lower latencies, helps meet compliance requirements for workloads not yet approved to run on cloud-managed services, and allows data center capital investments to be fully amortized before moving to the cloud. With Amazon ECS Anywhere, customers no longer need to run, update, or maintain their own container orchestrators on-premises.
The important pieces that make up ECS Anywhere are the Amazon SSM Agent, the Amazon ECS init code, and the Amazon ECS agent. The ECS agent itself runs in a container that is built by AWS and pulled from an S3 bucket in AWS. ECS Anywhere works with any compute unit as long as it runs the ECS agent and an AWS Systems Manager Agent (SSM Agent).

amazon-ssm-agent

The first component is amazon-ssm-agent, which connects to AWS Systems Manager and is deployed as a systemd service on the host operating system. AWS Systems Manager Agent (SSM Agent) is Amazon software that can be installed and configured on an EC2 instance, an on-premises server, or a virtual machine (VM).
SSM Agent makes it possible for Systems Manager to update, manage, and configure these resources. The agent processes requests from the Systems Manager service in the AWS Cloud and then runs them as specified in the request. SSM Agent then sends status and execution information back to the Systems Manager service using the Amazon Message Delivery Service (service prefix: ec2messages).
Systems Manager is already a standalone service; in ECS Anywhere it plays a major role, providing secure sessions (terminal access) and fleet management capabilities for external/managed instances.

amazon-ecs-agent

The second component is amazon-ecs-agent, which is responsible for launching and managing the tasks that users submit as ECS task definitions to the Amazon ECS service. It is deployed as a Docker container, and the agent uses the managed instance's IAM role to connect to the Amazon ECS control plane in an AWS Region.
The SSM agent is supported on multiple operating systems, and the ECS agent can run on any platform that can run the Docker engine, including small-footprint SBCs such as the Raspberry Pi. In disconnected scenarios, ECS Anywhere tasks continue running on customer-managed infrastructure. Cloud connectivity is required to update or scale the tasks, or to connect to other in-Region AWS services at runtime.
Setting up an ECS Anywhere cluster is fairly simple. One of the advantages of ECS is its simple managed control plane: the service helps users run containerized apps without having to dive into the complex setup and management required by other solutions.
1. An admin user requests an activation key from the ECS console or using the AWS CLI/SDK (aws ssm create-activation --iam-role $ROLE_NAME | tee ssm-activation.json); this secret activation code allows the agent to register itself with AWS Systems Manager.
2. The admin user registers the on-premises node/VM using the activation key (amazon-ssm-agent -register -code "$CODE" -id "$ID" -region "$AWS_DEFAULT_REGION").
3. The SSM agent uses the activation code to register the hardware device as a managed instance and download a secret key for that managed instance. From that point on, the managed instance can be assigned an AWS Identity and Access Management (IAM) role and will automatically receive IAM credentials for that role. This role is essential because it allows the instance to make all the other required communications to other AWS services like Amazon ECS.
4. Authorized users can now manage the registered external instance using AWS Systems Manager. An external instance can be a VM, server, or single-board computer such as a Raspberry Pi in customer-managed private space.
5. The ecs-agent uses the managed instance's IAM role to connect to the Amazon ECS control plane in an AWS Region.
6. Authorized users can now submit task definitions just as with traditional ECS, and the tasks will be deployed on the external instances in the on-premises data center.
7. The ecs-agent also submits task telemetry to the control plane about the lifecycle and health of the containers. Users can monitor the remote workloads from the CloudWatch console.
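For reference, a minimal CLI sketch of steps 1 and 2 above. The role name and trust policy file are illustrative placeholders; the trust policy must allow ssm.amazonaws.com to assume the role, and the role needs the AmazonSSMManagedInstanceCore and AmazonEC2ContainerServiceforEC2Role managed policies:

# 1. Create the IAM role the external instance will assume and request an activation
aws iam create-role --role-name ecsAnywhereRole --assume-role-policy-document file://ssm-trust-policy.json
aws iam attach-role-policy --role-name ecsAnywhereRole --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
aws iam attach-role-policy --role-name ecsAnywhereRole --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role
aws ssm create-activation --iam-role ecsAnywhereRole | tee ssm-activation.json
# 2. On the on-premises host, register the SSM agent with the returned activation code and ID
sudo amazon-ssm-agent -register -code "$ACTIVATION_CODE" -id "$ACTIVATION_ID" -region "$AWS_DEFAULT_REGION"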

Amazon ECS Anywhere Pricing

Users pay $0.01025 per hour for each managed ECS Anywhere on-premises instance. An on-premises instance is a customer-managed instance that has been registered with an Amazon ECS cluster and runs the Amazon ECS container agent.
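For example, a single external instance registered around the clock for a 30-day month (720 hours) works out to roughly 720 × $0.01025 ≈ $7.38, in addition to any other AWS services (data transfer, CloudWatch, and so on) consumed.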

Creating an ECS Cluster (Control Plane)

A networking-only template can be used to create an empty cluster. This cluster template is typically used for workloads hosted on AWS Fargate or on external instances (ECS Anywhere): with Fargate the networking is isolated and managed by AWS, while for external/managed instances the networking is taken care of by the user.
Optionally, users can opt to create a new VPC that can be used for other purposes (if selected, a CloudFormation template is triggered to create a VPC, subnets, an IGW, and other related components).
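The same empty cluster can also be created from the CLI; a minimal sketch (the cluster name matches the one used later in this walkthrough):

aws ecs create-cluster --cluster-name ecs-anywhere-cluster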
There is a new option added in the ECS Instances panel which enables users to register external instances.

Registering an External Instance

In the test topology below, a Raspberry Pi running Ubuntu 20.04 and connected to a private home network is registered to the cluster (ecs-anywhere-cluster) created above.
All external instances require an IAM role that permits them to communicate with AWS APIs. When registering an on-premises server or virtual machine (VM) to a cluster, the server or VM needs this IAM role to communicate with AWS APIs. Users only need to create the IAM role once per AWS account, but it must be associated with each server or VM registered to their clusters. This role is referred to as the ECSAnywhereRole. The role can be created manually, or Amazon ECS can create it on the user's behalf when registering an external instance using the AWS Management Console.
The activation key has a limited validity period and a limit on the number of instances that can be activated/registered with it. All the actions below can also be performed using the SDK/CLI.

Register external instances command:
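The console generates a one-line registration command; the following is a sketch of its general shape (the script URL and flags may differ slightly by Region and version):

curl --proto "https" -o "/tmp/ecs-anywhere-install.sh" "https://amazon-ecs-agent.s3.amazonaws.com/ecs-anywhere-install-latest.sh"
sudo bash /tmp/ecs-anywhere-install.sh --region "$AWS_DEFAULT_REGION" --cluster ecs-anywhere-cluster --activation-id "$ACTIVATION_ID" --activation-code "$ACTIVATION_CODE"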

The command above installs three components: the ECS agent (amazon-ecs-agent) as a Docker container, the SSM Agent (amazon-ssm-agent) as a systemd service on the host, and the Docker engine (moby) if it is not already installed on the host. The script automatically installs and configures the SSM Agent, Docker, and the Amazon ECS agent without any further input.
Amazon ssm-agent service:
ECS Agent docker container:
Once registered, the instance appears in the ECS Instances section of the ECS cluster portal.
The external instances are given IDs prefixed with 'mi-', which denotes managed instances (regular EC2 container instances have IDs prefixed with 'i-'). Users can get the basic resource information of the registered instances, agent and Docker versions, running tasks, etc. from the instance information section.
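Registration can also be verified from the CLI, for example:

# External instances appear as container instances in the ECS cluster...
aws ecs list-container-instances --cluster ecs-anywhere-cluster
# ...and as managed instances in Systems Manager
aws ssm describe-instance-information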

AWS Systems Manager

Amazon ECS Anywhere requires the AWS Systems Manager Agent (SSM Agent) to authenticate and register your on-premises instances. AWS Systems Manager provides all capabilities to manage the external instances where the ECS Anywhere tasks are run.
To use AWS Systems Manager Session Manager to connect to the instances, the instances should be on the advanced-instances tier; this configuration can be done from the Fleet Manager section of AWS Systems Manager. Auto-update of the SSM Agent can also be performed from this section.
The Activations section lists all the advanced nodes:
AWS Systems Manager also provides a compliance management portal where users can trace any external instance compliance related issues.
Run Command enables users to automate common administrative tasks and perform one-time configuration changes at scale on external/managed instances. A useful action in this scenario is updating SSM and ECS agents on the external instances.
State Manager, a capability of AWS Systems Manager, is a secure and scalable service that automates the process of keeping registered instances in a state that a user defines. State Manager associations run on the target systems initially and then at regular intervals to keep the systems in the defined state.
Fleet Manager is a unified portal that enables users to remotely manage server fleets running on AWS or on premises. With Fleet Manager, users can view the health and performance status of their entire server fleet from one console and can also gather data from individual instances to perform common troubleshooting and management tasks.
Session Manager provides secure and auditable instance management without the need to open inbound ports, maintain bastion hosts, or manage SSH keys.
Users can configure KMS encryption for the sessions, CloudWatch for logging sessions, and CloudTrail to trace the API calls.
When a managed instance is selected, the user is redirected to the Fleet Manager portal. For regular container instances (EC2 launch type), the link redirects to the EC2 console instead.
Users can select 'Start session' to connect to a managed instance, which provides an interactive one-click browser-based shell. This even works for devices that are behind Network Address Translation, with only a private IP address, because the SSM agent on the host opens a control channel back to SSM. The portal can be used both to monitor the managed instance and to open an AWS SSM Session Manager session to it.
All the session information is available in the session history of Session Manager. This helps comply with organizational security policies and audit needs by maintaining a record of the connections made to the instances and the commands that were run on them.
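Besides the browser-based shell, a session can be opened from a workstation with the AWS CLI and the Session Manager plugin installed (the managed instance ID below is a placeholder):

aws ssm start-session --target mi-0123456789abcdef0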

Try it Out - Deploying a Multi-Tiered Object Detection Application

In the sample topology below, two test apps, camera-app and object-detection-app, are deployed on the registered external instance (Raspberry Pi). Both applications are deployed using AWS ECS task definitions (a task definition is a text file in JSON format that describes one or more containers, up to a maximum of ten, that form an application; the task definition can be thought of as a blueprint for the application, analogous to a docker-compose file).
A Raspberry Pi camera module is attached to the Pi, which the example camera application uses to stream live footage.
The launch type now includes a new 'External' option, which should be used with ECS Anywhere to deploy containers/task definitions to external/managed instances. The task and container definitions are configured to use external instances; there is a limitation here: with the EXTERNAL launch type, load balancers, tag propagation, and service discovery integration are not supported.
Task and container definitions are set to 'EXTERNAL' (compatibilities: the task launch types the task definition is validated against during task definition registration). When creating a task definition, users can select a 'Network Mode' from the native Docker networking modes (bridge, host, none); awsvpc mode does not come into the picture because the container instances here are external (not connected to a VPC like EC2).
As shown below, a task definition is created with 'EXTERNAL' compatibility which deploys the two containers of the camera-app: the first container is a camera-streamer that streams the live footage over a web app and allows users to take snapshots, and the other container, droppy, is a minimal file-sharing server.
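As a rough sketch, such a task definition registered via the CLI could look like the following; the image names, ports, and memory sizes are illustrative rather than the exact values used here:

aws ecs register-task-definition --cli-input-json '{
  "family": "camera-app",
  "requiresCompatibilities": ["EXTERNAL"],
  "networkMode": "bridge",
  "containerDefinitions": [
    {
      "name": "camera-streamer",
      "image": "<camera-streamer-image>",
      "memory": 256,
      "essential": true,
      "portMappings": [{"containerPort": 8080, "hostPort": 8080}]
    },
    {
      "name": "droppy",
      "image": "<droppy-image>",
      "memory": 256,
      "essential": true,
      "portMappings": [{"containerPort": 8989, "hostPort": 8989}]
    }
  ]
}'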
A task is then launched from the task definition above using Run Task or Start Task (Start Task is used if the user wants to use their own scheduler or place tasks manually on specific hosts).
Configure Service enables users to configure the number of service replicas and Auto Scaling; load balancing is not supported with the 'EXTERNAL' launch type.
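The CLI equivalents of both paths would look roughly like this (the service name and counts are placeholders):

# One-off run of the task on an external instance
aws ecs run-task --cluster ecs-anywhere-cluster --launch-type EXTERNAL --task-definition camera-app
# Or a long-running service that maintains the desired number of replicas
aws ecs create-service --cluster ecs-anywhere-cluster --service-name camera-app-svc --task-definition camera-app --desired-count 1 --launch-type EXTERNAL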
The ECS agent running on the Raspberry Pi processing the task submitted above:
The Tasks section of the ECS cluster showing the running tasks:
The task detail section showing the running containers:
AWS Systems Manager session to the Raspberry Pi (in a private network behind NAT):
Accessing container logs through a terminal opened using Session Manager:
As shown below, the live-stream web page can be accessed using the private IP address of the Raspberry Pi and the corresponding port.
Another task definition is created to deploy a sample object detection application (Max-object-detector), which consumes the live stream and snapshots from the camera-app above.
The sample object detection application deployed on the external instance (Raspberry Pi):
Users in the private network can now access the object-detection application using the Raspberry Pi's private IP, or a DNS name if local IPAM is managed. The application container runs on the external/managed instance and is deployed from ECS using the task definition.
The sample application consumes the real-time video or snapshots provided by the camera application, which stores the files in the droppy file server. Sample detection runs are shown below:

Exposing Private Services on External Instances - AWS Site-to-Site VPN

The next significant requirement is to enable users to access, over the internet, the services hosted on managed/external instances located in private network space. In a simple test topology like this, we could simply configure port forwarding on the home router so that inbound traffic sent to the home IP address is forwarded to one of the devices connected to the network.
This is not practical in many scenarios and does not scale in a production setting. Most enterprise use cases need connectivity back to resources running inside an Amazon VPC (for example, a web application running on an external instance accessing a DB service on AWS). AWS Direct Connect can be used to create production-grade connectivity, but it is not feasible for smaller environments. As a reduced-cost alternative, AWS Site-to-Site VPN can be used to enable connectivity between AWS and the on-premises external instance over IPsec.
A VPN gateway is created on AWS and attached to the respective VPC. A route is added to the VPC route tables that directs all traffic addressed to the on-premises network CIDR range out via the VPN gateway.
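For reference, the AWS side of this setup maps to a handful of CLI calls along these lines (the resource IDs, the home router's public IP, and the home network CIDR below are placeholders):

# Virtual private gateway attached to the VPC
aws ec2 create-vpn-gateway --type ipsec.1
aws ec2 attach-vpn-gateway --vpn-gateway-id vgw-0123456789abcdef0 --vpc-id vpc-0123456789abcdef0
# Customer gateway pointing at the home router's public IP
aws ec2 create-customer-gateway --type ipsec.1 --public-ip 203.0.113.10 --bgp-asn 65000
# Static Site-to-Site VPN connection between the two, plus a static route for the home network
aws ec2 create-vpn-connection --type ipsec.1 --customer-gateway-id cgw-0123456789abcdef0 --vpn-gateway-id vgw-0123456789abcdef0 --options StaticRoutesOnly=true
aws ec2 create-vpn-connection-route --vpn-connection-id vpn-0123456789abcdef0 --destination-cidr-block 10.0.0.0/24
# VPC route table entry sending on-premises-bound traffic via the VPN gateway
aws ec2 create-route --route-table-id rtb-0123456789abcdef0 --destination-cidr-block 10.0.0.0/24 --gateway-id vgw-0123456789abcdef0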
strongSwan deployed on the external instance (Raspberry Pi, Ubuntu 20.04) is used as the corresponding self-managed VPN gateway. As the Raspberry Pi is connected to a private network behind NAT (the home router), the public IP of the home router is used as the IP for the customer gateway. A route is added on the Pi to direct any traffic addressed to the Amazon VPC CIDR range to the on-premises end of the VPN gateway (tunnel interface).
A Site-to-Site VPN Connection is configured:
A Network Load Balancer (NLB) is created in the AWS Region and associated with the VPC configured for the AWS Site-to-Site VPN. The associated target group is configured with the private IP address of the Pi (10.0.0.188) and application port 5000 exposed by the docker-proxy on the host. With this configuration, the load balancer can serve traffic using its own IP address and send traffic to the private IP addresses of devices inside the home network via the VPN gateway.
Target groups associated with the load balancer above:
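A sketch of that target-group configuration with the CLI, assuming placeholder ARNs/IDs and the Pi address used above (AvailabilityZone=all is what allows registering an IP target outside the VPC, reachable over the VPN):

aws elbv2 create-target-group --name pi-object-detection --protocol TCP --port 5000 --target-type ip --vpc-id vpc-0123456789abcdef0
aws elbv2 register-targets --target-group-arn <target-group-arn> --targets Id=10.0.0.188,Port=5000,AvailabilityZone=all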
With this configuration in place, users can access the sample object detection application over the internet using the FQDN of the load balancer, without having to configure DNS pointing at the address of the home network. The home network remains protected behind the VPN connection.
In a real scenario, all the applications are deployed and managed using the ECS console. The camera-app provides snapshots and real-time video stored in the droppy file server, and the object detection app consumes them and renders the detection results. Users can access an application deployed in a private network on a customer-managed instance/host just like any other service hosted on AWS.

Monitoring and Logging

Amazon CloudWatch can be used with ECS Anywhere to get metrics for the cluster instances and the services running on them. The CloudWatch Logs driver (awslogs) provides the containers' logs, and cluster events can be monitored from the CloudWatch event stream. CloudWatch Container Insights collects, aggregates, and summarizes metrics and logs from the containerized applications and microservices.
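For example, Container Insights is enabled per cluster, and the awslogs driver is configured per container in the task definition; a sketch (the log group name and Region are illustrative):

# Turn on Container Insights for the cluster
aws ecs update-cluster-settings --cluster ecs-anywhere-cluster --settings name=containerInsights,value=enabled
# In a container definition, route container logs to CloudWatch Logs with the awslogs driver:
#   "logConfiguration": {
#     "logDriver": "awslogs",
#     "options": {"awslogs-group": "/ecs/camera-app", "awslogs-region": "us-east-1", "awslogs-stream-prefix": "ecs"}
#   }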
CloudWatch container insights - Performance monitoring - Cluster level monitoring:
CloudWatch container insights - Performance monitoring - Task level monitoring:
Resource level (container level) monitoring:
Container Insights - Container map:
By extending AWS container orchestration to customer-managed infrastructure, customers gain operational control without compromising on the value of the AWS fully managed, easy-to-use control plane that runs in the cloud and is always up to date.
There are other use cases, such as cloud bursting, processing workloads at the edge, and containerizing existing on-premises workloads, where ECS Anywhere plays a prominent role. Based on the roadmap, ECS Anywhere combined with other services in the AWS ecosystem will make a significant impact in the hybrid cloud space.
