In this article, I will walk you through all the steps required to perform canary deployments on Amazon ECS / Fargate with AWS App Mesh.
Canary deployments are a pattern for rolling out releases to a subset of users or servers. In this way, new features and other updates can be tested before they go live for the entire user base.
In this example, we are going to work with a Flask RESTful API. Once the api application is signed off for a new release, only a few users are routed to the new version. If no errors are reported, we roll out the new version to the rest of the users.
/api
- api handler
/api-gateway
- api gateway
First, we will set up a VPC with public subnets. If you would like to set up ECS tasks in a VPC with private subnets and a NAT gateway, please read this tutorial from the AWS team.
Now, let's start with a CloudFormation template, ecs-vpc.yaml:
Description: >
  A stack for deploying containerized applications in AWS Fargate.
  This stack runs containers in public VPC subnet, and includes a
  public facing load balancer to register the services in.
Parameters:
  EnvironmentName:
    Description: An environment name that will be prefixed to resource names
    Type: String
    Default: flask
  ECSServiceLogGroupRetentionInDays:
    Type: Number
    Default: 30
  ECSServicesDomain:
    Type: String
    Description: "Domain name registered under Route 53 that will be used for Service Discovery"
    Default: flask.sample
Mappings:
  SubnetConfig:
    VPC:
      CIDR: "10.0.0.0/16"
    PublicOne:
      CIDR: "10.0.0.0/24"
    PublicTwo:
      CIDR: "10.0.1.0/24"
Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      EnableDnsSupport: true
      EnableDnsHostnames: true
      CidrBlock: !FindInMap ["SubnetConfig", "VPC", "CIDR"]
  PublicSubnetOne:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
        Fn::Select:
          - 0
          - Fn::GetAZs: { Ref: "AWS::Region" }
      VpcId: !Ref "VPC"
      CidrBlock: !FindInMap ["SubnetConfig", "PublicOne", "CIDR"]
      MapPublicIpOnLaunch: true
  PublicSubnetTwo:
    Type: AWS::EC2::Subnet
    Properties:
      AvailabilityZone:
        Fn::Select:
          - 1
          - Fn::GetAZs: { Ref: "AWS::Region" }
      VpcId: !Ref "VPC"
      CidrBlock: !FindInMap ["SubnetConfig", "PublicTwo", "CIDR"]
      MapPublicIpOnLaunch: true
  InternetGateway:
    Type: AWS::EC2::InternetGateway
  GatewayAttachement:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref "VPC"
      InternetGatewayId: !Ref "InternetGateway"
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref "VPC"
  PublicRoute:
    Type: AWS::EC2::Route
    DependsOn: GatewayAttachement
    Properties:
      RouteTableId: !Ref "PublicRouteTable"
      DestinationCidrBlock: "0.0.0.0/0"
      GatewayId: !Ref "InternetGateway"
  PublicSubnetOneRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetOne
      RouteTableId: !Ref PublicRouteTable
  PublicSubnetTwoRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnetTwo
      RouteTableId: !Ref PublicRouteTable
  ECSCluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterName: !Ref EnvironmentName
  FargateContainerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Access to the Fargate containers
      VpcId: !Ref "VPC"
  EcsSecurityGroupIngressFromPublicALB:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      Description: Ingress from the public ALB
      GroupId: !Ref "FargateContainerSecurityGroup"
      IpProtocol: -1
      SourceSecurityGroupId: !Ref "PublicLoadBalancerSG"
  EcsSecurityGroupIngressFromSelf:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      Description: Ingress from other containers in the same security group
      GroupId: !Ref "FargateContainerSecurityGroup"
      IpProtocol: -1
      SourceSecurityGroupId: !Ref "FargateContainerSecurityGroup"
  PublicLoadBalancerSG:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Access to the public facing load balancer
      VpcId: !Ref "VPC"
      SecurityGroupIngress:
        - CidrIp: 0.0.0.0/0
          IpProtocol: -1
  PublicLoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Scheme: internet-facing
      LoadBalancerAttributes:
        - Key: idle_timeout.timeout_seconds
          Value: "30"
      Subnets:
        - !Ref PublicSubnetOne
        - !Ref PublicSubnetTwo
      SecurityGroups: [!Ref "PublicLoadBalancerSG"]
  TargetGroupPublic:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      TargetType: ip
      HealthCheckIntervalSeconds: 6
      HealthCheckPath: /ping
      HealthCheckProtocol: HTTP
      HealthCheckTimeoutSeconds: 5
      HealthyThresholdCount: 2
      Name: "api"
      Port: 3000
      Protocol: HTTP
      UnhealthyThresholdCount: 2
      VpcId: !Ref "VPC"
  PublicLoadBalancerListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    DependsOn:
      - PublicLoadBalancer
    Properties:
      DefaultActions:
        - TargetGroupArn: !Ref "TargetGroupPublic"
          Type: "forward"
      LoadBalancerArn: !Ref "PublicLoadBalancer"
      Port: 80
      Protocol: HTTP
  TaskIamRole:
    Type: AWS::IAM::Role
    Properties:
      Path: /
      AssumeRolePolicyDocument: |
        {
          "Statement": [{
            "Effect": "Allow",
            "Principal": { "Service": [ "ecs-tasks.amazonaws.com" ]},
            "Action": [ "sts:AssumeRole" ]
          }]
        }
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/CloudWatchFullAccess
        - arn:aws:iam::aws:policy/AWSXRayDaemonWriteAccess
  TaskExecutionIamRole:
    Type: AWS::IAM::Role
    Properties:
      Path: /
      AssumeRolePolicyDocument: |
        {
          "Statement": [{
            "Effect": "Allow",
            "Principal": { "Service": [ "ecs-tasks.amazonaws.com" ]},
            "Action": [ "sts:AssumeRole" ]
          }]
        }
      Policies:
        - PolicyName: AmazonECSTaskExecutionRolePolicy
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  - "ecr:GetAuthorizationToken"
                  - "ecr:BatchCheckLayerAvailability"
                  - "ecr:GetDownloadUrlForLayer"
                  - "ecr:GetRepositoryPolicy"
                  - "ecr:DescribeRepositories"
                  - "ecr:ListImages"
                  - "ecr:DescribeImages"
                  - "ecr:BatchGetImage"
                  - "logs:CreateLogStream"
                  - "logs:PutLogEvents"
                Resource: "*"
  ECSServiceLogGroup:
    Type: "AWS::Logs::LogGroup"
    Properties:
      RetentionInDays:
        Ref: ECSServiceLogGroupRetentionInDays
  ECSServiceDiscoveryNamespace:
    Type: AWS::ServiceDiscovery::PrivateDnsNamespace
    Properties:
      Vpc: !Ref "VPC"
      Name: { Ref: ECSServicesDomain }
Outputs:
  Cluster:
    Description: A reference to the ECS cluster
    Value: !Ref ECSCluster
    Export:
      Name: !Sub "${EnvironmentName}:ECSCluster"
  ECSServiceDiscoveryNamespace:
    Description: A SDS namespace that will be used by all services in this cluster
    Value: !Ref ECSServiceDiscoveryNamespace
    Export:
      Name: !Sub "${EnvironmentName}:ECSServiceDiscoveryNamespace"
  ECSServiceLogGroup:
    Description: Log group for services to publish logs
    Value: !Ref ECSServiceLogGroup
    Export:
      Name: !Sub "${EnvironmentName}:ECSServiceLogGroup"
  PublicLoadBalancerSG:
    Description: Security group for the public load balancer
    Value: !Ref PublicLoadBalancerSG
    Export:
      Name: !Sub "${EnvironmentName}:PublicLoadBalancerSG"
  FargateContainerSecurityGroup:
    Description: Security group to be used by all services in the cluster
    Value: !Ref FargateContainerSecurityGroup
    Export:
      Name: !Sub "${EnvironmentName}:FargateContainerSecurityGroup"
  TaskExecutionIamRoleArn:
    Description: Task execution IAM role used by ECS tasks
    Value: { "Fn::GetAtt": TaskExecutionIamRole.Arn }
    Export:
      Name: !Sub "${EnvironmentName}:TaskExecutionIamRoleArn"
  TaskIamRoleArn:
    Description: IAM role to be used by ECS task
    Value: { "Fn::GetAtt": TaskIamRole.Arn }
    Export:
      Name: !Sub "${EnvironmentName}:TaskIamRoleArn"
  PublicListener:
    Description: The ARN of the public load balancer's Listener
    Value: !Ref PublicLoadBalancerListener
    Export:
      Name: !Sub "${EnvironmentName}:PublicListener"
  VPCId:
    Description: The ID of the VPC that this stack is deployed in
    Value: !Ref "VPC"
    Export:
      Name: !Sub "${EnvironmentName}:VPCId"
  PublicSubnetOne:
    Description: Public subnet one
    Value: !Ref "PublicSubnetOne"
    Export:
      Name: !Sub "${EnvironmentName}:PublicSubnetOne"
  PublicSubnetTwo:
    Description: Public subnet two
    Value: !Ref "PublicSubnetTwo"
    Export:
      Name: !Sub "${EnvironmentName}:PublicSubnetTwo"
  TargetGroupPublic:
    Description: ALB public target group
    Value: !Ref "TargetGroupPublic"
    Export:
      Name: !Sub "${EnvironmentName}:TargetGroupPublic"
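As a quick sanity check on the SubnetConfig mapping in this template, the sketch below uses Python's standard ipaddress module to confirm that both public subnets sit inside the VPC range and do not overlap each other. The helper name check_subnets is only illustrative; the CIDR values are the ones from the template.

```python
import ipaddress


def check_subnets(vpc_cidr, subnet_cidrs):
    """Return a list of problems found in the subnet layout (empty if none)."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    problems = []
    # Every subnet must be fully contained in the VPC range.
    for net in subnets:
        if not net.subnet_of(vpc):
            problems.append(f"{net} is outside {vpc}")
    # No two subnets may share addresses.
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                problems.append(f"{first} overlaps {second}")
    return problems


if __name__ == "__main__":
    # Values taken from the Mappings section of ecs-vpc.yaml
    print(check_subnets("10.0.0.0/16", ["10.0.0.0/24", "10.0.1.0/24"]))
```

An empty list means the layout is consistent. If you change the CIDR blocks in the mapping, running the same check before creating the stack may save a failed deployment.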
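Then run the aws cloudformation create-stack command to create the stack. Because the template creates IAM roles, the command has to acknowledge this with --capabilities CAPABILITY_IAM:
$ aws cloudformation create-stack --stack-name flask-sample --template-body file://ecs-vpc.yaml --capabilities CAPABILITY_IAM --profile YOUR_PROFILE --region YOUR_REGION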
AWS App Mesh is a service mesh that provides application-level networking to make it easy for your services to communicate with each other across multiple types of compute infrastructure. App Mesh standardizes how your services communicate, giving you end-to-end visibility and ensuring high-availability for your applications.
App Mesh perspective of the Flask api sample
The following CloudFormation template, app-mesh.yaml, creates a mesh, a virtual service, a virtual router with its route, and the virtual nodes for our api application:
Parameters:
  EnvironmentName:
    Type: String
    Description: Environment name that joins all the stacks
    Default: flask
  ServicesDomain:
    Type: String
    Description: DNS namespace used by services e.g. default.svc.cluster.local
    Default: flask.sample
  AppMeshMeshName:
    Type: String
    Description: Name of mesh
    Default: flask-mesh
Resources:
  Mesh:
    Type: AWS::AppMesh::Mesh
    Properties:
      MeshName: !Ref AppMeshMeshName
  ApiV1VirtualNode:
    Type: AWS::AppMesh::VirtualNode
    Properties:
      MeshName: !GetAtt Mesh.MeshName
      VirtualNodeName: api-vn
      Spec:
        Listeners:
          - PortMapping:
              Port: 3000
              Protocol: http
            HealthCheck:
              Protocol: http
              Path: "/ping"
              HealthyThreshold: 2
              UnhealthyThreshold: 2
              TimeoutMillis: 2000
              IntervalMillis: 5000
        ServiceDiscovery:
          DNS:
            Hostname: !Sub "api.${ServicesDomain}"
  ApiV2VirtualNode:
    Type: AWS::AppMesh::VirtualNode
    Properties:
      MeshName: !GetAtt Mesh.MeshName
      VirtualNodeName: api-v2-vn
      Spec:
        Listeners:
          - PortMapping:
              Port: 3000
              Protocol: http
            HealthCheck:
              Protocol: http
              Path: "/ping"
              HealthyThreshold: 2
              UnhealthyThreshold: 2
              TimeoutMillis: 2000
              IntervalMillis: 5000
        ServiceDiscovery:
          DNS:
            Hostname: !Sub "api-v2.${ServicesDomain}"
  ApiVirtualRouter:
    Type: AWS::AppMesh::VirtualRouter
    Properties:
      MeshName: !GetAtt Mesh.MeshName
      VirtualRouterName: api-vr
      Spec:
        Listeners:
          - PortMapping:
              Port: 3000
              Protocol: http
  ApiRoute:
    Type: AWS::AppMesh::Route
    DependsOn:
      - ApiVirtualRouter
      - ApiV1VirtualNode
      - ApiV2VirtualNode
    Properties:
      MeshName: !Ref AppMeshMeshName
      VirtualRouterName: api-vr
      RouteName: api-route
      Spec:
        HttpRoute:
          Action:
            WeightedTargets:
              - VirtualNode: api-vn
                Weight: 1
              - VirtualNode: api-v2-vn
                Weight: 0
          Match:
            Prefix: "/"
  ApiVirtualService:
    Type: AWS::AppMesh::VirtualService
    DependsOn:
      - ApiVirtualRouter
    Properties:
      MeshName: !GetAtt Mesh.MeshName
      VirtualServiceName: !Sub "api.${ServicesDomain}"
      Spec:
        Provider:
          VirtualRouter:
            VirtualRouterName: api-vr
  ApiGatewayVirtualNode:
    Type: AWS::AppMesh::VirtualNode
    DependsOn:
      - ApiVirtualService
    Properties:
      MeshName: !GetAtt Mesh.MeshName
      VirtualNodeName: gateway-vn
      Spec:
        Listeners:
          - PortMapping:
              Port: 3000
              Protocol: http
        ServiceDiscovery:
          DNS:
            Hostname: !Sub "gateway.${ServicesDomain}"
        Backends:
          - VirtualService:
              VirtualServiceName: !Sub "api.${ServicesDomain}"
Run the aws cloudformation create-stack command to create the mesh stack:
$ aws cloudformation create-stack --stack-name flask-app-mesh --template-body file://app-mesh.yaml --profile YOUR_PROFILE --region YOUR_REGION
Before we can deploy our services, we need to push the Docker images to ECR repositories so that ECS can pull them when it runs the task definitions.
Go to the api/ directory and create a bash script, setup-ecr.sh, to push the api image to the flask-api ECR repository:
#!/bin/bash
set -ex
AWS_DEFAULT_REGION="YOUR_REGION"
AWS_PROFILE="YOUR_PROFILE"
docker build -t flask-api .
# Create the repository, or look it up if it already exists
API_IMAGE="$( aws ecr create-repository --repository-name flask-api \
--region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
--query '[repository.repositoryUri]' --output text || aws ecr describe-repositories --repository-name flask-api \
--region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
--query '[repositories[0].repositoryUri]' --output text)"
docker tag flask-api ${API_IMAGE}
# `aws ecr get-login` was removed in AWS CLI v2; get-login-password works in v2 and in recent v1 releases
aws ecr get-login-password --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
| docker login --username AWS --password-stdin "${API_IMAGE%%/*}"
docker push ${API_IMAGE}
$ ./api/setup-ecr.sh
Then move to the api-gateway/ directory and create a similar bash script to push the gateway image to the flask-gateway ECR repository:
#!/bin/bash
set -ex
AWS_DEFAULT_REGION="YOUR_REGION"
AWS_PROFILE="YOUR_PROFILE"
docker build -t flask-gateway .
# Create the repository, or look it up if it already exists
GATEWAY_IMAGE="$( aws ecr create-repository --repository-name flask-gateway \
--region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
--query '[repository.repositoryUri]' --output text || aws ecr describe-repositories --repository-name flask-gateway \
--region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
--query '[repositories[0].repositoryUri]' --output text)"
echo ${GATEWAY_IMAGE}
docker tag flask-gateway ${GATEWAY_IMAGE}
# `aws ecr get-login` was removed in AWS CLI v2; get-login-password works in v2 and in recent v1 releases
aws ecr get-login-password --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
| docker login --username AWS --password-stdin "${GATEWAY_IMAGE%%/*}"
docker push ${GATEWAY_IMAGE}
$ ./api-gateway/setup-ecr.sh
A task definition is a blueprint that describes how a Docker container should launch. We need to create ECS task definitions for our gateway and api handlers, and make the tasks compatible with App Mesh and X-Ray.
Below is the template for the Amazon ECS task definition of the Flask gateway, saved as task-definition-gateway.json. The $-prefixed names such as $APP_IMAGE are jq variables that the setup script fills in later:
{
"family": "gateway",
"proxyConfiguration": {
"type": "APPMESH",
"containerName": "envoy",
"properties": [
{
"name": "IgnoredUID",
"value": "1337"
},
{
"name": "ProxyIngressPort",
"value": "15000"
},
{
"name": "ProxyEgressPort",
"value": "15001"
},
{
"name": "AppPorts",
"value": "3000"
},
{
"name": "EgressIgnoredIPs",
"value": "169.254.170.2,169.254.169.254"
}
]
},
"containerDefinitions": [
{
"name": "app",
"image": $APP_IMAGE,
"portMappings": [
{
"containerPort": 3000,
"hostPort": 3000,
"protocol": "tcp"
}
],
"environment": [
{
"name": "API_ENDPOINT",
"value": "http://api.flask.sample"
},
{
"name": "SERVER_PORT",
"value": "3000"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": $SERVICE_LOG_GROUP,
"awslogs-region": "ap-southeast-2",
"awslogs-stream-prefix": "gateway"
}
},
"essential": true,
"dependsOn": [
{
"containerName": "envoy",
"condition": "HEALTHY"
}
]
},
{
"name": "envoy",
"image": "111345817488.dkr.ecr.us-west-2.amazonaws.com/aws-appmesh-envoy:v1.9.1.0-prod",
"user": "1337",
"essential": true,
"ulimits": [
{
"name": "nofile",
"hardLimit": 15000,
"softLimit": 15000
}
],
"portMappings": [
{
"containerPort": 9901,
"hostPort": 9901,
"protocol": "tcp"
},
{
"containerPort": 15000,
"hostPort": 15000,
"protocol": "tcp"
},
{
"containerPort": 15001,
"hostPort": 15001,
"protocol": "tcp"
}
],
"environment": [
{
"name": "APPMESH_VIRTUAL_NODE_NAME",
"value": "mesh/flask-mesh/virtualNode/gateway-vn"
},
{
"name": "ENVOY_LOG_LEVEL",
"value": $ENVOY_LOG_LEVEL
},
{
"name": "ENABLE_ENVOY_XRAY_TRACING",
"value": "1"
},
{
"name": "ENABLE_ENVOY_STATS_TAGS",
"value": "1"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": $SERVICE_LOG_GROUP,
"awslogs-region": "ap-southeast-2",
"awslogs-stream-prefix": "gateway-envoy"
}
},
"healthCheck": {
"command": [
"CMD-SHELL",
"curl -s http://localhost:9901/server_info | grep state | grep -q LIVE"
],
"interval": 5,
"timeout": 2,
"retries": 3
}
},
{
"name": "xray-daemon",
"image": "amazon/aws-xray-daemon",
"user": "1337",
"essential": true,
"cpu": 32,
"memoryReservation": 256,
"portMappings": [
{
"hostPort": 2000,
"containerPort": 2000,
"protocol": "udp"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": $SERVICE_LOG_GROUP,
"awslogs-region": "ap-southeast-2",
"awslogs-stream-prefix": "gateway-xray"
}
}
}
],
"taskRoleArn": $TASK_ROLE_ARN,
"executionRoleArn": $EXECUTION_ROLE_ARN,
"requiresCompatibilities": ["FARGATE", "EC2"],
"networkMode": "awsvpc",
"cpu": "256",
"memory": "512"
}
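One thing worth double-checking in a task definition like this one is the AppPorts property. Assuming, as I understand the App Mesh proxy configuration, that AppPorts should list the ports the application container listens on so that Envoy can intercept inbound traffic, the sketch below compares it with the app container's port mappings. The helper name and the trimmed-down sample dictionaries are illustrative only; in practice you would load the rendered task definition JSON.

```python
def app_ports_missing(task_def, app_container="app"):
    """Return the app container's ports that are not listed in AppPorts."""
    proxy = task_def.get("proxyConfiguration", {})
    props = {p["name"]: p["value"] for p in proxy.get("properties", [])}
    # AppPorts is a comma-separated string such as "3000" or "3000,8080"
    app_ports = {
        int(port) for port in props.get("AppPorts", "").split(",") if port.strip()
    }
    missing = []
    for container in task_def.get("containerDefinitions", []):
        if container["name"] != app_container:
            continue
        for mapping in container.get("portMappings", []):
            if mapping["containerPort"] not in app_ports:
                missing.append(mapping["containerPort"])
    return missing


def sample_task_def(app_ports):
    """Build a trimmed task definition with just the fields the check reads."""
    return {
        "proxyConfiguration": {
            "containerName": "envoy",
            "properties": [{"name": "AppPorts", "value": app_ports}],
        },
        "containerDefinitions": [
            {"name": "app", "portMappings": [{"containerPort": 3000}]},
            {"name": "envoy", "portMappings": [{"containerPort": 15000}]},
        ],
    }


if __name__ == "__main__":
    print(app_ports_missing(sample_task_def("9080")))
    print(app_ports_missing(sample_task_def("3000")))
```

A non-empty result means the app container exposes a port that the proxy configuration does not know about.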
Now we need a bash script, setup-task-def.sh, that registers the api gateway task definition. Create it with the content below, then run ./setup-task-def.sh:
#!/bin/bash
set -ex
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null && pwd )"
AWS_DEFAULT_REGION="YOUR_REGION"
AWS_PROFILE="YOUR_PROFILE"
cluster_stack_output=$(aws --profile "${AWS_PROFILE}" --region "${AWS_DEFAULT_REGION}" \
cloudformation describe-stacks --stack-name "flask-sample" \
| jq '.Stacks[].Outputs[]')
task_role_arn=($(echo $cluster_stack_output \
| jq -r 'select(.OutputKey == "TaskIamRoleArn") | .OutputValue'))
echo ${task_role_arn}
execution_role_arn=($(echo $cluster_stack_output \
| jq -r 'select(.OutputKey == "TaskExecutionIamRoleArn") | .OutputValue'))
ecs_service_log_group=($(echo $cluster_stack_output \
| jq -r 'select(.OutputKey == "ECSServiceLogGroup") | .OutputValue'))
envoy_log_level="debug"
GATEWAY_IMAGE="$( aws ecr describe-repositories \
--repository-name flask-gateway --region ${AWS_DEFAULT_REGION} \
--profile ${AWS_PROFILE} --query '[repositories[0].repositoryUri]' --output text)"
#Gateway Task Definition
task_def_json=$(jq -n \
--arg APP_IMAGE $GATEWAY_IMAGE \
--arg SERVICE_LOG_GROUP $ecs_service_log_group \
--arg TASK_ROLE_ARN $task_role_arn \
--arg EXECUTION_ROLE_ARN $execution_role_arn \
--arg ENVOY_LOG_LEVEL $envoy_log_level \
-f "${DIR}/task-definition-gateway.json")
task_def_arn=$(aws --profile "${AWS_PROFILE}" --region "${AWS_DEFAULT_REGION}" \
ecs register-task-definition \
--cli-input-json "${task_def_json}" \
--query [taskDefinition.taskDefinitionArn] --output text
)
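The jq filters in the script pick individual values out of the describe-stacks response by OutputKey. For readers more comfortable with Python, here is a rough equivalent that works on the same JSON text. It is a sketch only: the function name is made up for illustration, and the ARNs and log group name in the sample are placeholders, not real values.

```python
import json


def stack_outputs(describe_stacks_json):
    """Map OutputKey to OutputValue for the first stack in the response."""
    response = json.loads(describe_stacks_json)
    outputs = response["Stacks"][0].get("Outputs", [])
    return {item["OutputKey"]: item["OutputValue"] for item in outputs}


# Placeholder response in the shape returned by `aws cloudformation describe-stacks`
SAMPLE_RESPONSE = json.dumps({
    "Stacks": [{
        "StackName": "flask-sample",
        "Outputs": [
            {"OutputKey": "TaskIamRoleArn",
             "OutputValue": "arn:aws:iam::123456789012:role/example-task-role"},
            {"OutputKey": "TaskExecutionIamRoleArn",
             "OutputValue": "arn:aws:iam::123456789012:role/example-execution-role"},
            {"OutputKey": "ECSServiceLogGroup",
             "OutputValue": "example-log-group"},
        ],
    }]
})

if __name__ == "__main__":
    outputs = stack_outputs(SAMPLE_RESPONSE)
    print(outputs["TaskIamRoleArn"])
    print(outputs["ECSServiceLogGroup"])
```

Building the whole mapping once keeps the lookups that follow short, which is the same idea as storing the jq result in cluster_stack_output and filtering it several times.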
The task definitions for api v1 and v2 are very similar to the one above; you can find the bash scripts in the GitHub repo under api/.
The command to create the ECS services takes quite a few parameters, so it is easier to use a CloudFormation template as input. Let's create an ecs-services.yaml file with the following:
Parameters:
  EnvironmentName:
    Type: String
    Description: Environment name that joins all the stacks
    Default: flask
  AppMeshMeshName:
    Type: String
    Description: Name of mesh
    Default: flask-mesh
  ECSServicesDomain:
    Type: String
    Description: DNS namespace used by services e.g. default.svc.cluster.local
    Default: flask.sample
  GatewayTaskDefinition:
    Type: String
    Description: Task definition for Gateway Service
  ApiV1TaskDefinition:
    Type: String
    Description: Task definition for Api v1
  ApiV2TaskDefinition:
    Type: String
    Description: Task definition for Api v2
  VpcCIDR:
    Description: Please enter the IP range (CIDR notation) for this VPC
    Type: String
    Default: 10.0.0.0/16
Resources:
  ECSServiceSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: "Security group for the service"
      VpcId:
        "Fn::ImportValue": !Sub "${EnvironmentName}:VPCId"
      SecurityGroupIngress:
        - CidrIp: !Ref VpcCIDR
          IpProtocol: -1
  ApiV1ServiceDiscoveryRecord:
    Type: "AWS::ServiceDiscovery::Service"
    Properties:
      Name: "api"
      DnsConfig:
        NamespaceId:
          "Fn::ImportValue": !Sub "${EnvironmentName}:ECSServiceDiscoveryNamespace"
        DnsRecords:
          - Type: A
            TTL: 300
      HealthCheckCustomConfig:
        FailureThreshold: 1
  ApiV1Service:
    Type: "AWS::ECS::Service"
    Properties:
      ServiceName: "api"
      Cluster:
        "Fn::ImportValue": !Sub "${EnvironmentName}:ECSCluster"
      DeploymentConfiguration:
        MaximumPercent: 200
        MinimumHealthyPercent: 100
      DesiredCount: 1
      LaunchType: FARGATE
      ServiceRegistries:
        - RegistryArn:
            "Fn::GetAtt": ApiV1ServiceDiscoveryRecord.Arn
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          SecurityGroups:
            - !Ref ECSServiceSecurityGroup
          Subnets:
            - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetOne"
            - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetTwo"
      TaskDefinition: { Ref: ApiV1TaskDefinition }
  ApiV2ServiceDiscoveryRecord:
    Type: "AWS::ServiceDiscovery::Service"
    Properties:
      Name: "api-v2"
      DnsConfig:
        NamespaceId:
          "Fn::ImportValue": !Sub "${EnvironmentName}:ECSServiceDiscoveryNamespace"
        DnsRecords:
          - Type: A
            TTL: 300
      HealthCheckCustomConfig:
        FailureThreshold: 1
  GatewayServiceDiscoveryRecord:
    Type: "AWS::ServiceDiscovery::Service"
    Properties:
      Name: "gateway"
      DnsConfig:
        NamespaceId:
          "Fn::ImportValue": !Sub "${EnvironmentName}:ECSServiceDiscoveryNamespace"
        DnsRecords:
          - Type: A
            TTL: 300
      HealthCheckCustomConfig:
        FailureThreshold: 1
  ApiV2Service:
    Type: "AWS::ECS::Service"
    Properties:
      ServiceName: "api-v2"
      Cluster:
        "Fn::ImportValue": !Sub "${EnvironmentName}:ECSCluster"
      DeploymentConfiguration:
        MaximumPercent: 200
        MinimumHealthyPercent: 100
      DesiredCount: 1
      LaunchType: FARGATE
      ServiceRegistries:
        - RegistryArn:
            "Fn::GetAtt": ApiV2ServiceDiscoveryRecord.Arn
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          SecurityGroups:
            - !Ref ECSServiceSecurityGroup
          Subnets:
            - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetOne"
            - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetTwo"
      TaskDefinition: { Ref: ApiV2TaskDefinition }
  GatewayService:
    Type: "AWS::ECS::Service"
    Properties:
      ServiceName: "gateway"
      Cluster:
        "Fn::ImportValue": !Sub "${EnvironmentName}:ECSCluster"
      DeploymentConfiguration:
        MaximumPercent: 200
        MinimumHealthyPercent: 100
      DesiredCount: 1
      LaunchType: FARGATE
      ServiceRegistries:
        - RegistryArn:
            "Fn::GetAtt": GatewayServiceDiscoveryRecord.Arn
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          SecurityGroups:
            - !Ref ECSServiceSecurityGroup
          Subnets:
            - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetOne"
            - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetTwo"
      TaskDefinition: { Ref: GatewayTaskDefinition }
      LoadBalancers:
        - ContainerName: app
          ContainerPort: 3000
          TargetGroupArn:
            "Fn::ImportValue": !Sub "${EnvironmentName}:TargetGroupPublic"
Next, create a bash script, ecs-services-stack.sh:
#!/bin/bash
set -ex
AWS_DEFAULT_REGION="YOUR_REGION"
AWS_PROFILE="YOUR_PROFILE"
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null && pwd )"
task_api_arn=$(aws ecs list-task-definitions --family-prefix api \
--region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
--sort DESC \
--query '[taskDefinitionArns[0]]' --output text)
task_api_v2_arn=$(aws ecs list-task-definitions --family-prefix api-v2 \
--region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
--sort DESC \
--query '[taskDefinitionArns[0]]' --output text)
task_gateway_arn=$(aws ecs list-task-definitions --family-prefix gateway \
--region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
--sort DESC \
--query '[taskDefinitionArns[0]]' --output text)
aws cloudformation --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
deploy --stack-name "flask-ecs-service" \
--capabilities CAPABILITY_IAM \
--template-file "${DIR}/ecs-services.yaml" \
--parameter-overrides \
GatewayTaskDefinition="${task_gateway_arn}" \
ApiV1TaskDefinition="${task_api_arn}" \
ApiV2TaskDefinition="${task_api_v2_arn}"
and run the following command:
$ ./ecs-services-stack.sh
Now that we have set up everything we need, we can go to the AWS console to review what we have created.
Once the api application is deployed, we can curl the frontend service (gateway) to test it. To get the endpoint, open the Amazon EC2 console, choose Load Balancers under LOAD BALANCING in the navigation pane, select the load balancer we just created, and copy its DNS name. Then run curl against it:
$ curl flask-Publi-xxxxx-xxxxx.ap-southeast-2.elb.amazonaws.com/todos/todo1
{
"todo": {
"task": "build an API"
},
"version": "1"
}
Notice that all the services of the application are reflecting version 1. Now, it’s time for us to perform a canary deployment of api v2.
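To see how responses are split between versions once a canary is in place, it helps to send many requests and count the "version" field instead of eyeballing single curl calls. The sketch below does the counting with the standard library only. The function names are illustrative, and fetch_bodies assumes the endpoint returns JSON shaped like the response above; the demo at the bottom uses canned bodies so it runs without network access.

```python
import json
import urllib.request
from collections import Counter


def tally_versions(bodies):
    """Count the "version" field across a list of JSON response bodies."""
    counts = Counter()
    for body in bodies:
        counts[json.loads(body).get("version", "unknown")] += 1
    return dict(counts)


def fetch_bodies(url, attempts):
    """Request the same URL several times and return the raw response bodies."""
    bodies = []
    for _ in range(attempts):
        with urllib.request.urlopen(url, timeout=5) as response:
            bodies.append(response.read().decode("utf-8"))
    return bodies


if __name__ == "__main__":
    # Canned bodies standing in for real responses from the load balancer
    canned = [
        '{"todo": {"task": "build an API"}, "version": "1"}',
        '{"todo": {"task": "build an API"}, "version": "1"}',
        '{"todo": {"task": "build an API"}, "version": "2"}',
    ]
    print(tally_versions(canned))
```

Against a live deployment you would pass the load balancer URL to fetch_bodies and feed the result into tally_versions.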
We can manage the target weights (WeightedTargets) in the ApiRoute resource of app-mesh.yaml as below:
  ApiRoute:
    Type: AWS::AppMesh::Route
    DependsOn:
      - ApiVirtualRouter
      - ApiV1VirtualNode
      - ApiV2VirtualNode
    Properties:
      MeshName: !Ref AppMeshMeshName
      VirtualRouterName: api-vr
      RouteName: api-route
      Spec:
        HttpRoute:
          Action:
            WeightedTargets:
              - VirtualNode: api-vn
                Weight: 2
              - VirtualNode: api-v2-vn
                Weight: 1
          Match:
            Prefix: "/"
Then re-deploy the template. Alternatively, open the AWS App Mesh console, choose the mesh we created, select Virtual routers, open the route, choose Edit, and set the traffic weight for each target in the Targets section.
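Route weights are relative, so the share of traffic a virtual node receives is its weight divided by the sum of all weights on the route. A minimal sketch of that arithmetic, with an illustrative function name:

```python
def traffic_shares(weights):
    """Convert relative route weights into fractions of total traffic."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one target needs a positive weight")
    return {node: weight / total for node, weight in weights.items()}


if __name__ == "__main__":
    # Initial route: all traffic goes to v1
    print(traffic_shares({"api-vn": 1, "api-v2-vn": 0}))
    # Canary route: roughly one request in three goes to v2
    for node, share in traffic_shares({"api-vn": 2, "api-v2-vn": 1}).items():
        print(f"{node}: {share:.1%}")
```

With weights of 2 and 1, about a third of requests reach api-v2-vn. For a smaller canary you could use something like 9 and 1, which sends about 10% of traffic to the new version.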
AWS X-Ray helps us to monitor and analyze distributed microservice applications through request tracing, providing an end-to-end view of requests traveling through the application so we can identify the root cause of errors and performance issues. We’ll use X-Ray to provide a visual map of how App Mesh is distributing traffic and inspect traffic latency through our routes.
When setting up the task definitions, we already added an X-Ray daemon container to each task. However, X-Ray can only run in local mode with Fargate, so we need to create X-Ray segments manually to track traffic between the gateway and api v1 and v2.
We will manually create a trace id and a segment, then pass the trace id, and the segment id as the parent id, to the api handlers. The api-gateway/main.py should look like:
from flask import Flask
from flask_restful import Resource, Api, reqparse
from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core import patch_all
from aws_xray_sdk.core.models.traceid import TraceId
from aws_xray_sdk.core.models import http
import requests
import os
import json
import logging

patch_all()
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)
API_ENDPOINT = os.environ['API_ENDPOINT']
SERVER_PORT = os.environ['SERVER_PORT']
# xray_recorder.configure(
#     sampling=False,
#     context_missing='LOG_ERROR',
#     plugins=('EC2Plugin', 'ECSPlugin'),
#     service='Flask Gateway'
# )
app = Flask(__name__)
api = Api(app)
traceid = TraceId().to_id()
parser = reqparse.RequestParser()
# Declare the 'task' field so parse_args() returns it for POST and PUT
parser.add_argument('task')


class Ping(Resource):
    def get(self):
        return {'response': 'ok'}


class TodoList(Resource):
    def __init__(self):
        self.segment = xray_recorder.begin_segment('gateway_todoS', traceid=traceid, sampling=1)
        self.segment.put_http_meta(http.URL, 'gateway.flask.sample')
        logger.info("Request todos from gateway")

    def __del__(self):
        xray_recorder.end_segment()

    def get(self):
        logger.info("Request todo from gateway, parentid is %s" % (self.segment.id))
        r = requests.get(url='%s:%s/todos' % (API_ENDPOINT, SERVER_PORT), headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
        return r.json()

    def post(self):
        args = parser.parse_args()
        r = requests.post(url='%s:%s/todos' % (API_ENDPOINT, SERVER_PORT), json=args, headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
        return r.json(), 201


class Todo(Resource):
    def __init__(self):
        self.segment = xray_recorder.begin_segment('gateway_todo', traceid=traceid, sampling=1)
        self.segment.put_http_meta(http.URL, 'gateway.flask.sample')
        logger.info("Request todo from gateway")

    def __del__(self):
        xray_recorder.end_segment()

    def get(self, todo_id):
        r = requests.get(url='%s:%s/todos/%s' % (API_ENDPOINT, SERVER_PORT, todo_id), headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
        return r.json()

    def delete(self, todo_id):
        r = requests.delete(url='%s:%s/todos/%s' % (API_ENDPOINT, SERVER_PORT, todo_id), headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
        return r.json(), 204

    def put(self, todo_id):
        args = parser.parse_args()
        task = {'task': args['task']}
        r = requests.put(url='%s:%s/todos/%s' % (API_ENDPOINT, SERVER_PORT, todo_id), json=task, headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
        return r.json(), 201


api.add_resource(TodoList, '/todos')
api.add_resource(Todo, '/todos/<todo_id>')
api.add_resource(Ping, '/ping')

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=3000)
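The values carried in the x-traceid and x-parentid headers follow X-Ray's id formats: a trace id is "1-", eight hex digits of the epoch time in seconds, a dash, and 24 random hex digits; a segment id is 16 hex digits. The sketch below builds both with the standard library so the format is easy to inspect without the SDK. The function names are illustrative, and x-traceid and x-parentid are this project's own header names rather than anything X-Ray defines.

```python
import os
import re
import time

TRACE_ID_PATTERN = re.compile(r"^1-[0-9a-f]{8}-[0-9a-f]{24}$")


def new_trace_id(now=None):
    """Build an X-Ray style trace id: version, epoch seconds in hex, 96 random bits."""
    epoch = int(time.time() if now is None else now)
    return "1-{:08x}-{}".format(epoch, os.urandom(12).hex())


def new_segment_id():
    """Build a 64-bit segment id as 16 hex digits."""
    return os.urandom(8).hex()


def trace_headers(trace_id, parent_id):
    """Headers the gateway sends so a handler can attach its segment to the trace."""
    return {"x-traceid": trace_id, "x-parentid": parent_id}


if __name__ == "__main__":
    trace_id = new_trace_id()
    print(trace_id, bool(TRACE_ID_PATTERN.match(trace_id)))
    print(trace_headers(trace_id, new_segment_id()))
```

Note that main.py creates its trace id once at import time, so every request handled by that process shares it. Generating an id per request, as new_trace_id() would allow, is one possible way to get a separate trace for each request.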
After all services are deployed successfully, we can open the AWS X-Ray console and monitor the traffic we are sending to the application frontend (gateway) when we request the api application on the /todos route.
To clean up, it is quickest to use the CloudFormation console to delete the stacks created above, in reverse order of creation: flask-ecs-service, flask-app-mesh and flask-sample.
That's about it! I hope you have found this walkthrough useful. You can find the complete project in my GitHub repo.