In this article, I will walk you through all the steps required to perform canary deployments on Amazon ECS / Fargate with AWS App Mesh. Canary deployments are a pattern for rolling out releases to a subset of users or servers. In this way, new features and other updates can be tested before they go live for the entire user base.

In this example, we are going to deal with a Flask RESTful API. Once the api application is signed off for a new release, only a few users are routed to the new version. If no errors are reported, we will roll out the new version to the rest of the users.

File Structure

- api handler: /api
- api gateway: /api-gateway

Prerequisites

- Basic understanding of Docker
- Basic understanding of CloudFormation
- Set up an AWS account
- Install the latest aws-cli
- Configure aws-cli to support App Mesh APIs
- jq installed

Creating VPC

First, we will set up a VPC with public subnets. If you would like to set up ECS tasks in a VPC with private subnets and a NAT gateway, please read this tutorial from the AWS team.

Now, let's start with a CloudFormation template, ecs-vpc.yaml:

    Description: >
      A stack for deploying containerized applications in AWS Fargate.
      This stack runs containers in public VPC subnet, and includes a
      public facing load balancer to register the services in.
    Parameters:
      EnvironmentName:
        Description: An environment name that will be prefixed to resource names
        Type: String
        Default: flask
      ECSServiceLogGroupRetentionInDays:
        Type: Number
        Default: 30
      ECSServicesDomain:
        Type: String
        Description: "Domain name registered under Route-53 that will be used for Service Discovery"
        Default: flask.sample

    Mappings:
      SubnetConfig:
        VPC:
          CIDR: "10.0.0.0/16"
        PublicOne:
          CIDR: "10.0.0.0/24"
        PublicTwo:
          CIDR: "10.0.1.0/24"

    Resources:
      VPC:
        Type: AWS::EC2::VPC
        Properties:
          EnableDnsSupport: true
          EnableDnsHostnames: true
          CidrBlock: !FindInMap ["SubnetConfig", "VPC", "CIDR"]

      PublicSubnetOne:
        Type: AWS::EC2::Subnet
        Properties:
          AvailabilityZone:
            Fn::Select:
              - 0
              - Fn::GetAZs: { Ref: "AWS::Region" }
          VpcId: !Ref "VPC"
          CidrBlock: !FindInMap ["SubnetConfig", "PublicOne", "CIDR"]
          MapPublicIpOnLaunch: true

      PublicSubnetTwo:
        Type: AWS::EC2::Subnet
        Properties:
          AvailabilityZone:
            Fn::Select:
              - 1
              - Fn::GetAZs: { Ref: "AWS::Region" }
          VpcId: !Ref "VPC"
          CidrBlock: !FindInMap ["SubnetConfig", "PublicTwo", "CIDR"]
          MapPublicIpOnLaunch: true

      InternetGateway:
        Type: AWS::EC2::InternetGateway

      GatewayAttachement:
        Type: AWS::EC2::VPCGatewayAttachment
        Properties:
          VpcId: !Ref "VPC"
          InternetGatewayId: !Ref "InternetGateway"

      PublicRouteTable:
        Type: AWS::EC2::RouteTable
        Properties:
          VpcId: !Ref "VPC"

      PublicRoute:
        Type: AWS::EC2::Route
        DependsOn: GatewayAttachement
        Properties:
          RouteTableId: !Ref "PublicRouteTable"
          DestinationCidrBlock: "0.0.0.0/0"
          GatewayId: !Ref "InternetGateway"

      PublicSubnetOneRouteTableAssociation:
        Type: AWS::EC2::SubnetRouteTableAssociation
        Properties:
          SubnetId: !Ref PublicSubnetOne
          RouteTableId: !Ref PublicRouteTable

      PublicSubnetTwoRouteTableAssociation:
        Type: AWS::EC2::SubnetRouteTableAssociation
        Properties:
          SubnetId: !Ref PublicSubnetTwo
          RouteTableId: !Ref PublicRouteTable

      ECSCluster:
        Type: AWS::ECS::Cluster
        Properties:
          ClusterName: !Ref EnvironmentName

      FargateContainerSecurityGroup:
        Type: AWS::EC2::SecurityGroup
        Properties:
          GroupDescription: Access to the Fargate containers
          VpcId: !Ref "VPC"

      EcsSecurityGroupIngressFromPublicALB:
        Type: AWS::EC2::SecurityGroupIngress
        Properties:
          Description: Ingress from the public ALB
          GroupId: !Ref "FargateContainerSecurityGroup"
          IpProtocol: -1
          SourceSecurityGroupId: !Ref "PublicLoadBalancerSG"

      EcsSecurityGroupIngressFromSelf:
        Type: AWS::EC2::SecurityGroupIngress
        Properties:
          Description: Ingress from other containers in the same security group
          GroupId: !Ref "FargateContainerSecurityGroup"
          IpProtocol: -1
          SourceSecurityGroupId: !Ref "FargateContainerSecurityGroup"

      PublicLoadBalancerSG:
        Type: AWS::EC2::SecurityGroup
        Properties:
          GroupDescription: Access to the public facing load balancer
          VpcId: !Ref "VPC"
          SecurityGroupIngress:
            - CidrIp: 0.0.0.0/0
              IpProtocol: -1

      PublicLoadBalancer:
        Type: AWS::ElasticLoadBalancingV2::LoadBalancer
        Properties:
          Scheme: internet-facing
          LoadBalancerAttributes:
            - Key: idle_timeout.timeout_seconds
              Value: "30"
          Subnets:
            - !Ref PublicSubnetOne
            - !Ref PublicSubnetTwo
          SecurityGroups: [!Ref "PublicLoadBalancerSG"]

      TargetGroupPublic:
        Type: AWS::ElasticLoadBalancingV2::TargetGroup
        Properties:
          TargetType: ip
          HealthCheckIntervalSeconds: 6
          HealthCheckPath: /ping
          HealthCheckProtocol: HTTP
          HealthCheckTimeoutSeconds: 5
          HealthyThresholdCount: 2
          Name: "api"
          Port: 3000
          Protocol: HTTP
          UnhealthyThresholdCount: 2
          VpcId: !Ref "VPC"

      PublicLoadBalancerListener:
        Type: AWS::ElasticLoadBalancingV2::Listener
        DependsOn:
          - PublicLoadBalancer
        Properties:
          DefaultActions:
            - TargetGroupArn: !Ref "TargetGroupPublic"
              Type: "forward"
          LoadBalancerArn: !Ref "PublicLoadBalancer"
          Port: 80
          Protocol: HTTP

      TaskIamRole:
        Type: AWS::IAM::Role
        Properties:
          Path: /
          AssumeRolePolicyDocument: |
            {
              "Statement": [{
                "Effect": "Allow",
                "Principal": { "Service": [ "ecs-tasks.amazonaws.com" ]},
                "Action": [ "sts:AssumeRole" ]
              }]
            }
          ManagedPolicyArns:
            - arn:aws:iam::aws:policy/CloudWatchFullAccess
            - arn:aws:iam::aws:policy/AWSXRayDaemonWriteAccess

      TaskExecutionIamRole:
        Type: AWS::IAM::Role
        Properties:
          Path: /
          AssumeRolePolicyDocument: |
            {
              "Statement": [{
                "Effect": "Allow",
                "Principal": { "Service": [ "ecs-tasks.amazonaws.com" ]},
                "Action": [ "sts:AssumeRole" ]
              }]
            }
          Policies:
            - PolicyName: AmazonECSTaskExecutionRolePolicy
              PolicyDocument:
                Statement:
                  - Effect: Allow
                    Action:
                      - "ecr:GetAuthorizationToken"
                      - "ecr:BatchCheckLayerAvailability"
                      - "ecr:GetDownloadUrlForLayer"
                      - "ecr:GetRepositoryPolicy"
                      - "ecr:DescribeRepositories"
                      - "ecr:ListImages"
                      - "ecr:DescribeImages"
                      - "ecr:BatchGetImage"
                      - "logs:CreateLogStream"
                      - "logs:PutLogEvents"
                    Resource: "*"

      ECSServiceLogGroup:
        Type: "AWS::Logs::LogGroup"
        Properties:
          RetentionInDays:
            Ref: ECSServiceLogGroupRetentionInDays

      ECSServiceDiscoveryNamespace:
        Type: AWS::ServiceDiscovery::PrivateDnsNamespace
        Properties:
          Vpc: !Ref "VPC"
          Name: { Ref: ECSServicesDomain }

    Outputs:
      Cluster:
        Description: A reference to the ECS cluster
        Value: !Ref ECSCluster
        Export:
          Name: !Sub "${EnvironmentName}:ECSCluster"
      ECSServiceDiscoveryNamespace:
        Description: A SDS namespace that will be used by all services in this cluster
        Value: !Ref ECSServiceDiscoveryNamespace
        Export:
          Name: !Sub "${EnvironmentName}:ECSServiceDiscoveryNamespace"
      ECSServiceLogGroup:
        Description: Log group for services to publish logs
        Value: !Ref ECSServiceLogGroup
        Export:
          Name: !Sub "${EnvironmentName}:ECSServiceLogGroup"
      PublicLoadBalancerSG:
        Description: Security group for public LoadBalancer
        Value: !Ref PublicLoadBalancerSG
        Export:
          Name: !Sub "${EnvironmentName}:PublicLoadBalancerSG"
      FargateContainerSecurityGroup:
        Description: Security group to be used by all services in the cluster
        Value: !Ref FargateContainerSecurityGroup
        Export:
          Name: !Sub "${EnvironmentName}:FargateContainerSecurityGroup"
      TaskExecutionIamRoleArn:
        Description: Task Execution IAM role used by ECS tasks
        Value: { "Fn::GetAtt": TaskExecutionIamRole.Arn }
        Export:
          Name: !Sub "${EnvironmentName}:TaskExecutionIamRoleArn"
      TaskIamRoleArn:
        Description: IAM role to be used by ECS task
        Value: { "Fn::GetAtt": TaskIamRole.Arn }
        Export:
          Name: !Sub "${EnvironmentName}:TaskIamRoleArn"
      PublicListener:
        Description: The ARN of the public load balancer's Listener
        Value: !Ref PublicLoadBalancerListener
        Export:
          Name: !Sub "${EnvironmentName}:PublicListener"
      VPCId:
        Description: The ID of the VPC that this stack is deployed in
        Value: !Ref "VPC"
        Export:
          Name: !Sub "${EnvironmentName}:VPCId"
      PublicSubnetOne:
        Description: Public subnet one
        Value: !Ref "PublicSubnetOne"
        Export:
          Name: !Sub "${EnvironmentName}:PublicSubnetOne"
      PublicSubnetTwo:
        Description: Public subnet two
        Value: !Ref "PublicSubnetTwo"
        Export:
          Name: !Sub "${EnvironmentName}:PublicSubnetTwo"
      TargetGroupPublic:
        Description: ALB public target group
        Value: !Ref "TargetGroupPublic"
        Export:
          Name: !Sub "${EnvironmentName}:TargetGroupPublic"

Then run the aws cloudformation create-stack command to create a stack. The template creates IAM roles, so the command has to acknowledge that with --capabilities CAPABILITY_IAM:

    $ aws cloudformation create-stack --stack-name flask-sample \
        --template-body file://ecs-vpc.yaml \
        --capabilities CAPABILITY_IAM \
        --profile YOUR_PROFILE --region YOUR_REGION

Creating an App Mesh

AWS App Mesh is a service mesh that provides application-level networking to make it easy for your services to communicate with each other across multiple types of compute infrastructure. App Mesh standardizes how your services communicate, giving you end-to-end visibility and ensuring high availability for your applications.

[Figure: App Mesh perspective of the Flask api sample]

The following CF template, app-mesh.yaml, will be used to create a mesh, virtual service, virtual router, corresponding route and virtual nodes for our api application:

    Parameters:
      EnvironmentName:
        Type: String
        Description: Environment name that joins all the stacks
        Default: flask
      ServicesDomain:
        Type: String
        Description: DNS namespace used by services e.g. default.svc.cluster.local
        Default: flask.sample
      AppMeshMeshName:
        Type: String
        Description: Name of mesh
        Default: flask-mesh

    Resources:
      Mesh:
        Type: AWS::AppMesh::Mesh
        Properties:
          MeshName: !Ref AppMeshMeshName

      ApiV1VirtualNode:
        Type: AWS::AppMesh::VirtualNode
        Properties:
          MeshName: !GetAtt Mesh.MeshName
          VirtualNodeName: api-vn
          Spec:
            Listeners:
              - PortMapping:
                  Port: 3000
                  Protocol: http
                HealthCheck:
                  Protocol: http
                  Path: "/ping"
                  HealthyThreshold: 2
                  UnhealthyThreshold: 2
                  TimeoutMillis: 2000
                  IntervalMillis: 5000
            ServiceDiscovery:
              DNS:
                Hostname: !Sub "api.${ServicesDomain}"

      ApiV2VirtualNode:
        Type: AWS::AppMesh::VirtualNode
        Properties:
          MeshName: !GetAtt Mesh.MeshName
          VirtualNodeName: api-v2-vn
          Spec:
            Listeners:
              - PortMapping:
                  Port: 3000
                  Protocol: http
                HealthCheck:
                  Protocol: http
                  Path: "/ping"
                  HealthyThreshold: 2
                  UnhealthyThreshold: 2
                  TimeoutMillis: 2000
                  IntervalMillis: 5000
            ServiceDiscovery:
              DNS:
                Hostname: !Sub "api-v2.${ServicesDomain}"

      ApiVirtualRouter:
        Type: AWS::AppMesh::VirtualRouter
        Properties:
          MeshName: !GetAtt Mesh.MeshName
          VirtualRouterName: api-vr
          Spec:
            Listeners:
              - PortMapping:
                  Port: 3000
                  Protocol: http

      ApiRoute:
        Type: AWS::AppMesh::Route
        DependsOn:
          - ApiVirtualRouter
          - ApiV1VirtualNode
          - ApiV2VirtualNode
        Properties:
          MeshName: !Ref AppMeshMeshName
          VirtualRouterName: api-vr
          RouteName: api-route
          Spec:
            HttpRoute:
              Action:
                WeightedTargets:
                  - VirtualNode: api-vn
                    Weight: 1
                  - VirtualNode: api-v2-vn
                    Weight: 0
              Match:
                Prefix: "/"

      ApiVirtualService:
        Type: AWS::AppMesh::VirtualService
        DependsOn:
          - ApiVirtualRouter
        Properties:
          MeshName: !GetAtt Mesh.MeshName
          VirtualServiceName: !Sub "api.${ServicesDomain}"
          Spec:
            Provider:
              VirtualRouter:
                VirtualRouterName: api-vr

      ApiGatewayVirtualNode:
        Type: AWS::AppMesh::VirtualNode
        DependsOn:
          - ApiVirtualService
        Properties:
          MeshName: !GetAtt Mesh.MeshName
          VirtualNodeName: gateway-vn
          Spec:
            Listeners:
              - PortMapping:
                  Port: 3000
                  Protocol: http
            ServiceDiscovery:
              DNS:
                Hostname: !Sub "gateway.${ServicesDomain}"
            Backends:
              - VirtualService:
                  VirtualServiceName: !Sub "api.${ServicesDomain}"

Run the aws cloudformation create-stack command to create the mesh stack:

    $ aws cloudformation create-stack --stack-name flask-app-mesh \
        --template-body file://app-mesh.yaml \
        --profile YOUR_PROFILE --region YOUR_REGION

Pushing images to ECR

Before we can deploy our services, we need to push the docker images to ECR image repositories so that ECS can use them in the task definitions. Go to the api/ directory and create a bash script, setup-ecr.sh, to deploy the api image to the ECR api repository:

    #!/bin/bash
    set -ex

    AWS_DEFAULT_REGION="YOUR_REGION"
    AWS_PROFILE="YOUR_PROFILE"

    docker build -t flask-api .

    # Create the repository, or look it up if it already exists
    API_IMAGE="$( aws ecr create-repository --repository-name flask-api \
        --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
        --query '[repository.repositoryUri]' --output text || aws ecr describe-repositories --repository-name flask-api \
        --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
        --query '[repositories[0].repositoryUri]' --output text)"

    docker tag flask-api ${API_IMAGE}
    # "aws ecr get-login" exists only in AWS CLI v1; v2 replaced it with "aws ecr get-login-password"
    $(aws ecr get-login --no-include-email --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE})
    docker push ${API_IMAGE}

    $ ./api/setup-ecr.sh

Then move to the api-gateway/ directory and create a bash script to deploy the gateway image to the ECR gateway repository:

    #!/bin/bash
    set -ex

    AWS_DEFAULT_REGION="YOUR_REGION"
    AWS_PROFILE="YOUR_PROFILE"

    docker build -t flask-gateway .

    # Create the repository, or look it up if it already exists
    GATEWAY_IMAGE="$( aws ecr create-repository --repository-name flask-gateway \
        --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
        --query '[repository.repositoryUri]' --output text || aws ecr describe-repositories --repository-name flask-gateway \
        --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
        --query '[repositories[0].repositoryUri]' --output text)"
    echo ${GATEWAY_IMAGE}

    docker tag flask-gateway ${GATEWAY_IMAGE}
    $(aws ecr get-login --no-include-email --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE})
    docker push ${GATEWAY_IMAGE}

    $ ./api-gateway/setup-ecr.sh

Creating Task definition

A Task Definition is a blueprint that describes how a docker container should launch. We need to create ECS task definitions for our gateway and api handlers and make the tasks compatible with App Mesh and X-Ray. Below is the example JSON for the Amazon ECS task definition of the Flask gateway (task-definition-gateway.json, the file the script further down reads). The $-prefixed names are jq variables filled in by that script:

    {
      "family": "gateway",
      "proxyConfiguration": {
        "type": "APPMESH",
        "containerName": "envoy",
        "properties": [
          { "name": "IgnoredUID", "value": "1337" },
          { "name": "ProxyIngressPort", "value": "15000" },
          { "name": "ProxyEgressPort", "value": "15001" },
          { "name": "AppPorts", "value": "9080" },
          { "name": "EgressIgnoredIPs", "value": "169.254.170.2,169.254.169.254" }
        ]
      },
      "containerDefinitions": [
        {
          "name": "app",
          "image": $APP_IMAGE,
          "portMappings": [
            { "containerPort": 3000, "hostPort": 3000, "protocol": "tcp" }
          ],
          "environment": [
            { "name": "API_ENDPOINT", "value": "http://api.flask.sample" },
            { "name": "SERVER_PORT", "value": "3000" }
          ],
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-group": $SERVICE_LOG_GROUP,
              "awslogs-region": "ap-southeast-2",
              "awslogs-stream-prefix": "gateway"
            }
          },
          "essential": true,
          "dependsOn": [
            { "containerName": "envoy", "condition": "HEALTHY" }
          ]
        },
        {
          "name": "envoy",
          "image": "111345817488.dkr.ecr.us-west-2.amazonaws.com/aws-appmesh-envoy:v1.9.1.0-prod",
          "user": "1337",
          "essential": true,
          "ulimits": [
            { "name": "nofile", "hardLimit": 15000, "softLimit": 15000 }
          ],
          "portMappings": [
            { "containerPort": 9901, "hostPort": 9901, "protocol": "tcp" },
            { "containerPort": 15000, "hostPort": 15000, "protocol": "tcp" },
            { "containerPort": 15001, "hostPort": 15001, "protocol": "tcp" }
          ],
          "environment": [
            { "name": "APPMESH_VIRTUAL_NODE_NAME", "value": "mesh/flask-mesh/virtualNode/gateway-vn" },
            { "name": "ENVOY_LOG_LEVEL", "value": $ENVOY_LOG_LEVEL },
            { "name": "ENABLE_ENVOY_XRAY_TRACING", "value": "1" },
            { "name": "ENABLE_ENVOY_STATS_TAGS", "value": "1" }
          ],
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-group": $SERVICE_LOG_GROUP,
              "awslogs-region": "ap-southeast-2",
              "awslogs-stream-prefix": "gateway-envoy"
            }
          },
          "healthCheck": {
            "command": [
              "CMD-SHELL",
              "curl -s http://localhost:9901/server_info | grep state | grep -q LIVE"
            ],
            "interval": 5,
            "timeout": 2,
            "retries": 3
          }
        },
        {
          "name": "xray-daemon",
          "image": "amazon/aws-xray-daemon",
          "user": "1337",
          "essential": true,
          "cpu": 32,
          "memoryReservation": 256,
          "portMappings": [
            { "hostPort": 2000, "containerPort": 2000, "protocol": "udp" }
          ],
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-group": $SERVICE_LOG_GROUP,
              "awslogs-region": "ap-southeast-2",
              "awslogs-stream-prefix": "gateway-xray"
            }
          }
        }
      ],
      "taskRoleArn": $TASK_ROLE_ARN,
      "executionRoleArn": $EXECUTION_ROLE_ARN,
      "requiresCompatibilities": [ "FARGATE", "EC2" ],
      "networkMode": "awsvpc",
      "cpu": "256",
      "memory": "512"
    }

Now we need to create a bash script, setup-task-def.sh, that registers the api gateway task definition:

    #!/bin/bash
    set -ex

    DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null && pwd )"

    AWS_DEFAULT_REGION="YOUR_REGION"
    AWS_PROFILE="YOUR_PROFILE"

    # Read the outputs exported by the flask-sample (VPC / cluster) stack
    cluster_stack_output=$(aws --profile "${AWS_PROFILE}" --region "${AWS_DEFAULT_REGION}" \
        cloudformation describe-stacks --stack-name "flask-sample" \
        | jq '.Stacks[].Outputs[]')

    task_role_arn=($(echo $cluster_stack_output \
        | jq -r 'select(.OutputKey == "TaskIamRoleArn") | .OutputValue'))
    echo ${task_role_arn}

    execution_role_arn=($(echo $cluster_stack_output \
        | jq -r 'select(.OutputKey == "TaskExecutionIamRoleArn") | .OutputValue'))

    ecs_service_log_group=($(echo $cluster_stack_output \
        | jq -r 'select(.OutputKey == "ECSServiceLogGroup") | .OutputValue'))

    envoy_log_level="debug"

    GATEWAY_IMAGE="$( aws ecr describe-repositories \
        --repository-name flask-gateway --region ${AWS_DEFAULT_REGION} \
        --profile ${AWS_PROFILE} --query '[repositories[0].repositoryUri]' --output text)"

    #Gateway Task Definition
    task_def_json=$(jq -n \
        --arg APP_IMAGE $GATEWAY_IMAGE \
        --arg SERVICE_LOG_GROUP $ecs_service_log_group \
        --arg TASK_ROLE_ARN $task_role_arn \
        --arg EXECUTION_ROLE_ARN $execution_role_arn \
        --arg ENVOY_LOG_LEVEL $envoy_log_level \
        -f "${DIR}/task-definition-gateway.json")

    task_def_arn=$(aws --profile "${AWS_PROFILE}" --region "${AWS_DEFAULT_REGION}" \
        ecs register-task-definition \
        --cli-input-json "${task_def_json}" \
        --query [taskDefinition.taskDefinitionArn] --output text)

Then run the command:

    $ ./setup-task-def.sh

The task definitions for api v1 and v2 are very similar to the one above; you can find the bash scripts in GitHub repo/api/.

Creating Services that run the Task Definition

The command to create an ECS service takes quite a few parameters, so it is easier to use a CloudFormation template as input. Let's create a file, ecs-services.yaml, with the following:

    Parameters:
      EnvironmentName:
        Type: String
        Description: Environment name that joins all the stacks
        Default: flask
      AppMeshMeshName:
        Type: String
        Description: Name of mesh
        Default: flask-mesh
      ECSServicesDomain:
        Type: String
        Description: DNS namespace used by services e.g.
          default.svc.cluster.local
        Default: flask.sample
      GatewayTaskDefinition:
        Type: String
        Description: Task definition for Gateway Service
      ApiV1TaskDefinition:
        Type: String
        Description: Task definition for Api v1
      ApiV2TaskDefinition:
        Type: String
        Description: Task definition for Api v2
      VpcCIDR:
        Description: Please enter the IP range (CIDR notation) for this VPC
        Type: String
        Default: 10.0.0.0/16

    Resources:
      ECSServiceSecurityGroup:
        Type: AWS::EC2::SecurityGroup
        Properties:
          GroupDescription: "Security group for the service"
          VpcId:
            "Fn::ImportValue": !Sub "${EnvironmentName}:VPCId"
          SecurityGroupIngress:
            - CidrIp: !Ref VpcCIDR
              IpProtocol: -1

      ApiV1ServiceDiscoveryRecord:
        Type: "AWS::ServiceDiscovery::Service"
        Properties:
          Name: "api"
          DnsConfig:
            NamespaceId:
              "Fn::ImportValue": !Sub "${EnvironmentName}:ECSServiceDiscoveryNamespace"
            DnsRecords:
              - Type: A
                TTL: 300
          HealthCheckCustomConfig:
            FailureThreshold: 1

      ApiV1Service:
        Type: "AWS::ECS::Service"
        Properties:
          ServiceName: "api"
          Cluster:
            "Fn::ImportValue": !Sub "${EnvironmentName}:ECSCluster"
          DeploymentConfiguration:
            MaximumPercent: 200
            MinimumHealthyPercent: 100
          DesiredCount: 1
          LaunchType: FARGATE
          ServiceRegistries:
            - RegistryArn:
                "Fn::GetAtt": ApiV1ServiceDiscoveryRecord.Arn
          NetworkConfiguration:
            AwsvpcConfiguration:
              AssignPublicIp: ENABLED
              SecurityGroups:
                - !Ref ECSServiceSecurityGroup
              Subnets:
                - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetOne"
                - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetTwo"
          TaskDefinition: { Ref: ApiV1TaskDefinition }

      ApiV2ServiceDiscoveryRecord:
        Type: "AWS::ServiceDiscovery::Service"
        Properties:
          Name: "api-v2"
          DnsConfig:
            NamespaceId:
              "Fn::ImportValue": !Sub "${EnvironmentName}:ECSServiceDiscoveryNamespace"
            DnsRecords:
              - Type: A
                TTL: 300
          HealthCheckCustomConfig:
            FailureThreshold: 1

      GatewayServiceDiscoveryRecord:
        Type: "AWS::ServiceDiscovery::Service"
        Properties:
          Name: "gateway"
          DnsConfig:
            NamespaceId:
              "Fn::ImportValue": !Sub "${EnvironmentName}:ECSServiceDiscoveryNamespace"
            DnsRecords:
              - Type: A
                TTL: 300
          HealthCheckCustomConfig:
            FailureThreshold: 1

      ApiV2Service:
        Type: "AWS::ECS::Service"
        Properties:
          ServiceName: "api-v2"
          Cluster:
            "Fn::ImportValue": !Sub "${EnvironmentName}:ECSCluster"
          DeploymentConfiguration:
            MaximumPercent: 200
            MinimumHealthyPercent: 100
          DesiredCount: 1
          LaunchType: FARGATE
          ServiceRegistries:
            - RegistryArn:
                "Fn::GetAtt": ApiV2ServiceDiscoveryRecord.Arn
          NetworkConfiguration:
            AwsvpcConfiguration:
              AssignPublicIp: ENABLED
              SecurityGroups:
                - !Ref ECSServiceSecurityGroup
              Subnets:
                - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetOne"
                - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetTwo"
          TaskDefinition: { Ref: ApiV2TaskDefinition }

      GatewayService:
        Type: "AWS::ECS::Service"
        Properties:
          ServiceName: "gateway"
          Cluster:
            "Fn::ImportValue": !Sub "${EnvironmentName}:ECSCluster"
          DeploymentConfiguration:
            MaximumPercent: 200
            MinimumHealthyPercent: 100
          DesiredCount: 1
          LaunchType: FARGATE
          ServiceRegistries:
            - RegistryArn:
                "Fn::GetAtt": GatewayServiceDiscoveryRecord.Arn
          NetworkConfiguration:
            AwsvpcConfiguration:
              AssignPublicIp: ENABLED
              SecurityGroups:
                - !Ref ECSServiceSecurityGroup
              Subnets:
                - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetOne"
                - "Fn::ImportValue": !Sub "${EnvironmentName}:PublicSubnetTwo"
          TaskDefinition: { Ref: GatewayTaskDefinition }
          LoadBalancers:
            - ContainerName: app
              ContainerPort: 3000
              TargetGroupArn:
                "Fn::ImportValue": !Sub "${EnvironmentName}:TargetGroupPublic"

Next, create a bash script, ecs-services-stack.sh:

    #!/bin/bash
    set -ex

    AWS_DEFAULT_REGION="YOUR_REGION"
    AWS_PROFILE="YOUR_PROFILE"

    DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null && pwd )"

    # Latest registered revision of each task definition family
    task_api_arn=$(aws ecs list-task-definitions --family-prefix api \
        --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
        --sort DESC \
        --query '[taskDefinitionArns[0]]' --output text)

    task_api_v2_arn=$(aws ecs list-task-definitions --family-prefix api-v2 \
        --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
        --sort DESC \
        --query '[taskDefinitionArns[0]]' --output text)

    task_gateway_arn=$(aws ecs list-task-definitions --family-prefix gateway \
        --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
        --sort DESC \
        --query '[taskDefinitionArns[0]]' --output text)

    aws cloudformation --region ${AWS_DEFAULT_REGION} --profile ${AWS_PROFILE} \
        deploy --stack-name "flask-ecs-service" \
        --capabilities CAPABILITY_IAM \
        --template-file "${DIR}/ecs-services.yaml" \
        --parameter-overrides \
        GatewayTaskDefinition="${task_gateway_arn}" \
        ApiV1TaskDefinition="${task_api_arn}" \
        ApiV2TaskDefinition="${task_api_v2_arn}"

and run the following command:

    $ ./ecs-services-stack.sh

Now we have set up everything we need. We can go to the AWS console to review what we have created.

[Screenshots: CloudFormation Console; ECS Task Definition; ECS cluster and services]

Verify the Fargate deployment

Once we have deployed our api application, we can curl the frontend service (gateway) to test it. To get the endpoint, open the AWS EC2 console; on the navigation pane, under LOAD BALANCING, choose Load Balancers and select the load balancer we just created. Its DNS name is the endpoint. Then run the curl command:

    $ curl flask-Publi-xxxxx-xxxxx.ap-southeast-2.elb.amazonaws.com/todos/todo1
    {
      "todo": {
        "task": "build an API"
      },
      "version": "1"
    }

Notice that all the services of the application are reflecting version 1. Now, it's time for us to perform a canary deployment of api v2.

Canary Deployment of Api v2

We can manage the target weight (WeightedTargets) in the ApiRoute resource of app-mesh.yaml as below:

    ApiRoute:
      Type: AWS::AppMesh::Route
      DependsOn:
        - ApiVirtualRouter
        - ApiV1VirtualNode
        - ApiV2VirtualNode
      Properties:
        MeshName: !Ref AppMeshMeshName
        VirtualRouterName: api-vr
        RouteName: api-route
        Spec:
          HttpRoute:
            Action:
              WeightedTargets:
                - VirtualNode: api-vn
                  Weight: 2
                - VirtualNode: api-v2-vn
                  Weight: 1
            Match:
              Prefix: "/"

and re-deploy the template.
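The weights in WeightedTargets are relative: each virtual node receives its weight divided by the sum of all weights on the route. As a rough illustration (not part of the original project), the sketch below converts a set of weights into traffic shares and builds a stepwise rollout schedule. The helper names and the 10/25/50/100 schedule are assumptions made up for this example; only the virtual node names come from the template above.

```python
def traffic_share(weights):
    """Return the fraction of traffic each virtual node receives.

    `weights` maps a virtual node name to its route weight.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one target needs a positive weight")
    return {node: weight / total for node, weight in weights.items()}


def canary_steps(percentages=(10, 25, 50, 100)):
    """Build weight pairs for a gradual rollout (hypothetical schedule)."""
    return [{"api-vn": 100 - p, "api-v2-vn": p} for p in percentages]


if __name__ == "__main__":
    # Weights 2 and 1 send roughly two thirds of requests to v1, one third to v2.
    for node, share in traffic_share({"api-vn": 2, "api-v2-vn": 1}).items():
        print("%s: %.1f%%" % (node, share * 100))

    for step in canary_steps():
        print(step)
```

Each dictionary returned by canary_steps could be copied into the Weight fields of the ApiRoute resource, one re-deploy per step, checking for errors before moving on to the next step.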
Alternatively, we can open the AWS App Mesh Console, choose the Mesh we created, select Virtual Routers, open the route and edit it, and set the target traffic weight in the Targets section.

Integrating AWS X-Ray with AWS App Mesh

AWS X-Ray helps us to monitor and analyze distributed microservice applications through request tracing, providing an end-to-end view of requests traveling through the application so we can identify the root cause of errors and performance issues. We'll use X-Ray to provide a visual map of how App Mesh is distributing traffic and inspect traffic latency through our routes.

In the task definition step, we already defined the X-Ray container in the task definitions. However, X-Ray can only run in local mode with Fargate, so we need to manually create X-Ray segments to track traffic between the gateway and api v1 & v2. We will manually create a trace id and a segment, then pass the trace id, and the segment id as parent id, to the api handlers. The api-gateway/main.py should look like:

    from flask import Flask
    from flask_restful import Resource, Api, reqparse
    from aws_xray_sdk.core import xray_recorder
    from aws_xray_sdk.core import patch_all
    from aws_xray_sdk.core.models.traceid import TraceId
    from aws_xray_sdk.core.models import http
    import requests
    import os
    import json
    import logging

    patch_all()

    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger(__name__)

    API_ENDPOINT = os.environ['API_ENDPOINT']
    SERVER_PORT = os.environ['SERVER_PORT']

    # xray_recorder.configure(
    #     sampling=False,
    #     context_missing='LOG_ERROR',
    #     plugins=('EC2Plugin', 'ECSPlugin'),
    #     service='Flask Gateway'
    # )

    app = Flask(__name__)
    api = Api(app)

    # Trace id that is forwarded to the api handlers
    traceid = TraceId().to_id()

    parser = reqparse.RequestParser()
    # Declare the 'task' field so that parse_args() returns it
    parser.add_argument('task')


    class Ping(Resource):
        def get(self):
            return {'response': 'ok'}


    class TodoList(Resource):
        def __init__(self):
            # Open a segment manually and keep it so its id can be sent as parent id
            self.segment = xray_recorder.begin_segment('gateway_todoS', traceid=traceid, sampling=1)
            self.segment.put_http_meta(http.URL, 'gateway.flask.sample')
            logger.info("Request todos from gateway")

        def __del__(self):
            xray_recorder.end_segment()

        def get(self):
            logger.info("Request todo from gateway, parentid is %s" % (self.segment.id))
            r = requests.get(url='%s:%s/todos' % (API_ENDPOINT, SERVER_PORT),
                             headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
            return r.json()

        def post(self):
            args = parser.parse_args()
            r = requests.post(url='%s:%s/todos' % (API_ENDPOINT, SERVER_PORT), json=args,
                              headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
            return r.json(), 201


    class Todo(Resource):
        def __init__(self):
            self.segment = xray_recorder.begin_segment('gateway_todo', traceid=traceid, sampling=1)
            self.segment.put_http_meta(http.URL, 'gateway.flask.sample')
            logger.info("Request todo from gateway")

        def __del__(self):
            xray_recorder.end_segment()

        def get(self, todo_id):
            r = requests.get(url='%s:%s/todos/%s' % (API_ENDPOINT, SERVER_PORT, todo_id),
                             headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
            return r.json()

        def delete(self, todo_id):
            r = requests.delete(url='%s:%s/todos/%s' % (API_ENDPOINT, SERVER_PORT, todo_id),
                                headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
            return r.json(), 204

        def put(self, todo_id):
            args = parser.parse_args()
            task = {'task': args['task']}
            r = requests.put(url='%s:%s/todos/%s' % (API_ENDPOINT, SERVER_PORT, todo_id), json=task,
                             headers={'x-traceid': traceid, 'x-parentid': self.segment.id})
            return r.json(), 201


    api.add_resource(TodoList, '/todos')
    api.add_resource(Todo, '/todos/<todo_id>')
    api.add_resource(Ping, '/ping')

    if __name__ == '__main__':
        app.run(debug=True, host='0.0.0.0', port=3000)

After all services are deployed successfully, we can monitor the traffic we are sending to the application frontend (gateway).
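To make the two forwarded values more concrete, here is a small standard-library sketch of what they look like. It assumes the X-Ray id layout (a trace id of the form 1-<8 hex digits of epoch seconds>-<24 random hex digits>, and a 16 hex digit segment id) and reuses the x-traceid / x-parentid header names from the gateway. The function names are hypothetical, and in the real gateway the ids come from the X-Ray SDK rather than from code like this.

```python
import binascii
import os
import time


def new_trace_id(now=None):
    """Build an id in the X-Ray trace id layout: 1-<epoch hex>-<96 random bits>."""
    epoch = int(time.time() if now is None else now)
    random_part = binascii.hexlify(os.urandom(12)).decode("ascii")
    return "1-%08x-%s" % (epoch, random_part)


def new_segment_id():
    """Build a 64-bit segment id as 16 hex digits."""
    return binascii.hexlify(os.urandom(8)).decode("ascii")


def propagation_headers(trace_id, parent_segment_id):
    """Headers the gateway sends so a handler can attach its segment to the same trace."""
    return {"x-traceid": trace_id, "x-parentid": parent_segment_id}


if __name__ == "__main__":
    headers = propagation_headers(new_trace_id(), new_segment_id())
    print(headers)
```

On the receiving side, an api handler would read these two headers and start its own segment with the same trace id and the gateway's segment id as parent, which is what lets X-Ray link the gateway and the api versions into one trace.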
Open the AWS X-Ray console and request the api application on the /todos route to see the traces.

Clean it all up

It is quickest to use the CloudFormation Console to delete the following stacks, in this order:

- flask-ecs-service
- flask-app-mesh
- flask-sample

That's about it! I hope you have found this walkthrough useful. You can find the complete project in my GitHub repo.