Infrastructure as Code is one of the cool things right now. Every DevOps-related conference in the past two years had a talk or two about the subject, and that’s a good thing.
In the wake of the DevOps movement, HashiCorp emerged as one of the most respected companies in that space. Today I’m going to talk about one of their products: Terraform.
Terraform is a tool that lets you manage cloud resources declaratively. Using a simple configuration language (HCL), you define the shape of a cloud infrastructure: VPCs, Subnets, Compute Instances, Load Balancers, DNS Records and so on. It works with every major cloud provider, but it’s not cloud-agnostic: you can create, for example, a Load Balancer in AWS or Google Cloud, but the code will be different for each of them.
Blue/Green deployment is a DevOps practice that aims to reduce downtime during updates by creating a new copy of the desired component while keeping the current one running. You end up with two versions of the system: one with the current version (blue) and another with the newer one (green). When the new version is up and running, you can seamlessly switch traffic to it. This is useful not only to reduce downtime, but also to speed up rollbacks when something goes wrong.
While Blue/Green deployment is a technique more commonly used with application deployment, the reduced costs of the cloud, in conjunction with the tools we have right now, make it possible to have two copies of an entire cloud infrastructure with little to no pain.
It is important to note that Blue/Green deployment of an entire Cloud Infrastructure is not a silver bullet, and it is certainly overkill for small changes (for example, adding a new EC2 Instance to your stack). But for major or breaking changes it is a win, and I personally recommend it.
I’ll be using Amazon Web Services for this tutorial, but the code won’t vary too much with another provider.
After finishing this, you will be able to create an infrastructure containing:
Then, you will be able to:
The full example can be seen on https://github.com/santiagopoli/terraform-examples/tree/master/blue-green
To follow this tutorial, you need to have your AWS Credentials configured in your Environment, with at least the AmazonEC2FullAccess policy attached and permissions on the S3 bucket that will store the Terraform state.
I know this is a Terraform tutorial, but a recommended practice is to have a manually created VPC. You can create VPCs with Terraform, but a lot of external services rely on knowing your VPC ID beforehand, so it is better not to create a new one on every Blue/Green deployment.
Also, you may have security groups that are created externally by another team in your organization. For that reason, we will create the VPC manually using the AWS Console. You can also create a VPC from the command line by running:
(change the CIDR block to anything you like)
> aws ec2 create-vpc --cidr-block 10.0.0.0/16
{
  "Vpc": {
    "VpcId": "vpc-ff7bbf86",
    "InstanceTenancy": "default",
    "Tags": [],
    "CidrBlockAssociations": [
      {
        "AssociationId": "vpc-cidr-assoc-6e42b505",
        "CidrBlock": "10.0.0.0/16",
        "CidrBlockState": {"State": "associated"}
      }
    ],
    "Ipv6CidrBlockAssociationSet": [],
    "State": "pending",
    "DhcpOptionsId": "dopt-38f7a057",
    "CidrBlock": "10.0.0.0/16",
    "IsDefault": false
  }
}
Note your VpcId, as you will need it in a second.
You can download Terraform from its official downloads page or install it with a package manager (brew, apt).
Create a new folder in your workspace and name it terraform_blue_green. Then, initialize a Git repository, add a simple .gitignore that ignores the .terraform folder and open the folder with your favorite text editor. In my case, I’ll be using Visual Studio Code.
> mkdir terraform_blue_green
> cd terraform_blue_green
> git init
> echo .terraform >> .gitignore
> code .
Terraform stores the state of the infrastructure in a JSON File. It is recommended (and required for this tutorial) to store that file on an external backend like Amazon S3. As I’m using AWS for this Tutorial, I’ll stick to S3, but Terraform supports the equivalent in each provider.
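First of all, you need to create the S3 bucket in which the state will reside. Bucket names are globally unique, so you will probably have to pick your own, and outside us-east-1 you also need to pass --create-bucket-configuration LocationConstraint=<your-region>. You can do this either by going to the S3 Console or by running:
> aws s3api create-bucket --bucket terraform-bluegreen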
Then, create a file named bootstrap.tf inside the project folder, with this content.
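A minimal sketch of what bootstrap.tf could contain, written for Terraform 0.12 or later. The us-west-2 region and the v1/terraform.tfstate key are assumptions made for illustration; use the region your bucket actually lives in.

```hcl
# bootstrap.tf (sketch, not the author's exact file)

terraform {
  # Remote state stored in the S3 bucket created above.
  backend "s3" {
    bucket = "terraform-bluegreen"
    # Assumed key layout: one state file per infrastructure version.
    key    = "v1/terraform.tfstate"
    region = "us-west-2"
  }
}

provider "aws" {
  region = "us-west-2" # assumed region
}

# Version label used to name resources so that two versions can coexist.
variable "infrastructure_version" {
  default = "1"
}
```

Keeping the version in both the state key and a variable is what later lets a second copy of the infrastructure live next to the first one.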
In this file we have defined the S3 backend that stores the state, the AWS provider and the infrastructure_version variable.
With that file in place, run this command in your project folder:
> terraform init
Initializing the backend...
Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.
Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "aws" (1.11.0)...
Terraform has been successfully initialized!
As we’ll need the ID of the previously created VPC to do anything in our infrastructure, we will be storing it in a variable. To do this, create a file named vpc.tf with this content:
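A minimal sketch of vpc.tf, assuming the VpcId returned by the create-vpc call shown earlier; replace the default with your own ID.

```hcl
# vpc.tf (sketch)

# ID of the manually created VPC. It is shared by every version
# of the infrastructure, so it is never created or destroyed here.
variable "vpc_id" {
  description = "ID of the manually created VPC"
  default     = "vpc-ff7bbf86" # replace with your own VpcId
}
```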
To do anything useful, we first need subnets. We will create three of them, each in a different Availability Zone. Create a file named subnets.tf, with this content:
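A sketch of what subnets.tf could look like, for Terraform 0.12 or later. It assumes the vpc_id and infrastructure_version variables from the previous files, the 10.0.0.0/16 VPC range used earlier, and a hypothetical availability_zones variable. The CIDR offset scheme is also an assumption: it is one simple way to keep the subnets of two versions from overlapping inside the same VPC.

```hcl
# subnets.tf (sketch)

# Hypothetical variable: the three Availability Zones to spread across.
variable "availability_zones" {
  default = ["us-west-2a", "us-west-2b", "us-west-2c"]
}

resource "aws_subnet" "terraform-blue-green" {
  count  = 3
  vpc_id = var.vpc_id

  # One subnet per Availability Zone.
  availability_zone = var.availability_zones[count.index]

  # Carve /24 blocks out of the VPC range, offset by the version number,
  # so version 1 uses 10.0.3.0/24 to 10.0.5.0/24 and version 2 uses
  # 10.0.6.0/24 to 10.0.8.0/24.
  cidr_block = cidrsubnet(
    "10.0.0.0/16",
    8,
    tonumber(var.infrastructure_version) * 3 + count.index
  )

  # Give instances launched here a public IP.
  map_public_ip_on_launch = true

  tags = {
    Name = "terraform-blue-green-v${var.infrastructure_version}-${count.index}"
  }
}
```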
In this file we created three subnets specifying:
With the file in place, first do:
> terraform plan
+ aws_subnet.terraform-blue-green[0]...
+ aws_subnet.terraform-blue-green[1]...
+ aws_subnet.terraform-blue-green[2]...
Plan: 3 to add, 0 to change, 0 to destroy.
The plan command does a dry run and tells you what changes will be done. It is important to plan before doing anything, as you can spot errors. In this case, the plan tells us that it will add three subnets, and that’s what we wanted.
So now we can run this:
> terraform apply
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
aws_subnet.terraform-blue-green[0]: Creating...
...
aws_subnet.terraform-blue-green[1]: Creating...
...
aws_subnet.terraform-blue-green[2]: Creating...
...
aws_subnet.terraform-blue-green[0]: Creation complete after 5s
aws_subnet.terraform-blue-green[1]: Creation complete after 5s
aws_subnet.terraform-blue-green[2]: Creation complete after 5s
Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
Now, you can go to the AWS Console (under the VPC/Subnets section) and your subnets should appear
To be able to access our resources in the future, we need to create a Security Group in our VPC. For the sake of simplicity, we will be creating a Security Group that enables all inbound traffic from everywhere.
Create a file named security_groups.tf in your project, with this content:
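A sketch of security_groups.tf for Terraform 0.12 or later, assuming the vpc_id and infrastructure_version variables defined earlier. The resource names match the ones that appear in the destroy output later in this article; the group name and description are illustrative.

```hcl
# security_groups.tf (sketch)

resource "aws_security_group" "terraform-blue-green" {
  name        = "terraform-blue-green-v${var.infrastructure_version}"
  description = "Terraform Blue/Green"
  vpc_id      = var.vpc_id
}

# Allow all inbound traffic from everywhere (tutorial only).
resource "aws_security_group_rule" "terraform-blue-green-inbound" {
  type              = "ingress"
  security_group_id = aws_security_group.terraform-blue-green.id
  from_port         = 0
  to_port           = 0
  protocol          = "-1" # all protocols
  cidr_blocks       = ["0.0.0.0/0"]
}

# Allow all outbound traffic.
resource "aws_security_group_rule" "terraform-blue-green-outbound" {
  type              = "egress"
  security_group_id = aws_security_group.terraform-blue-green.id
  from_port         = 0
  to_port           = 0
  protocol          = "-1"
  cidr_blocks       = ["0.0.0.0/0"]
}
```

For anything beyond a tutorial you would narrow the inbound rule to the ports and source ranges you actually need.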
In this file, we’ve created a security group for our VPC (using its VPC ID) and two Rules: One for Inbound traffic and one for Outbound traffic. The most important parts are:
With the file in place, run a terraform plan and terraform apply. After that, we should be able to see our security group in the AWS Console (under EC2/Security Groups).
To be able to access an AWS Instance later, we need to assign an SSH Key to it.
First, create a Key Pair by using ssh-keygen:
> mkdir keypairs
> ssh-keygen -f keypairs/keypair -P ""
Generating public/private rsa key pair.
Your identification has been saved in keypairs/keypair.
Your public key has been saved in keypairs/keypair.pub.
Given this is a tutorial, don’t bother moving the Private Key to a secure place (but you should definitely do it).
Then, create a file named keypairs.tf in the root folder of the project. Give it this content:
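A sketch of keypairs.tf for Terraform 0.12 or later, assuming the public key generated above and the infrastructure_version variable. The key name is illustrative.

```hcl
# keypairs.tf (sketch)

resource "aws_key_pair" "terraform-blue-green" {
  # Versioned name, so the key pairs of two versions do not collide.
  key_name = "terraform-blue-green-v${var.infrastructure_version}"

  # Public half of the key pair created with ssh-keygen.
  public_key = file("${path.module}/keypairs/keypair.pub")
}
```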
Then do:
> terraform plan
+ aws_key_pair.terraform-blue-green...
Plan: 1 to add, 0 to change, 0 to destroy.
> terraform apply
Terraform will perform the following actions:
+ aws_key_pair.terraform-blue-green...
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Now, the Key appears on the AWS Console (under EC2/Key Pairs)
Create a file named instances.tf, and paste the following:
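A sketch of instances.tf for Terraform 0.12 or later. It assumes the subnet, security group and key pair resources defined in the previous files. The AMI lookup (a Canonical Ubuntu image) and the user_data script that installs NGINX are assumptions; the original AMI is not stated here.

```hcl
# instances.tf (sketch)

# Assumed image: the most recent Ubuntu 22.04 AMI published by Canonical.
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

resource "aws_instance" "terraform-blue-green" {
  count         = 3
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"

  # One instance per subnet, and therefore per Availability Zone.
  subnet_id              = aws_subnet.terraform-blue-green[count.index].id
  vpc_security_group_ids = [aws_security_group.terraform-blue-green.id]
  key_name               = aws_key_pair.terraform-blue-green.key_name

  # Install NGINX on first boot so each instance serves a page on port 80.
  user_data = <<-EOF
    #!/bin/bash
    apt-get update -y
    apt-get install -y nginx
  EOF

  tags = {
    Name = "terraform-blue-green-v${var.infrastructure_version}-${count.index}"
  }
}

output "instance_public_ips" {
  value = aws_instance.terraform-blue-green[*].public_ip
}
```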
Let’s explain a little bit about this file:
We have created a resource of type aws_instance, with these parameters:
With the file in place, run:
> terraform plan
Terraform will perform the following actions:
+ aws_instance.terraform-blue-green[0]...
+ aws_instance.terraform-blue-green[1]...
+ aws_instance.terraform-blue-green[2]...
Plan: 3 to add, 0 to change, 0 to destroy.
> terraform apply
Terraform will perform the following actions:
+ aws_instance.terraform-blue-green[0]...
+ aws_instance.terraform-blue-green[1]...
+ aws_instance.terraform-blue-green[2]...
Plan: 3 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
....
Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
Outputs:
instance_public_ips = [ip1,ip2,ip3]
When the command finishes, you will be able to see the instances in your AWS Console
As you can see, each of them is in a different Availability Zone.
Accessing any of your instances (via their public IP) from a browser should display:
Create a file named load_balancers.tf with this content:
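A sketch of load_balancers.tf for Terraform 0.12 or later, using a Classic Load Balancer (aws_elb, the resource type that appears in the destroy output later on). It assumes the subnets, security group and instances from the previous files; the listener and health check values are illustrative.

```hcl
# load_balancers.tf (sketch)

resource "aws_elb" "terraform-blue-green" {
  name            = "terraform-blue-green-v${var.infrastructure_version}"
  subnets         = aws_subnet.terraform-blue-green[*].id
  security_groups = [aws_security_group.terraform-blue-green.id]
  instances       = aws_instance.terraform-blue-green[*].id

  # Forward HTTP traffic on port 80 to NGINX on the instances.
  listener {
    lb_port           = 80
    lb_protocol       = "http"
    instance_port     = 80
    instance_protocol = "http"
  }

  # Only route traffic to instances that answer on port 80.
  health_check {
    target              = "HTTP:80/"
    interval            = 30
    timeout             = 3
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
}

output "load_balancer_dns" {
  value = aws_elb.terraform-blue-green.dns_name
}
```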
In this file we’ve created a Load Balancer with
With the file in place, do a terraform plan and a terraform apply. When the execution ends, it should output the Load Balancer’s public DNS name.
> terraform apply
....
Outputs:...
load_balancer_dns = terraform-blue-green-v1-xxxxxx.us-west-2.elb.amazonaws.com
Accessing the public DNS of the Load Balancer from a browser should display the NGINX page.
You should also be able to see it in the AWS Console (under EC2/Load Balancers)
(Optional) Assign A DNS Record to the Load Balancer V1
I’m not going to cover too much of this case, but what I’ve ended up doing in production is creating a DNS Record that points to a specific version of the Load Balancer. An example of this in terraform could be:
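A sketch of such a record for Terraform 0.12 or later. The hosted_zone_id variable and the app.example.com name are hypothetical placeholders; the alias targets the aws_elb resource from load_balancers.tf.

```hcl
# dns.tf (sketch; the zone ID and record name are placeholders)

variable "hosted_zone_id" {
  description = "ID of an existing Route 53 Hosted Zone"
  default     = "ZXXXXXXXXXXXXX" # hypothetical, replace with your own
}

resource "aws_route53_record" "terraform-blue-green" {
  zone_id = var.hosted_zone_id
  name    = "app.example.com" # hypothetical record name
  type    = "A"

  # Alias record pointing at this version's load balancer.
  alias {
    name                   = aws_elb.terraform-blue-green.dns_name
    zone_id                = aws_elb.terraform-blue-green.zone_id
    evaluate_target_health = true
  }
}
```

This record is not part of the 11 resources counted in the plans shown in this article, and if it lived in a version's state it would be removed when that version is destroyed.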
Commit your changes so far
> git add .
> git commit -m "Version 1"
DevOps is not all about automation. In some cases, it’s good practice to keep a small amount of human interaction. In our case, we will manually assign a DNS record to the desired version of the infrastructure (via the load balancer).
To be able to perform this step you’ll need to have a registered domain and the corresponding Route 53 Hosted Zone.
Enter the desired Hosted Zone and create an A Record with an Alias of the previously created load balancer (terraform-blue-green-v1…).
This is the entry point of your system and what your clients will be accessing.
First, create a new branch in your repository (and I seriously recommend removing the .terraform folder):
> git checkout -b v2
> rm -rf .terraform
Now, modify bootstrap.tf with this:
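A sketch of the modified bootstrap.tf, under the same assumptions as before (Terraform 0.12 or later, the us-west-2 region, and a per-version state key, none of which are confirmed by the original). Only the state key and the version variable differ from Version 1.

```hcl
# bootstrap.tf on the v2 branch (sketch)

terraform {
  backend "s3" {
    bucket = "terraform-bluegreen"
    # A different key means a separate state file, so Terraform
    # treats Version 2 as a brand new infrastructure.
    key    = "v2/terraform.tfstate"
    region = "us-west-2"
  }
}

provider "aws" {
  region = "us-west-2"
}

variable "infrastructure_version" {
  default = "2"
}
```

Because the new state file starts empty, the next plan proposes creating every resource again instead of modifying the Version 1 resources.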
As you can see, you need to modify both the infrastructure_version variable and the key of the state file in the S3 bucket. It would be nice if Terraform allowed interpolating the infrastructure_version variable in the key, but for now that’s not possible. There is an issue on GitHub, though.
Now, as you deleted the .terraform folder, you need to reinitialize the state:
> terraform init
Now modify your instances.tf with this content:
(We’ve changed the instance size from t2.micro to t2.medium. You can choose whatever you like.)
Doing a terraform plan will reveal that Terraform will, in fact, create all resources again.
> terraform plan
...
Plan: 11 to add, 0 to change, 0 to destroy.
After doing terraform apply, you should end up with an entirely new infrastructure, without changing the old one.
You can check the new resources in the AWS Console under Instances, Subnets, Security Groups and Load Balancers.
As we did previously with Version 1, point your DNS record to the new load balancer using an ALIAS.
When all traffic starts going to the new Load Balancer, it’s time to delete Version 1 of the infrastructure.
To do this, first commit all the changes in Version 2, and then check out the old version again. Delete the .terraform folder and initialize the state again.
> git add .
> git commit -m "Version 2"
> git checkout master
> rm -rf .terraform
> terraform init
Then, simply do:
> terraform destroy
Terraform will perform the following actions:
- aws_elb.terraform-blue-green
- aws_instance.terraform-blue-green[0]
- aws_instance.terraform-blue-green[1]
- aws_instance.terraform-blue-green[2]
- aws_key_pair.terraform-blue-green
- aws_security_group.terraform-blue-green
- aws_security_group_rule.terraform-blue-green-inbound
- aws_security_group_rule.terraform-blue-green-outbound
- aws_subnet.terraform-blue-green[0]
- aws_subnet.terraform-blue-green[1]
- aws_subnet.terraform-blue-green[2]
Plan: 0 to add, 0 to change, 11 to destroy.
Do you really want to destroy?
Terraform will destroy all your managed infrastructure, as shown above.
There is no undo. Only 'yes' will be accepted to confirm.
Enter a value: yes
...
Destroy complete! Resources: 11 destroyed.
Now, if you go to the AWS console you should see only the V2 resources. For example, this is a screenshot of the Instances after destroying the Version 1:
You can now merge the v2 branch into master. If you are doing this just for fun, please do a terraform destroy for v2 ;)
Remember that you can see this full example on https://github.com/santiagopoli/terraform-examples/tree/master/blue-green
In this guide, we used a DNS Record to select which infrastructure version is the production one. While this works most of the time, there are some cases when some client-side libraries cache DNS Entries, so you should wait some time to get the traffic to drain from the old balancer. You can solve this by maintaining a manually-created load balancer and changing its instances.
Terraform provides a clean and declarative way of defining Infrastructure as Code. Thanks to that, we can use it to do things that seemed impossible a few years ago.
I want to end this article by saying that this approach has a couple of downsides, and I will write an article in the future explaining how to achieve the same results by using Terraform Modules (those in fact provide better flexibility overall).
Thanks for reading!