As you may have seen, we launched the public beta of env0 last month. In the run-up to the launch, our dev team had to make sure everything about our infrastructure was ready for ongoing public use; one element of that was creating a maintenance mode for both our application and our public API.
Why do you need a maintenance mode for something like env0, which is built to be highly available? Well, whether it's human error (as in the case of AWS S3), a DDoS attack (as with GitHub), or just a major upgrade (like Zapier), even the most highly available applications from the most experienced providers sometimes need to be taken offline for a short period. And when you have to do that, you want your users to understand what is happening and still have a good experience.
Here's how our team put together our maintenance mode using Terraform, AWS, and GitHub Pages, including all the code at the end!
We host all of our infrastructure on AWS, with a clear separation between Application and API:
Because the frontend application and the API are separate, each needs its own maintenance mode, especially since our public API is used by our customers for integration in their CI/CD pipelines and other tools. Don't forget this if people access your services in multiple ways!
As with any project, the first step is to lay out requirements and constraints. In this case, we came up with the following list of what the solution should do:
After investigating, we came to a few conclusions:
Based on that, we implemented the following at each piece of the system. Each section below contains a description of what we're doing, along with a link to a gist with the actual Terraform code.
Looking at the CloudFront distribution code, you can see that we create two distributions: one for the actual application and one for a backdoor:
resource "aws_cloudfront_origin_access_identity" "origin_access_identity" {
  comment = "${aws_s3_bucket.website_bucket.bucket} access identity"
}

resource "aws_cloudfront_distribution" "website_cdn" {
  count = 2 # one for the actual application and one for the backdoor

  enabled             = true
  price_class         = "PriceClass_200"
  http_version        = "http2"
  default_root_object = "index.html"

  origin {
    origin_id   = "origin-bucket-${aws_s3_bucket.website_bucket.id}"
    domain_name = aws_s3_bucket.website_bucket.bucket_domain_name

    s3_origin_config {
      origin_access_identity = aws_cloudfront_origin_access_identity.origin_access_identity.cloudfront_access_identity_path
    }
  }

  default_cache_behavior {
    allowed_methods = ["GET", "HEAD"]
    cached_methods  = ["GET", "HEAD"]

    forwarded_values {
      query_string = false
      headers      = []

      cookies {
        forward = "none"
      }
    }

    min_ttl     = 0
    default_ttl = 86400    // 1 day
    max_ttl     = 31536000 // 1 year

    target_origin_id       = "origin-bucket-${aws_s3_bucket.website_bucket.id}"
    viewer_protocol_policy = "redirect-to-https"
    compress               = true
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  viewer_certificate {
    acm_certificate_arn      = count.index == 0 ? var.acm_certificate_arn : var.acm_certificate_arn_backdoor
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2018"
  }

  aliases = [count.index == 0 ? var.dns_name : var.backdoor_dns_name]
}
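For completeness, here is one way the input variables referenced above could be declared. The names are taken from the snippet, but the types and descriptions are our assumption, not part of the original gist:

```hcl
# Assumed declarations for the variables used by the CloudFront distributions.
variable "acm_certificate_arn" {
  type        = string
  description = "ACM certificate ARN for the main application domain"
}

variable "acm_certificate_arn_backdoor" {
  type        = string
  description = "ACM certificate ARN for the backdoor domain"
}

variable "dns_name" {
  type        = string
  description = "Public DNS name of the application, e.g. app.example.com"
}

variable "backdoor_dns_name" {
  type        = string
  description = "DNS name that always points at CloudFront, bypassing maintenance mode"
}
```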
Also in AWS, we want to ensure that Route53 is pointing in the right direction. Outside of maintenance mode it should point to the CloudFront distribution, and when in maintenance mode the CNAME should point to the GitHub Pages site. In either case, the backdoor should point to the CloudFront distribution.
Additionally, we set the TTL to 60 seconds so it won't take too long to move back and forth between maintenance mode and regular operation:
resource "aws_route53_record" "dns_record" {
  zone_id = var.aws_route53_zone_id
  name    = var.dns_name
  type    = "CNAME"
  records = [var.maintenance_mode_enabled ? "${var.github_organization}.github.io" : aws_cloudfront_distribution.website_cdn[0].domain_name]
  ttl     = 60
}
resource "aws_route53_record" "dns_backdoor_record" {
  zone_id = var.aws_route53_zone_id
  name    = var.backdoor_dns_name
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.website_cdn[1].domain_name
    zone_id                = aws_cloudfront_distribution.website_cdn[1].hosted_zone_id
    evaluate_target_health = false
  }
}
⚠️ Note that this Terraform code does not create the Route53 hosted zone or the SSL certificates; you need to set those up as appropriate for your own environment ⚠️
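If you do want Terraform to manage those prerequisites as well, a minimal sketch might look like the following. The resource names, the domain, and the provider alias are all assumptions for illustration, not part of our actual setup:

```hcl
# Look up an existing hosted zone rather than creating one.
data "aws_route53_zone" "main" {
  name = "example.com." # assumed domain
}

# An ACM certificate for the application domain. CloudFront requires
# certificates in us-east-1, so this assumes a provider alias for that region.
resource "aws_acm_certificate" "app" {
  provider          = aws.us_east_1
  domain_name       = "app.example.com" # assumed
  validation_method = "DNS"
}
```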
Next, we need a Git repo containing the HTML files for the maintenance mode site. So we created a simple Terraform configuration that creates the repo and adds every file in the "maintenance_mode_website" folder to it, which in our case is just the maintenance mode HTML file:
resource "github_repository" "maintenance-mode-repo" {
  name           = var.github_repo_name
  description    = "Maintenance mode repo for ${var.github_repo_name}"
  private        = true
  auto_init      = true
  default_branch = "master"
}

resource "github_repository_collaborator" "users_repos" {
  repository = github_repository.maintenance-mode-repo.name
  username   = var.github_username
  permission = "admin"
  depends_on = [github_repository.maintenance-mode-repo]
}

resource "github_repository_file" "maintenance-mode-files" {
  repository = github_repository.maintenance-mode-repo.name

  for_each = {
    for key, value in fileset(path.root, "maintenance_mode_website/*.*") : value => value
  }

  // There is a limitation here: the Terraform file function supports UTF-8 files,
  // so any images or other non-text files can't be included in the folder.
  file       = trimprefix(each.value, "maintenance_mode_website/")
  content    = file(each.value)
  depends_on = [github_repository_collaborator.users_repos]
}
The last part is the trickiest to implement in Terraform, because GitHub Pages configuration is not part of the GitHub Terraform provider, which means we can't configure it directly in Terraform code. However, because GitHub offers an API to configure GitHub Pages, it can be done programmatically. In our case, we use the env0 custom flows feature to trigger those API calls once the deploy is finished:
version: 1
deploy:
  steps:
    terraformOutput:
      after:
        - scripts/set_maintanance_mode.sh ${TF_VAR_github_organization} "$(terraform output github_repo_name)" "$(terraform output website_endpoint)" ${TF_VAR_github_username} ${TF_VAR_github_token}
#!/usr/bin/env bash
set -e

REPO_ORGANIZATION=$1
REPO_NAME=$2
CNAME=$3
GITHUB_USERNAME=$4
GITHUB_TOKEN=$5

if [[ ${REPO_NAME} != "" && ${CNAME} != "" ]]; then
  echo "Going to set environment to maintenance mode, repo name: ${REPO_NAME}, CNAME: ${CNAME}"
  GIT_AUTH="${GITHUB_USERNAME}:${GITHUB_TOKEN}"

  # Enable GitHub Pages on the repo, serving from the master branch
  curl --user "${GIT_AUTH}" \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Accept: application/vnd.github.switcheroo-preview+json" \
    --url "https://api.github.com/repos/${REPO_ORGANIZATION}/${REPO_NAME}/pages" \
    --data "{ \"source\": { \"branch\": \"master\" } }"

  # Set the custom domain (CNAME) for the Pages site
  curl --user "${GIT_AUTH}" \
    --request PUT \
    --header "Content-Type: application/json" \
    --url "https://api.github.com/repos/${REPO_ORGANIZATION}/${REPO_NAME}/pages" \
    --data "{ \"cname\": \"${CNAME}\", \"source\": \"master\" }"
else
  echo "No need to set environment to maintenance mode"
fi
Now that our system is configured, all we have to do is set the maintenance mode Terraform variable to true or false and deploy the environment (in our case via the env0 UI).
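If you are not using env0, the same toggle can be driven straight from the Terraform CLI. A sketch, assuming the variable name from the snippets above:

```shell
# Enter maintenance mode: DNS flips to the GitHub Pages site (the 60s TTL keeps the switch fast)
terraform apply -var="maintenance_mode_enabled=true"

# Back to normal operation: DNS points at the CloudFront distribution again
terraform apply -var="maintenance_mode_enabled=false"
```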
The complete template source code can be found in this GitHub repo, which includes all the Terraform code, the scripts, our env0.yml, and the maintenance page HTML file. We hope you find this useful, or that it sparks ideas for more ways to use Terraform in your deployment workflows. Our next blog post will give you a sneak preview of how we are creating a maintenance mode for our API using Terraform.
env0 lets your team manage their own environments in AWS, Azure and Google, governed by your policies and with complete visibility & cost management. You can learn more about env0 here and you can also try it out yourself. Feel free to drop us your thoughts below!