2024-02-16

Right after my first try of Kamal (MRSK) in the spring of 2023, I understood that an ideal use case would be running it as a GitHub Action. Almost a year passed, and my 30-line action has grown and become full-featured, configurable, and reusable. In this post, I will share the evolution of the action and the lessons learned.





Before we start, let me remind you what Kamal is. Kamal is a Ruby library created by 37signals to orchestrate the deployment of Docker containers. Before switching to Kamal, I had a bunch of scripts and technologies to deploy my applications. Kamal allowed me to simplify the deployment process and make it more reliable. Also, there were already some GitHub workflows for running lints and tests.





In this article, I will focus on deployment.





First Try

In an older article about Kamal, I posted the first version of the GitHub Action. It was a simple action that used the ruby/setup-ruby action to install Ruby, then webfactory/ssh-agent to configure the SSH agent, prepared AWS credentials, and then ran the kamal envify and kamal deploy commands.





The code is below.

name: Kamal
on:
  push:
    branches:
      - main

jobs:
  spec:
    uses: ./.github/workflows/specs.yml

  lint:
    uses: ./.github/workflows/lint_code.yml

  build_and_deploy:
    needs: [spec, lint]
    runs-on: ubuntu-latest
    timeout-minutes: 20
    outputs:
      image: ${{ steps.build.outputs.image }}
    env:
      RAILS_ENV: production
    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.event.workflow_run.head_branch }}

      - uses: webfactory/[email protected]
        with:
          ssh-private-key: ${{ secrets.SSH_PRIVATE_KEY }}

      - uses: ruby/setup-ruby@v1
        env:
          BUNDLE_GEMFILE: ./kamal/Gemfile
        with:
          ruby-version: 3.2.2
          bundler-cache: true

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          driver-opts: image=moby/buildkit:master

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
          mask-aws-account-id: 'true'

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1

      - name: Kamal Envify
        id: kamal-envify
        env:
          KAMAL_REGISTRY_PASSWORD: ${{ steps.login-ecr.outputs.docker_password_YOUR_AWS_ACCOUNT_ID_dkr_ecr_YOUR_AWS_REGION_amazonaws_com }}
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
          REDIS_URL: ${{ secrets.REDIS_URL }}
          RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}
          DOCKER_BUILDKIT: 1
          BUNDLE_GEMFILE: ./kamal/Gemfile
        run: |
          ./bin/kamal envify

      - name: Kamal Deploy
        id: kamal-deploy
        run: |
          ./bin/kamal deploy





There are several things to pay attention to. First, in the webfactory/ssh-agent step, I set up the SSH agent to be able to connect to the instances and run Docker commands. The SSH private key is stored in GitHub secrets, which is a convenient way to store sensitive data.





In the "Set up Docker Buildx" step, it was very important to download Buildkit from the repository's master branch. In the winter and spring of 2023, the released Buildkit didn't support cache manifests for AWS ECR, and I had to use the master branch to get the feature (Issue #876).





In the "Configure AWS credentials" and "Login to Amazon ECR" steps, I configured AWS credentials and logged in to Amazon ECR. Long-lived access keys are not the most secure way to log in to AWS, but it was the easiest way to do it at that time.

This setup had several drawbacks.





First and most important, the action was not reusable. To configure deployment to another environment, I had to copy-paste the action completely.

Besides that, the action didn’t provide the ability to run deploy without running specs and lints. It was not a big deal, but sometimes I wanted to run deploy separately.

The action didn’t provide the ability to deploy from a branch other than main.

Sometimes I wanted to restart the Traefik container. This could be done from the local machine, but for me, it would be better to do it from the action.





Action’s File Structure

We incrementally improved the action, added new features, and made it more configurable. Finally, we arrived at a set of actions that can be used in almost any project.





For a better understanding of the changes, let’s look at the directory structure of the actions first.

.github/
├─ workflows/
│  ├─ build_deploy/
│  │  ├─ action.yaml
│  ├─ pre-build/
│  │  ├─ action.yaml
│  ├─ 01.build_deploy_production.yaml
│  ├─ 02.build_deploy_staging.yaml
│  ├─ 03.database_backup.yaml
│  ├─ 04.build_deploy_manually.yaml
│  ├─ 05.validate_pull_request.yaml
│  ├─ 06.kamal_run_command.yaml
│  ├─ _lint.yaml
│  ├─ _specs.yaml





You may notice two directories, build_deploy and pre-build, each with an action.yaml file inside. These are composite actions that include all the necessary steps to build and deploy the application. Also, some workflows are named with a leading underscore. These are reusable workflows that consist of several jobs and steps.





How are Composite Actions different from Reusable Workflows?

Composite Actions allow you to bundle multiple existing workflow steps into a single action.

A Composite Action stored in the repository cannot be used without checking out the repo first, while a Reusable Workflow can be called without a checkout.

A Reusable Workflow can include multiple jobs and multiple steps within those jobs. A Composite Action, however, contains only steps, and they all run inside the single job that calls the action.

A Reusable Workflow can use Secrets by declaring them as workflow parameters or by inheriting them from the caller. A Composite Action cannot read Secrets directly, so every secret value has to be passed to it as an ordinary input.
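
To make the distinction more concrete, here is a minimal sketch of what a reusable workflow such as _specs.yaml could look like. The real file is not shown in this article, so the job name, the declared secret, and the test command are assumptions made only for illustration.

```yaml
# .github/workflows/_specs.yaml (hypothetical contents, for illustration only)
name: Specs

on:
  workflow_call:              # marks the workflow as callable from other workflows
    secrets:
      RAILS_MASTER_KEY:       # assumed secret name, declared as a workflow parameter
        required: false

jobs:
  rspec:                      # a reusable workflow defines whole jobs, not just steps
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true
      - name: Run specs
        env:
          RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}
        run: bundle exec rspec   # assumed test command
```

A caller references such a file at the job level with uses: ./.github/workflows/_specs.yaml, whereas a composite action is referenced at the step level, inside a job that has already checked out the repository.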





Files with numeric prefixes are the main workflow definitions that call the reusable workflows and composite actions. They cover the most common use cases for a modern Rails project: deploy to production and staging, database backup, manual deploy, pull request validation, and Kamal command execution.

Pre Build Action

Let’s start with the pre-build action. It is a composite action that includes all the necessary steps to prepare the environment for the build and deploy action.





The file is below.

# pre-build/action.yaml
name: Pre-Build

# Input names use hyphens so that they match the inputs.* references below
# and the keys passed by the calling workflows.
inputs:
  database-url:
    type: string
  redis-url:
    type: string
  rails-master-key:
    type: string
  aws-role-access:
    type: string
  ssh-private-key:
    type: string
  environment:
    type: string

runs:
  using: composite
  steps:
    - uses: webfactory/[email protected]
      with:
        ssh-private-key: ${{ inputs.ssh-private-key }}

    - uses: ruby/setup-ruby@v1
      env:
        BUNDLE_GEMFILE: ./Gemfile
      with:
        ruby-version: .ruby-version
        bundler-cache: true

    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3

    - name: aws-cred-configure
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: ${{ inputs.aws-role-access }}
        role-session-name: samplerolesession
        aws-region: us-east-1
        mask-aws-account-id: 'true'

    - name: login-to-aws-ecr
      id: login-ecr
      uses: aws-actions/amazon-ecr-login@v2
      with:
        mask-password: 'true'

    - name: Kamal Envify
      shell: bash
      id: kamal-envify
      env:
        # AWS_ACCOUNT_ID and the region in the output name are placeholders;
        # use your own account ID and the region of your ECR registry.
        KAMAL_REGISTRY_PASSWORD: ${{ steps.login-ecr.outputs.docker_password_AWS_ACCOUNT_ID_dkr_ecr_eu_west_2_amazonaws_com }}
        DATABASE_URL: ${{ inputs.database-url }}
        REDIS_URL: ${{ inputs.redis-url }}
        RAILS_MASTER_KEY: ${{ inputs.rails-master-key }}
        DOCKER_BUILDKIT: 1
      run: |
        ./bin/kamal envify --destination=${{ inputs.environment }}





Let’s walk through the file step by step. The inputs section at the top describes the inputs of the action and is not interesting to us. The first step is to set up an SSH agent, and it is the same as in the first version of the action. However, as you may see, it uses ssh-private-key from the action inputs, which allows us to use the action in different environments. After that, I set up Ruby, and it is also the same as in the first version of the action.





The Docker Buildx setup step differs from the initial workflow definition. It has been updated to version 3 and now supports AWS ECR image cache out of the box, so we don’t need to define driver-opts anymore.





The next step is to configure AWS credentials. It is completely different from what I initially had. Instead of access-key-id and secret-access-key authentication, I switched to OIDC role-based authentication, which is more secure and is the approach GitHub recommends. If you need a more detailed explanation of how to configure OpenID Connect in AWS, I suggest you read this excellent guide. After the authentication is done, I log in to Amazon ECR in the login-to-aws-ecr step.
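
One detail that is easy to miss: for the OIDC exchange to work, the calling workflow has to be allowed to request an ID token. Below is a minimal sketch of the relevant parts, assuming a role that already trusts GitHub's OIDC provider. The job name and the role ARN are made-up placeholders, not values from my setup.

```yaml
# Minimal sketch of OIDC-based AWS authentication in a workflow.
permissions:
  id-token: write   # lets the job request an OIDC token from GitHub
  contents: read    # still needed so actions/checkout can read the repo

jobs:
  deploy:           # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # placeholder ARN; the role must trust GitHub's OIDC provider
          role-to-assume: arn:aws:iam::123456789012:role/example-github-deploy
          aws-region: us-east-1
```

No long-lived AWS keys are stored in GitHub secrets in this variant. The job receives short-lived credentials for the assumed role instead.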





The final step is to run kamal envify, which prepares environment variables for the deployment. The command is the same as in the first version of the action, but I added the --destination flag to the command, which allows me to deploy to different environments.
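
For readers who have not used destinations yet: as far as I understand Kamal 1.x, the --destination flag makes Kamal merge a destination-specific file such as config/deploy.staging.yml over the base config/deploy.yml. The sketch below shows what such a file might contain. The host address and the variable values are placeholders, not my real configuration.

```yaml
# config/deploy.staging.yml (hypothetical example)
# Only the values that differ from config/deploy.yml need to be listed here.
servers:
  web:
    hosts:
      - 192.0.2.10        # placeholder address from the documentation range

env:
  clear:
    RAILS_ENV: staging    # assumed value, for illustration
  secret:
    - DATABASE_URL
    - REDIS_URL
    - RAILS_MASTER_KEY
```

With this layout, the same composite action can serve every environment, because the only thing that changes between runs is the value of the environment input.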





Build & Deploy Action

The next composite action is defined in the build_deploy folder, and it is relatively simple.

# build-deploy/action.yml
name: Build & Deploy

inputs:
  environment:
    type: string

runs:
  using: composite
  steps:
    - name: Kamal Deploy
      shell: bash
      id: kamal-deploy
      run: |
        ./bin/kamal deploy --destination=${{ inputs.environment }}

    - name: Kamal Release
      shell: bash
      if: ${{ cancelled() }}
      run: |
        ./bin/kamal lock release --destination=${{ inputs.environment }}





Since all preparations are done in pre-build , when this action starts, I am ready to run the kamal deploy command with the selected environment.





Kamal creates a lock file before starting the build and deployment process. Usually, the lock is released when the deployment is finished, but if the deployment is canceled, the lock is not released, and the next workflow run will fail. To avoid this, I added the kamal lock release command to the action. Later in this article, we will use this small hack to handle concurrent deployments correctly.

Workflow Definitions

The main workflow definitions are very simple. They just call the composite action together with reusable workflows and pass the necessary parameters. Below is an example of the 01.build_deploy_production.yaml file.





# 01.build_deploy_production.yaml
name: 01. Build & Deploy Production

permissions:
  id-token: write
  contents: read

on:
  release:
    types: [published]

jobs:
  spec:
    uses: ./.github/workflows/_specs.yaml
    secrets: inherit

  lint:
    uses: ./.github/workflows/_lint_code.yaml
    secrets: inherit

  build_and_deploy:
    name: build-deploy-production
    concurrency:
      group: production_environment
      cancel-in-progress: true
    environment:
      name: production
      url: https://onetribe.team
    needs:
      - spec
      - lint
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.workflow_run.head_branch }}

      - name: Pre Build
        uses: ./.github/workflows/pre-build
        with:
          database-url: ${{ secrets.DATABASE_URL_PRODUCTION_ADMIN }}
          redis-url: ${{ secrets.REDIS_URL_PRODUCTION_ADMIN }}
          rails-master-key: ${{ secrets.RAILS_MASTER_KEY }}
          aws-role-access: ${{ secrets.AWS_ROLE_ACCESS }}
          ssh-private-key: ${{ secrets.SSH_PRIVATE_KEY }}
          environment: production

      - name: Build & Deploy
        uses: ./.github/workflows/build-deploy
        with:
          environment: production





The concurrency section of the build_and_deploy job deserves attention. It allows only one deployment to run at a time. GitHub has a great documentation section that covers all possible use cases. If a deployment is already running when a new one starts, the running one is canceled, and its "Kamal Release" step is executed. This is an essential feature because it allows me to avoid concurrent deployments and handle them correctly.
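
Stripped of everything else, the interplay between the concurrency group and the lock release looks roughly like this. It is a simplified sketch rather than one of my real workflow files, and the job name is an assumption.

```yaml
jobs:
  deploy:                             # hypothetical job name
    runs-on: ubuntu-latest
    concurrency:
      group: production_environment   # every job sharing this group is serialized
      cancel-in-progress: true        # a newer run cancels the one still running
    steps:
      - name: Deploy
        run: ./bin/kamal deploy --destination=production

      # Runs only when this job is being cancelled, for example by a newer
      # run in the same concurrency group, so the next deploy is not blocked.
      - name: Release the Kamal lock
        if: ${{ cancelled() }}
        run: ./bin/kamal lock release --destination=production
```

With cancel-in-progress set to false, the running job would be allowed to finish and the newer run would wait as pending instead.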





The staging deploy is defined in the 02.build_deploy_staging.yaml file and is similar to production, except for the event that starts the workflow: for the staging deploy I use a push event on the staging Git branch instead of the release event.





# 02.build_deploy_staging.yaml
name: 02. Build Staging

permissions:
  id-token: write
  contents: read

on:
  push:
    branches:
      - staging

# ...





In this article, I will not cover the database backup workflow defined in the 03.database_backup.yaml file, because it is not related to the theme of the article. However, let’s look at the 04.build_deploy_manually.yaml, 05.validate_pull_request.yaml, and 06.kamal_run_command.yaml files.





The 04.build_deploy_manually.yaml file is below.

# 04.build_deploy_manually.yaml
name: 04. Deploy Manually

permissions:
  id-token: write
  contents: read

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment'
        required: true
        default: 'staging'
        type: choice
        options:
          - production
          - staging

jobs:
  build-production:
    name: deploy-production
    concurrency:
      group: production_environment
      cancel-in-progress: true
    environment:
      name: production
      url: https://onetribe.team
    if: ${{ github.event.inputs.environment == 'production' }}
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.workflow_run.head_branch }}

      - name: Pre Build
        uses: ./.github/workflows/pre-build
        with:
          database-url: ${{ secrets.DATABASE_URL_PRODUCTION_ADMIN }}
          redis-url: ${{ secrets.REDIS_URL_PRODUCTION_ADMIN }}
          rails-master-key: ${{ secrets.RAILS_MASTER_KEY }}
          aws-role-access: ${{ secrets.AWS_ROLE_ACCESS }}
          ssh-private-key: ${{ secrets.SSH_PRIVATE_KEY }}
          environment: production

      - name: Build and Deploy
        uses: ./.github/workflows/build-deploy
        with:
          environment: production

  build-staging:
    name: deploy-staging
    concurrency:
      group: staging_environment
      cancel-in-progress: true
    # ...
    # staging deploy is similar to production, described above and I will not show it completely.





Pull request validation is defined in the 05.validate_pull_request.yaml file; it is the smallest and simplest workflow.





It runs specs and lints, is triggered by the pull_request event, and can also be triggered manually.

# 05.validate_pull_request.yaml
name: 05. Validate Pull Request

permissions:
  id-token: write
  contents: read

on:
  pull_request:
  workflow_dispatch:

jobs:
  spec:
    uses: ./.github/workflows/_specs.yaml
    secrets: inherit

  lint:
    uses: ./.github/workflows/_lint_code.yaml
    secrets: inherit





The last workflow that I want to cover in this article is the 06.kamal_run_command.yaml file. Sometimes I need to restart the Traefik container or maybe start or stop accessory containers. I can do it from the local machine, but this requires environment setup and is not always convenient.





This workflow allows me to run any command from the list of predefined commands.

name: 06. Kamal run command

permissions:
  id-token: write
  contents: read

on:
  workflow_dispatch:
    inputs:
      command:
        description: 'Commands'
        required: true
        type: choice
        options:
          - traefik reboot --rolling
          - accessory reboot pg_hero
      environment:
        description: 'Environment'
        required: true
        type: choice
        options:
          - staging
          - production

jobs:
  kamal_run_command:
    name: Kamal run command
    runs-on: ubuntu-latest
    timeout-minutes: 20
    concurrency:
      group: ${{ github.event.inputs.environment }}_environment
      cancel-in-progress: false
    environment:
      name: ${{ github.event.inputs.environment }}
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.workflow_run.head_branch }}

      - uses: ./.github/workflows/pre-build
        name: Pre Build
        with:
          database-url: ${{ github.event.inputs.environment == 'production' && secrets.DATABASE_URL_PRODUCTION || secrets.DATABASE_URL_STAGING }}
          redis-url: ${{ github.event.inputs.environment == 'production' && secrets.REDIS_URL_PRODUCTION || secrets.REDIS_URL_STAGING }}
          rails-master-key: ${{ secrets.RAILS_MASTER_KEY }}
          aws-role-access: ${{ secrets.AWS_ROLE_ACCESS }}
          ssh-private-key: ${{ secrets.SSH_PRIVATE_KEY }}
          environment: ${{ github.event.inputs.environment }}

      - name: kamal ${{ github.event.inputs.command }} --destination=${{ github.event.inputs.environment }}
        run: |
          ./bin/kamal ${{ github.event.inputs.command }} --destination=${{ github.event.inputs.environment }}
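
The last step of this workflow interpolates the dispatch inputs straight into the shell script. The choice lists keep the values under control here, but a slightly more defensive variant, which I am sketching as an option rather than something I run, passes the inputs through environment variables first. The variable names KAMAL_COMMAND and KAMAL_DESTINATION are my own invention.

```yaml
      # Alternative final step: the inputs reach the shell as environment
      # variables instead of being pasted into the script text.
      - name: Run Kamal command
        env:
          KAMAL_COMMAND: ${{ github.event.inputs.command }}          # e.g. "traefik reboot --rolling"
          KAMAL_DESTINATION: ${{ github.event.inputs.environment }}
        run: |
          # KAMAL_COMMAND is left unquoted on purpose so that the shell
          # splits it into the subcommand and its flags.
          ./bin/kamal $KAMAL_COMMAND --destination="$KAMAL_DESTINATION"
```

The behaviour for the predefined commands should be the same. The difference only matters if a free-text input is ever added to the workflow.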





Conclusion

The action has grown from a simple 30-line action to a set of reusable workflows and composite actions. It is now full-featured, configurable, and reusable. It allows me to deploy to different environments, deploy without running specs and lints, and restart Traefik and accessory containers.





I have used this or a similar setup of workflows for about six months, and what can I say? It covers all my needs and can be easily adapted for any new features. I hope this article will help you to build your own actions and workflows. If you have any questions, feel free to ask me in the comments.





