I have a confession to make… I commit to master.

Written by patrickleet | Published 2017/11/04
Tech Story Tags: docker | continuous-integration | continuous-deployment | continuous-delivery | nodejs | latest-tech-stories | dockerfile | commit-locally

TLDR: Patrick Lee Scott uses Docker Compose and Docker multi-stage builds to build his own Continuous Integration pipeline, which allows him and his team members to commit directly to master. Every commit is linted, small errors are automatically fixed, the code is thoroughly tested, and coverage thresholds are enforced. The build runs in a brand new container, ensuring no global dependencies are forgotten. Tests are then executed against the finished container to prove it integrates properly with dependencies such as databases, message queues, and other services (integration tests). If all goes well, the code is pushed to GitHub, where a Continuous Deployment server picks it up and finishes the job.

I used to preach about Git Flow to keep my code releasable, rollback-able, and keep a clean history. But not anymore — now, bad code doesn’t make it into my codebase.
This is because I have a robust continuous deployment pipeline.
Having this pipeline allows me and my team members to commit directly to master.
I can hear the pitchforks being sharpened already. “NEVER commit to master!”
I know, I know, I used to say the same thing. “You need to have branches for keeping your code organized, and be able to review it!” But like, who wants to do that? Automate it!
Imagine it. Every commit that you make is linted, small errors are automatically fixed, your code is thoroughly tested, and your coverage thresholds are enforced. The process is completed in a brand new container, ensuring no global dependencies are forgotten. Then, tests are executed against the finished container to prove it integrates properly with dependencies such as databases, message queues, and any other services (integration tests). If all goes well, the code is successfully pushed to GitHub, where my Continuous Deployment server picks it up and finishes the job.
If it makes it to deployment, I’m positive it works. It’s been well tested.
I’m sure you’re thinking at this point “sounds like a lot of work”.
Luckily for you, I’ve put in months of trial and error, and now I’d like to present to you the process and techniques I use day-to-day to confidently commit to master. By the end of this post you will have all of the skills required to build your own production-optimized, Dockerized Continuous Integration process, and run it on every commit by making use of Docker Compose and Docker multi-stage builds. master will always be safe to deploy.
NOTE: If you’re a Node.js engineer, you’re in luck! Examples are for Node. For everyone else, the concepts and techniques are the same. You should be able to extrapolate how the process should work for your languages of choice.
But first, a little about the different stages of Continuous deployment.

Continuous Integration vs. Continuous Delivery vs. Continuous Deployment

Often these terms are used interchangeably, but that is not correct. In the chart below I’ve defined which tasks fit into which areas.
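Roughly, as a summary based on the stages described later in this post:
- Continuous Integration: lint, unit test, and build the code, build the container, and run integration/staging tests against it.
- Continuous Delivery: push the tested, production-optimized image to a registry so it is ready to deploy at any time.
- Continuous Deployment: automatically deploy every passing commit on master to production, typically via an orchestrator.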
For a while now, I’ve been advocating usage of the docker-compose builder pattern for continuous integration pipelines. With the advent of Docker multi-stage builds, however, it’s now easier to get smaller and more efficient containers.
Docker is, essentially, an isolated environment for your code to run in. Just like you would provision a server, you provision a Docker container. Most popular frameworks and software have builds available on Docker Hub. Since we are using Node, we need an environment that runs node. We’ll start our Dockerfile with that.
# Dockerfile
FROM node:9-alpine AS build
Note the AS directive. This signals that this is not the final stage of the Dockerfile. Later on we can COPY artifacts out of this stage into our final container. Let’s move on.
For complication’s sake, let’s say we are using a library which requires node-gyp to install dependencies properly, because it needs to compile native C++ binaries for the OS you are running on. In most cases you won’t need this, but some popular libraries, like redis, require it.
# Dockerfile continued
# optionally install gyp tools
RUN apk add --update --no-cache \
    python \
    make \
    g++
That’s probably about as complicated as a node environment will get.
Less than a year ago, I would have told you to build this image and push it to a Docker registry to use as a base image for other node-gyp related builds, and in fact, I did. With the advent of Docker multi-stage builds, however, that extra step is no longer necessary. In fact, I would say it’s no longer recommended, because it requires keeping multiple images up to date. Instead, let’s just continue on with our pipeline, and make use of multi-stage builds to productionize our build later on. First, we need to define what our pipeline is in the context of our application.

Defining the Pipeline

Again, to have a realistic example, I’ll assume we are using Babel as a preprocessor, ESLint as a linter, and Jest as a testing tool. However, the pipeline runs just by calling npm scripts, so it should be pretty easy to substitute the tools you are using, like TypeScript.
Here is a sample scripts section of a package.json file using those tools:
// package.json
"scripts": {
  "start": "nodemon src/index.js --watch src node_modules --exec babel-node",
  "build": "babel src -d dist",
  "serve": "node dist/index.js",
  "lint": "eslint src __tests__",
  "lint:fix": "eslint --fix src __tests__",
  "test": "NODE_ENV=test jest --config jest.json --coverage",
  "test:staging": "jest --config jest.staging.json --runInBand",
  "test:watch": "NODE_ENV=test jest --config jest.json --watch --coverage",
}
In our CI process we want to cover linting, testing, and building our app, as well as building the container and testing it using our staging tests. By continuing on in our Dockerfile, we can cover everything except the staging tests, which require the built container as input.
# Dockerfile continued 
ADD . /src
WORKDIR /src
RUN npm install
RUN npm run lint
RUN npm run test
RUN npm run build
RUN npm prune --production
So, we are ADDing our source code into the container, to a folder called /src, and then changing our WORKDIR to that /src directory, which now contains our code. Next, we are simply running the appropriate npm scripts to install dependencies, lint our code, test our code, compile our code with build, and then remove devDependencies with npm prune --production.
Before we continue, I want to talk about the test step a little more, as it is also set up to measure coverage because we used the --coverage flag. We also passed in a jest.json file as a config. This is where coverage thresholds are defined.
// jest.json
{
  "testEnvironment": "node",
  "modulePaths": [
    "src",
    "/node_modules/"
  ],
  "coverageThreshold": {
    "global": {
      "branches": 100,
      "functions": 100,
      "lines": 100,
      "statements": 100
    }
  },
  "collectCoverageFrom" : [
    "src/**/*.js"
  ]
}
If you wanted to maintain 90% coverage, you would decrease each option marked as 100 to 90. The tests will fail if the coverage threshold is not met.
If you want to format your code automatically the exact same way I do, here’s the .eslintrc file I use with ESLint. And, just to make your life easier, the .babelrc file I use with Babel.
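As a rough sketch only (hypothetical minimal configs, not the exact files linked above), a Babel 6-era setup for this kind of project might look something like this:
// .babelrc (hypothetical, assumes babel-preset-env is installed)
{
  "presets": ["env"]
}
// .eslintrc (hypothetical minimal config)
{
  "extends": "eslint:recommended",
  "parserOptions": { "ecmaVersion": 2017, "sourceType": "module" },
  "env": { "node": true, "es6": true, "jest": true }
}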

Second Stage of Build

Our Dockerfile up until this point starts with a new Node environment on Alpine Linux, optionally installs node-gyp tools, then adds, lints, tests, and compiles our code, and finally prunes away development dependencies. What we have left are all of the artifacts we need for a production build. Unfortunately, we are also left with additional bloat from the tools we needed to get this far. We will use a multi-stage build to copy only the artifacts we need into our final productionized container, using COPY --from=build.
# Dockerfile continued
FROM node:9-alpine
# install curl for healthcheck
RUN apk add --update --no-cache curl
ENV PORT=3000
EXPOSE $PORT
ENV DIR=/usr/src/service
WORKDIR $DIR
# Copy files from build stage
COPY --from=build /src/package.json package.json
COPY --from=build /src/package-lock.json package-lock.json
COPY --from=build /src/node_modules node_modules
COPY --from=build /src/dist dist
HEALTHCHECK --interval=5s \
            --timeout=5s \
            --retries=6 \
            CMD curl -fs http://localhost:$PORT/_health || exit 1
CMD ["node", "dist/index.js"]
This completes our Dockerfile. The final size in my case is a 28MB container, which has my production node_modules and my dist folder of Babel-compiled JavaScript source code. To run it, simply use vanilla node. I’ve also defined a health check that a scheduler like Docker Swarm can use to ensure the container is healthy. curl may not be the most efficient healthcheck, but it’s a good starting point.
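The healthcheck assumes the service exposes a /_health route that returns a 200. A minimal sketch of such a route, assuming Express (the web framework is an assumption, it isn’t specified here):
// src/index.js (hypothetical sketch, assuming Express)
const express = require('express')

const app = express()

// return 200 so `curl -fs http://localhost:$PORT/_health` succeeds
app.get('/_health', (req, res) => res.status(200).send('ok'))

app.listen(process.env.PORT || 3000)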
Here’s the full Dockerfile in one place for all of your copy and pasting needs!
FROM node:9-alpine AS build
# install gyp tools
RUN apk add --update --no-cache \
        python \
        make \
        g++
ADD . /src
WORKDIR /src
RUN npm install
RUN npm run lint
RUN npm run test
RUN npm run build
RUN npm prune --production
FROM node:9-alpine
RUN apk add --update --no-cache curl
ENV PORT=3000
EXPOSE $PORT
ENV DIR=/usr/src/service
WORKDIR $DIR
COPY --from=build /src/package.json package.json
COPY --from=build /src/package-lock.json package-lock.json
COPY --from=build /src/node_modules node_modules
COPY --from=build /src/dist dist
HEALTHCHECK --interval=5s \
            --timeout=5s \
            --retries=6 \
            CMD curl -fs http://localhost:$PORT/_health || exit 1
CMD ["node", "dist/index.js"]
Now, simply running:
docker image build -t your-image-name .
will run 80% of our CI pipeline. The next step is testing the container with its integrations.

Integration Tests

For integration tests, we need to run other software, like databases, message queues, or other services within our system, and test that they work together – that they integrate. Because this task requires running multiple images together, we will instead use docker-compose, which is suited to this type of task.
Again, trying to stick with realistic instead of oversimplified examples, here is the docker-compose file I use for testing a more complicated microservice in an architecture based on CQRS and Event Sourcing, which has dependencies on redis, mongodb, and rabbitmq.
version: '2'
services:
  staging-deps:
    image: your-image-name
    environment:
      - NODE_ENV=production
      - PORT=3000
      - RABBITMQ_URL=amqp://rabbitmq:5672
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - MONGO_URL=mongodb://mongo:27017/inventory
      - DEBUG=servicebus*
    networks:
      - default
    depends_on:
      - redis
      - rabbitmq
      - mongo
  rabbitmq:
    image: rabbitmq:3.6-management
    ports:
      - 15672:15672
    hostname: rabbitmq
    networks:
      - default
  redis:
    image: redis
    networks:
      - default
  mongo:
    image: mongo
    ports:
      - 27017:27017
    networks:
      - default
  staging:
    image: node:8-alpine
    volumes:
      - .:/usr/src/service
    working_dir: /usr/src/service
    networks:
      - default
    environment:
      - apiUrl=http://staging-deps:3000
      - RABBITMQ_URL=amqp://rabbitmq:5672
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - MONGO_URL=mongodb://mongo:27017/inventory
      - DEBUG=$DEBUG
    command: npm run test:staging
  clean:
    extends:
      service: staging
    command: rm -rf node_modules
  install:
    extends: 
      service: staging
    command: npm install
Pay particular attention to the staging-deps and staging services. staging-deps is running the image that was produced by the docker image build command we ran earlier. This happens by setting image: to the tag we set in the build command with -t. We are passing it a bunch of environment variables to let our service know how to connect to the different containers running in our network, which is the default network. Each docker-compose file can define networks, and has a default network by default.
Docker also handles service discovery through its Software Defined Networks (SDNs), so a hostname will resolve to the IP address of the container with the same name on the network. For example, in MONGO_URL=mongodb://mongo:27017/inventory, mongo will resolve to the mongo container in the SDN. Lastly, depends_on tells Docker to start the depended-on containers first when bringing them up.
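As a minimal sketch of what that looks like from inside the service (assuming the official mongodb driver; the service’s actual code isn’t shown here), the hostname in MONGO_URL is simply the compose service name:
// connect.js (hypothetical example, mongodb driver 3.x-style API)
const { MongoClient } = require('mongodb')

// `mongo` resolves to the mongo container through Docker's SDN
const url = process.env.MONGO_URL || 'mongodb://mongo:27017/inventory'

MongoClient.connect(url)
  .then(client => {
    console.log('connected to', url)
    return client.close()
  })
  .catch(err => {
    console.error('mongo connection failed', err)
    process.exit(1)
  })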
staging is a container based on node, with our source code mounted into it using the volumes directive. The integration tests will be run from this container.
Additionally, there are two more "services" that we will also run as scripts: clean and install. clean ensures that we are using the correct versions of node_modules, since we may be switching between developing on our host OS, such as Mac, and Alpine Linux, which our containers are based on.
First we will bring up the dependencies:
docker-compose -f docker-compose.staging.yml up -d staging-deps
Depending on whether or not your service accounts for retrying connections, you may want to simply start certain containers first and sleep while they become ready. Alternatively, scripts like wait-for-it.sh can be used to accomplish this with a shell script.
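For example, assuming wait-for-it.sh has been copied into the repository (a sketch, not part of the setup above), the staging command could wait for RabbitMQ before kicking off the tests:
# hypothetical wrapper around the staging test command
# waits up to 30 seconds for rabbitmq:5672 to accept connections
./wait-for-it.sh rabbitmq:5672 --timeout=30 -- npm run test:staging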
Once your dependencies are up and running, you’ll need to run the staging tests. We’ve mounted the code into the container, but haven’t installed npm dependencies on it. Run the following to install dependencies for the Linux container, and then run the staging tests.
docker-compose -f docker-compose.staging.yml run --rm install
docker-compose -f docker-compose.staging.yml run --rm staging
If your integration/staging tests pass, you’ll be confident you can continue on with the delivery and deployment stages! But first, let’s automate running our pipeline, so we can run it on each commit.
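For reference, a staging test in this setup can be as simple as hitting the running staging-deps container over the network. A minimal sketch (a hypothetical test file; the apiUrl variable and /_health route come from the examples above):
// __tests__/staging/health.test.js (hypothetical example)
const http = require('http')

// apiUrl points at the staging-deps container, e.g. http://staging-deps:3000
const apiUrl = process.env.apiUrl

const get = url => new Promise((resolve, reject) => {
  http.get(url, res => resolve(res.statusCode)).on('error', reject)
})

test('service responds to its health check', async () => {
  const statusCode = await get(`${apiUrl}/_health`)
  expect(statusCode).toBe(200)
})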

Automating the pipeline

First, let’s create a Makefile. Our goal is to be able to run make ci to run the whole CI process.
ci:
  make docker-build \
     clean \
     install \
     staging \
     staging-down
docker-build:
  docker build -t your-image-name .
clean:
  docker-compose -f docker-compose.staging.yml run --rm clean
install:
  docker-compose -f docker-compose.staging.yml run --rm install
staging:  
  docker-compose -f docker-compose.staging.yml up -d staging-deps
  docker-compose -f docker-compose.staging.yml run --rm staging
staging-down:
  docker-compose -f docker-compose.staging.yml down
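Two Make details worth keeping in mind: recipe lines must be indented with real tab characters, and because none of these targets produce files of the same name, it’s safer to declare them phony so a stray file called clean or install can’t short-circuit the pipeline:
# add near the top of the Makefile
.PHONY: ci docker-build clean install staging staging-down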

Running on every commit

I run the pipeline on every commit locally, using an npm package called husky, and again on my CI server, which is typically Jenkins 2.0 Pipelines.
To run locally, let’s start by installing husky to our devDependencies by running:
npm i --save-dev husky
Husky makes it super simple to install and run git hooks. Simply open up your package.json and add the following two scripts:
"scripts": {
  // ...
  "precommit": "npm run lint:fix && npm run test",
  "prepush": "make ci"
},
Now on every commit, on every developer’s machine, husky will fix any linting errors that can be automatically fixed and run the unit tests. Our tests are configured to require 100% coverage, so if any tests fail, or coverage thresholds are not met, the commit will be blocked. Additionally, on every attempt to push, the entire CI process we defined will be run. This is useful for catching any deps that may have been installed locally but not saved to package.json, and it ensures that everything passes when running on the OS it will be hosted on in production, as well as testing the built container with the staging tests.
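On the CI server side, the same entry point keeps the configuration small. As a rough sketch (a hypothetical declarative pipeline, not the actual Jenkinsfile from this setup), the CI stage can simply call make ci:
// Jenkinsfile (hypothetical sketch)
pipeline {
  agent any
  stages {
    stage('CI') {
      steps {
        // same entry point developers run locally via the prepush hook
        sh 'make ci'
      }
    }
  }
}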

Conclusion

At this point you have everything you need to run an entire Dockerized Continuous Integration process on every commit, and prevent bad code from entering your code base. But this is far from the end of the journey. Follow me for more posts about similar topics!
As for next steps:
1. Improving Continuous Integration
Build from this point to customize for your needs. If you are making an app, it might make sense to also run end-to-end tests with Selenium, or one of its flavors. If it’s a high-load site, add stress testing to the pipeline. Any tests like these belong in the integration phase.
2. Continuous Delivery 
Once you are building productionized Docker containers, it’d be a shame not to put them in a registry. There are paid solutions for this, as well as self-hosted registries. What works best for you depends on your situation. I typically just pay for private repos on Docker Hub and call it a day.
3. Continuous Deployment
This is where the magic truly begins to happen. When you have fully tested, production-optimized containers deploying themselves on every commit to master, it’s truly zen, especially with modern microservice architectures. Typically, we add a new service for a new feature, rather than a branch, which makes you miss branches even less. For Continuous Deployment servers, I prefer Jenkins 2.0 Pipelines with Blue Ocean.
Jenkinsfiles allow the engineers of each product to define their own CD pipelines, and Docker Stack files allow developers to define how their services will run in production. A CD process might also involve updating a proxy, such as HAProxy or nginx.
4. Running in Production
This is what orchestrators are for. These are tools like Docker Swarm, Kubernetes, and Mesos. They work very similarly to the scheduler on the computer you are using now, which schedules processes and allocates resources on your machine. Instead, they schedule and reserve resources across a cluster of machines, letting you control them as if they were one.
Interested in hearing MY DevOps Journey, WITHOUT useless AWS Certifications? Read it now on HackerNoon.
Thanks so much for reading! Please smash that clap button and share if you found this post helpful!
Make sure to follow me for more posts on related topics! :)
Changelog
11/21/2017, 9:11 PM

“Refactored” the introduction and conclusion to address comments and questions, and improve clarity.
Thanks to Wyatt McBain.
