Patrick Lee Scott


I have a confession to make… I commit to master.

I used to preach Git Flow as the way to keep my code releasable, rollback-able, and its history clean. Not anymore: now, bad code simply doesn’t make it into my codebase.

This is because I have a robust continuous deployment pipeline.

Having this pipeline allows me and my team members to commit directly to master.

I can hear the pitchforks being sharpened already. “NEVER commit to master!”

I know, I know, I used to say the same thing. “You need to have branches for keeping your code organized, and be able to review it!” But like, who wants to do that? Automate it!

Imagine it. Every commit that you make is linted, small errors are automatically fixed, your code is thoroughly tested, and your coverage thresholds are enforced. The process runs in a brand new container, ensuring no global dependencies are forgotten. Then, tests are executed against the finished container to prove it integrates properly with dependencies such as databases, message queues, and any other services (integration tests). If all goes well, the code is pushed to GitHub, where my Continuous Deployment server picks it up and finishes the job.

If it makes it to deployment, I’m positive it works. It’s been well tested.

I’m sure you’re thinking at this point: “Sounds like a lot of work.”

Luckily for you, I’ve put in months of trial and error, and now I’d like to present the process and techniques I use day-to-day to confidently commit to master. By the end of this post you will have all of the skills required to build your own production-optimized, Dockerized Continuous Integration process, and run it on every commit by making use of Docker Compose and Docker multi-stage builds.

master will always be safe to deploy.

NOTE: If you’re a Node.js engineer, you’re in luck! Examples are for Node. For everyone else, the concepts and techniques are the same. You should be able to extrapolate how the process should work for your languages of choice.

But first, a little about the different stages of Continuous deployment.

Continuous Integration vs. Continuous Delivery vs. Continuous Deployment

Often these terms are used interchangeably, but they describe distinct stages. Continuous Integration covers building, linting, and testing every commit. Continuous Delivery adds packaging each passing build into a deployable artifact, such as a Docker image pushed to a registry. Continuous Deployment goes one step further and automatically releases every passing build to production.

For a while now, I’ve been advocating the docker-compose builder pattern for continuous integration pipelines. With the advent of Docker multi-stage builds, however, it’s now easier to produce smaller, more efficient containers.

Docker is, essentially, an isolated environment for your code to run in. Just like you would provision a server, you provision a Docker container. Most popular frameworks and software have builds available on Docker Hub. Since we are using Node, we need an environment that runs Node. We’ll start our Dockerfile with that.

# Dockerfile
FROM node:9-alpine AS build

Note the AS directive. This signals that this is not the final stage of the Dockerfile. Later on we can COPY artifacts out of this stage into our final container. Let’s move on.

For complication’s sake, let’s say we are using a library which requires node-gyp to install dependencies properly, because it needs to compile native C++ binaries for the OS you are running on. In most cases you won’t need this, but some popular libraries, like redis, require it.

# Dockerfile continued
# optionally install gyp tools
RUN apk add --update --no-cache \
    python \
    make \
    g++

That’s probably about as complicated as a node environment will get.

Less than a year ago, I would have told you to build this image and push it to a Docker registry to use as a base image for other node-gyp related builds, and in fact, I did. With the advent of Docker multi-stage builds, however, that extra step is no longer necessary. In fact, I would say it’s no longer recommended, due to the burden of keeping multiple images up to date. Instead, let’s just continue on with our pipeline, and make use of multi-stage builds to productionize our build later on. First, we need to define what our pipeline is in the context of our application.

Defining the Pipeline

Again, to keep the example realistic, I’ll assume we are using Babel as a preprocessor, ESLint as a linter, and Jest as a testing tool. The pipeline just calls npm scripts, though, so it should be easy to substitute the tools you are using, like TypeScript.

Here is a sample scripts section of a package.json file using those tools:

// package.json
"scripts": {
  "start": "nodemon src/index.js --watch src node_modules --exec babel-node",
  "build": "babel src -d dist",
  "serve": "node dist/index.js",
  "lint": "eslint src __tests__",
  "lint:fix": "eslint --fix src __tests__",
  "test": "NODE_ENV=test jest --config jest.json --coverage",
  "test:staging": "jest --config jest.staging.json --runInBand",
  "test:watch": "NODE_ENV=test jest --config jest.json --watch --coverage"
}

In our CI process we want to cover linting, testing, and building our app, as well as building the container, and testing the container using our staging tests. By continuing on in our Dockerfile, we can cover everything except the staging tests, which requires the built container as input.

# Dockerfile continued 
ADD . /src
WORKDIR /src
RUN npm install
RUN npm run lint
RUN npm run test
RUN npm run build
RUN npm prune --production

So, we are ADDing our source code into the container, to a folder called /src, and then changing our WORKDIR to that /src directory which now contains our code. Next, we simply run the appropriate npm scripts to install dependencies, lint our code, test our code, and compile it with build, and finally remove devDependencies with npm prune --production.

Before we continue, I want to talk about the test step a little more, as that is also set up to measure coverage because we used the --coverage flag. We also passed in a jest.json file as a config. This is where coverage thresholds are defined.

// jest.json
{
  "testEnvironment": "node",
  "modulePaths": [
    "src",
    "/node_modules/"
  ],
  "coverageThreshold": {
    "global": {
      "branches": 100,
      "functions": 100,
      "lines": 100,
      "statements": 100
    }
  },
  "collectCoverageFrom": [
    "src/**/*.js"
  ]
}

If you wanted to maintain 90% coverage, you would decrease each option marked as 100 to 90. The tests will fail if the coverage threshold is not met.
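For example, a 90% threshold would look like this (same jest.json, only the numbers change):

```json
"coverageThreshold": {
  "global": {
    "branches": 90,
    "functions": 90,
    "lines": 90,
    "statements": 90
  }
}
```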

If you want to format your code automatically the exact same way I do, here’s the .eslintrc file I use with ESLint. And, just to make your life easier, the .babelrc file I use with Babel.

Second Stage of Build

Our Dockerfile up to this point starts with a fresh Node environment on Alpine Linux, optionally installs node-gyp tools, then adds, lints, tests, and compiles our code, and finally prunes away development dependencies. What we are left with are all of the artifacts we need for a production build, plus the bloat from the tools we needed to get this far. We will use a multi-stage build to copy only the artifacts we need into our final productionized container, using COPY --from=build.

# Dockerfile continued
FROM node:9-alpine
# install curl for healthcheck
RUN apk add --update --no-cache curl
ENV PORT=3000
EXPOSE $PORT
ENV DIR=/usr/src/service
WORKDIR $DIR
# Copy files from build stage
COPY --from=build /src/package.json package.json
COPY --from=build /src/package-lock.json package-lock.json
COPY --from=build /src/node_modules node_modules
COPY --from=build /src/dist dist
HEALTHCHECK --interval=5s \
    --timeout=5s \
    --retries=6 \
    CMD curl -fs http://localhost:$PORT/_health || exit 1
CMD ["node", "dist/index.js"]

This completes our Dockerfile. The final size in my case is a 28MB container, containing my production node_modules and my dist folder of Babel-compiled JavaScript source code. To run, simply use vanilla node. I’ve also defined a health check that a scheduler like Docker Swarm can use to ensure the container is healthy. curl may not be the most efficient healthcheck, but it’s a good starting point.

Here’s the full Dockerfile in one place for all of your copy-and-pasting needs!

FROM node:9-alpine AS build
# install gyp tools
RUN apk add --update --no-cache \
    python \
    make \
    g++
ADD . /src
WORKDIR /src
RUN npm install
RUN npm run lint
RUN npm run test
RUN npm run build
RUN npm prune --production

FROM node:9-alpine
RUN apk add --update --no-cache curl
ENV PORT=3000
EXPOSE $PORT
ENV DIR=/usr/src/service
WORKDIR $DIR
COPY --from=build /src/package.json package.json
COPY --from=build /src/package-lock.json package-lock.json
COPY --from=build /src/node_modules node_modules
COPY --from=build /src/dist dist
HEALTHCHECK --interval=5s \
    --timeout=5s \
    --retries=6 \
    CMD curl -fs http://localhost:$PORT/_health || exit 1
CMD ["node", "dist/index.js"]

Now, simply running:

docker image build -t your-image-name .

will run 80% of our CI pipeline. The next step is testing the container with its integrations.

Integration Tests

For integration tests, we need to run other software, like databases, message queues, or other services within our system, and test that they work together – that they integrate. Because this task requires running multiple images together, we will instead use docker-compose, which is suited to exactly this type of task.

Again, trying to stick with realistic, instead of over simplified examples, here is the docker-compose file I use for testing a more complicated microservice in an architecture based on CQRS and Event Sourcing, which has dependencies on redis, mongodb, and rabbitmq.

version: '2'
services:
  staging-deps:
    image: your-image-name
    environment:
      - NODE_ENV=production
      - PORT=3000
      - RABBITMQ_URL=amqp://rabbitmq:5672
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - MONGO_URL=mongodb://mongo:27017/inventory
      - DEBUG=servicebus*
    networks:
      - default
    depends_on:
      - redis
      - rabbitmq
      - mongo
  rabbitmq:
    image: rabbitmq:3.6-management
    ports:
      - 15672:15672
    hostname: rabbitmq
    networks:
      - default
  redis:
    image: redis
    networks:
      - default
  mongo:
    image: mongo
    ports:
      - 27017:27017
    networks:
      - default
  staging:
    image: node:8-alpine
    volumes:
      - .:/usr/src/service
    working_dir: /usr/src/service
    networks:
      - default
    environment:
      - apiUrl=http://staging-deps:3000
      - RABBITMQ_URL=amqp://rabbitmq:5672
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - MONGO_URL=mongodb://mongo:27017/inventory
      - DEBUG=$DEBUG
    command: npm run test:staging
  clean:
    extends:
      service: staging
    command: rm -rf node_modules
  install:
    extends:
      service: staging
    command: npm install

Pay particular attention to staging-deps and staging services.

staging-deps runs the image that was produced by the docker image build command we ran earlier; image: is set to the tag we passed with -t. We pass it a set of environment variables telling our service how to connect to the other containers running on our network, the default network. Each docker-compose file can define networks, and has a default network out of the box. Docker also handles service discovery through its Software Defined Networks (SDNs), so a hostname resolves to the IP address of the container with the same name on the network. For example, in MONGO_URL=mongodb://mongo:27017/inventory, mongo resolves to the mongo container in the SDN. Lastly, depends_on tells Docker to start the depended-on containers first when bringing everything up.
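A small config module is one way the service can consume those environment variables. A sketch (the variable names match the compose file above; the localhost fallbacks are just local-development assumptions):

```javascript
// config.js: read connection settings from the environment set by docker-compose,
// falling back to local-development defaults.
function loadConfig(env = process.env) {
  return {
    port: Number(env.PORT) || 3000,
    mongoUrl: env.MONGO_URL || 'mongodb://localhost:27017/inventory',
    rabbitmqUrl: env.RABBITMQ_URL || 'amqp://localhost:5672',
    redis: {
      host: env.REDIS_HOST || 'localhost',
      port: Number(env.REDIS_PORT) || 6379,
    },
  };
}

module.exports = { loadConfig };
```

Inside the compose network, MONGO_URL points at the mongo hostname, which Docker resolves to the mongo container; on a developer machine with no variables set, the same code falls back to localhost.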

staging is a container based on Node, with our source code mounted into it using the volumes directive. The integration tests run from this container.

Additionally, there are two more “services” that we run as scripts: clean and install. clean ensures we are using the correct versions of node_modules, since we may be switching between developing on our host OS, such as macOS, and Alpine Linux, which our containers are based on.

First we will bring up the dependencies:

docker-compose -f docker-compose.staging.yml up -d staging-deps

Depending on whether or not your service retries its connections, you may want to start certain containers first and sleep while they become ready. Alternatively, a script like wait-for-it.sh can accomplish this from the shell.

Once your dependencies are up and running, you’ll need to run the staging tests. We’ve mounted the code into the container, but haven’t installed npm dependencies on it. Run the following to install dependencies for the linux container, and then run staging tests.

docker-compose -f docker-compose.staging.yml run --rm install
docker-compose -f docker-compose.staging.yml run --rm staging

If your integration/staging tests pass, you’ll be confident you can continue on with the delivery and deployment stages! But first, let’s automate running our pipeline, so we can run it on each commit.

Automating the pipeline

First, let’s create a Makefile. Our goal is to be able to run make ci to run the whole CI process.

ci:
	make docker-build \
	clean \
	install \
	staging \
	staging-down

docker-build:
	docker build -t your-image-name .

clean:
	docker-compose -f docker-compose.staging.yml run --rm clean

install:
	docker-compose -f docker-compose.staging.yml run --rm install

staging:
	docker-compose -f docker-compose.staging.yml up -d staging-deps
	docker-compose -f docker-compose.staging.yml run --rm staging

staging-down:
	docker-compose -f docker-compose.staging.yml down

Running on every commit

I run it on every commit locally, using an npm package called husky, and again on my CI server, which is typically Jenkins 2.0 Pipelines.

To run locally, let’s start by installing husky to our devDependencies:

npm i --save-dev husky

Husky makes it super simple to install and run git hooks. Simply open up your package.json and add the following two scripts:

"scripts": {
  // ...
  "precommit": "npm run lint:fix && npm run test",
  "prepush": "make ci"
},

Now on every commit, on every developer’s machine, husky will fix all linting errors that can be automatically fixed, and run unit tests. Our tests are configured to require 100% coverage, so if any tests do not pass, or do not meet coverage thresholds, the commit is blocked. Additionally, on every attempt to push, the entire CI process we defined will be run. This is useful for catching any dependencies that may have been installed locally but not saved to package.json, and ensures that everything passes when running on the OS it will be hosted on in production, as well as testing the built container with staging tests.

Conclusion

At this point you have everything you need to run an entire Dockerized Continuous Integration process on every commit, and prevent bad code from entering your code base. But this is far from the end of the journey. Follow me for more posts about similar topics!

As for next steps:

  • Improving Continuous Integration
Build from this point to customize for your needs. If you are making an app, it might make sense to also run end-to-end tests with Selenium, or one of its flavors. If it’s a high-load site, add stress testing to the pipeline. Any tests like these belong in the integration phase.
  • Continuous Delivery 
Once you are building productionized Docker containers, it’d be a shame not to put them in a registry. There are paid solutions for this, as well as self-hosted registries. What works best depends on your situation. I typically just pay for private repos on Docker Hub and call it a day.
  • Continuous Deployment
This is where the magic truly begins to happen. When you have fully tested, production-optimized containers deploying themselves on every commit to master, it’s truly zen, especially given modern microservice architectures. Typically, we add new services rather than new branches to add new features, making you miss branches even less. For Continuous Deployment servers, I prefer Jenkins 2.0 Pipelines with Blue Ocean. Jenkinsfiles allow the engineers of each product to define their own CD pipelines, and Docker stack files allow developers to define how their services will run in production. A CD process might also involve updating a proxy, such as HAProxy or nginx.
  • Running in Production
This is what orchestrators are for: tools like Docker Swarm, Kubernetes, and Mesos. They work very similarly to the scheduler on the computer you are using now, which schedules processes and allocates resources on your machine. Instead, they schedule and reserve resources across a cluster of machines, letting you control them as if they were one.

Thanks so much for reading! Please smash that clap button and share if you found this post helpful!

Make sure to follow me for more posts on related topics! :)

Changelog
11/21/2017, 9:11 PM

“Refactored” the introduction and conclusion to address comments and questions, and improve clarity.
