We are going to set up a development environment for a project consisting of various services. All of these services will be containerized with Docker, and they will all run simultaneously during development using docker-compose.
Our environment will feature instant code reloading, test-driven development, database connectivity, dependency management, and more™. It will be possible to easily deploy it to production with docker-compose or with Rancher. And as a bonus, we’ll set up continuous integration on Gitlab.
The article is about efficiency, so I’ll get straight to the point.
We want to:

- develop all of our services locally and start everything with a single command
- see code changes reflected instantly, without rebuilding containers
- have our unit tests run automatically on every change (test-driven development)
- connect our services to a database
- build production-ready images and push them to a registry
- run continuous integration on Gitlab
Let’s make it happen.
You are going to need to install the following tools:

- Docker
- docker-compose
- Make (optional, but convenient for the shortcuts we define later)
We’ll set up a project consisting of a Python and Java service, together with a Postgres database. The Postgres database will run on our own machine in the development environment, but is assumed to be external during production (it might use Amazon RDS for example).
The Python service contains unit tests supported by Pytest, for which we will set up test-driven development. The Java service uses Maven for its build process.
Finally, we will use Gitlab’s container registry and Gitlab’s CI service. The code described below is also available in a Github or Gitlab repository.
This setup should demonstrate the most essential concepts. The approach described below, however, should work regardless of technology.
The file structure:
/myproject
  /python
    /mypackage
      run.py
    /tests
      my_test.py
    Dockerfile
    setup.py
    requirements.txt
  /java
    Dockerfile
    pom.xml
    /src
      /main
        /java
          /com
            /example
              Main.java
  docker-compose.common.yml
  docker-compose.dev.yml
  docker-compose.prod.yml
  Makefile
  python-tests.sh
  .gitlab-ci.yml
The Dockerfile for the Python service is as follows:
FROM python:3.6-slim
COPY . /code
WORKDIR /code
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install -e .
ENTRYPOINT python ./mypackage/run.py
This adds the service code to the container, installs its dependencies (contained in requirements.txt, which in this example will contain pytest and watchdog), and installs the Python service itself. It also defines the command to be executed when the container is started.
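For completeness, the requirements.txt in this example only needs the two packages just mentioned, and a minimal setup.py to make pip install -e . work might look like the sketch below (the package metadata is an assumption for illustration):

requirements.txt:

pytest
watchdog

setup.py:

from setuptools import setup, find_packages

# Minimal packaging metadata; adjust to your project.
setup(
    name="mypackage",
    version="0.1.0",
    packages=find_packages(),
)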
The Dockerfile for the Java service can be found below:
FROM maven:3.5-jdk-8
COPY . /usr/src/app
WORKDIR /usr/src/app
RUN apt-get update && apt-get install entr -y
RUN mvn clean package --batch-mode
ENTRYPOINT java -jar target/docker-compose-java-example-1.0-SNAPSHOT.jar
Like the Python Dockerfile, this also first adds the code to the container. It then proceeds to install the Unix utility entr, which we will need later. Maven is used to create a JAR file, after which we define the container command to execute that JAR file.
Finally, the docker-compose.common.yml file forms the basis for our environment, and contains all configuration that is important for the application regardless of the environment in which it is executed. It is fairly straightforward:
version: '2'
services:
  python:
    build: ./python
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
      - POSTGRES_DB
      - POSTGRES_HOST
  java:
    build: ./java
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
      - POSTGRES_DB
      - POSTGRES_HOST
That gives us all the ingredients to create a development configuration.
Let’s have a look at the docker-compose.dev.yml file. It might seem daunting but don’t worry, we’ll go through it step by step below.
version: '2'
services:
  python:
    image: registry.gitlab.com/mycompany/myproject/python:dev
    volumes:
      - ./python/:/code
    entrypoint: watchmedo auto-restart --recursive --pattern="*.py" --directory="." python mypackage/run.py
    depends_on:
      - postgres
    links:
      - postgres
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=myproject
      - POSTGRES_HOST=postgres
  python-tests:
    image: registry.gitlab.com/mycompany/myproject/python:dev
    volumes:
      - ./python/:/code
    entrypoint: watchmedo auto-restart --recursive --pattern="*.py" --directory="." pytest
    depends_on:
      - python
  java:
    image: registry.gitlab.com/mycompany/myproject/java:dev
    volumes:
      - ./java/:/usr/src/app
    entrypoint: sh -c 'find src/ | entr mvn clean compile exec:java --batch-mode --quiet'
    depends_on:
      - postgres
    links:
      - postgres
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=myproject
      - POSTGRES_HOST=postgres
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=myproject
    volumes:
      - /data/myproject/postgres:/var/lib/postgresql/data
  pgadminer:
    image: clue/adminer
    ports:
      - "99:80"
Let’s start with the Python service. I’ll point out the interesting parts.
volumes:
  - ./python/:/code
What effectively happens here is that the python subdirectory on our host machine, containing the code for our Python service, is now mapped to the /code directory in our container. To see why that's relevant, let's have a quick look again at the relevant lines in the Python Dockerfile:
COPY . /code
WORKDIR /code
Without the volumes statement in the docker-compose.dev.yml file, the contents of the python subdirectory would simply be added to the container. If a change is made on the host machine, the container has to be rebuilt before we can see those changes within the container.
However, with the volumes statement in the docker-compose.dev.yml file, any changes you make will immediately be reflected inside the container. This is because both directories now point to the exact same files.
The next lines in the docker-compose.dev.yml file make use of this:
entrypoint: watchmedo auto-restart --recursive --pattern="*.py" --directory="." python mypackage/run.py
This line overrides the entrypoint for the container (which is the command that is executed when you start the container). The default entrypoint was defined in the Dockerfile as follows:
ENTRYPOINT python ./mypackage/run.py
Thanks to the entrypoint statement in our docker-compose file, however, this entrypoint will now be replaced by the command starting with watchmedo. The watchmedo command is part of the watchdog package, which we included in the requirements.txt file. It monitors files matching the provided pattern (in this case, all *.py files) in a given directory, and if any of them is modified, watchmedo will restart the running process and execute the provided command (in this case, python ./mypackage/run.py).
This line, combined with the volume mapping we mentioned earlier, means that every modification of any Python file in the ./python directory on our host machine will restart our application. If you open any Python file and modify it, you will see that every change you make is immediately reflected in the running container.
It might be just me, but this is one of the coolest things I’ve ever seen.
Note: Keep in mind that you do need to rebuild the image if you add a new dependency.
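If you want something concrete to play with, a minimal mypackage/run.py could look like the sketch below (the loop body is purely illustrative):

import time

if __name__ == "__main__":
    while True:
        # Edit this message while the container is running and watch
        # watchmedo restart the process with your change applied.
        print("Python service is running")
        time.sleep(5)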
Let’s have a look at the part of the docker-compose.dev.yml file for the Python unit test service, named python-tests:
python-tests:
  image: registry.gitlab.com/mycompany/myproject/python:dev
  entrypoint: watchmedo auto-restart --recursive --pattern="*.py" --directory="." pytest
  depends_on:
    - python
It is interesting to note that the image is the same as that of the Python service. This means it will use the exact same environment as the Python service; the image will only be built once.
The depends_on statement tells docker-compose to build and start the python service before running the python-tests service.
But the most important line is, once again, the entrypoint. We do something quite similar here to what we do in the regular Python service, but now we let watchmedo execute pytest on every modification (pytest, if you recall, was also included in the requirements.txt file).
The result of this is that every code change will now automatically execute all tests that pytest can find, giving you instant feedback on the status of your tests.
This makes test-driven development with Docker trivial.
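As a minimal illustration, tests/my_test.py could contain something as simple as the following (the function under test is just an assumption):

def add(a, b):
    return a + b


def test_add():
    # pytest picks this up automatically thanks to the test_ prefix
    assert add(1, 2) == 3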
Java, being a compiled language, is a little more complicated to get working. Fortunately, Maven helps us most of the way.
The first important thing to note is the following line in the Dockerfile:
RUN apt-get update && apt-get install entr -y
What happens here is that the command line tool [entr](http://entrproject.org/) is installed. It functions very similarly to watchdog’s watchmedo command that we used with Python, but has the advantage that it doesn’t need Python to function; it’s just a general-purpose Unix utility. In fact, we could have used it in the Python service as well, but, well, we didn’t.
We can see it in action in the docker-compose.dev.yml file, in the entrypoint of the java service:
entrypoint: sh -c 'find src/ | entr mvn clean compile exec:java --batch-mode --quiet'
This says: ‘whenever any file in the directory src/ changes, ask Maven to clean, compile and then execute the Java project’.
None of this works out of the box; Maven first requires some fairly extensive configuration in the pom.xml file. More specifically, we need a few plugins: the Maven Compiler Plugin, the Maven JAR Plugin so that the container’s default entrypoint (java -jar) works, and the Exec Maven Plugin (with its java goal) to run the application during development.
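A sketch of the relevant build section might look roughly like this (assuming com.example.Main as the main class, per the file structure above; details and versions are illustrative):

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <!-- makes "java -jar target/...jar" work -->
            <mainClass>com.example.Main</mainClass>
          </manifest>
        </archive>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>exec-maven-plugin</artifactId>
      <configuration>
        <!-- used by "mvn exec:java" during development -->
        <mainClass>com.example.Main</mainClass>
      </configuration>
    </plugin>
  </plugins>
</build>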
Once this is all configured, together with the other content that makes up a valid pom.xml file, the result is the same as with the Python service (and actually slightly better): every change to a Java source file recompiles the code, installs new dependencies (thank you, Maven!) and restarts the application.
Let’s again look at the relevant lines in the docker-compose.dev.yml file:
postgres:
  image: postgres:9.6
  environment:
    - POSTGRES_USER=user
    - POSTGRES_PASSWORD=password
    - POSTGRES_DB=myproject
  volumes:
    - /data/myproject/postgres:/var/lib/postgresql/data
pgadminer:
  image: clue/adminer
  ports:
    - "99:80"
The postgres service uses a standard Postgres image, which comes with a default configuration. The environment variables configure Postgres by defining a database called “myproject”, with “user” as username and “password” as password.
The Python and Java services also define these environment variables. They are expected to use these in the application code to connect to the database. In the docker-compose.dev.yml file these values are all hardcoded. When building the production container, however, it is expected that the production values are passed in as environment variables from an external source. This allows for integration with a proper secrets management toolchain.
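For example, the Python service might build its database connection from these variables along the following lines (a sketch; psycopg2 is not in the example requirements.txt, so the driver choice is an assumption):

import os
import psycopg2

# These variables are provided by docker-compose: hardcoded in development,
# injected from an external source in production.
conn = psycopg2.connect(
    host=os.environ["POSTGRES_HOST"],
    dbname=os.environ["POSTGRES_DB"],
    user=os.environ["POSTGRES_USER"],
    password=os.environ["POSTGRES_PASSWORD"],
)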
As a final instruction for the postgres service, a volume is defined. This maps the Postgres database data to a location on your host machine. While not strictly necessary, this is useful for persisting the Postgres data if the container is deleted for whatever reason.
Finally, we also define the pgadminer service, which starts Adminer, a useful tool for doing database management through a web interface. With this configuration, it will be accessible through port 99 on your host machine (so http://127.0.0.1:99). As hostname you should use the name of the Postgres service, which is postgres in this case; since the services share the same Docker network by virtue of being defined in the same docker-compose file, DNS resolution is performed automagically for you.
Now let’s take it for a spin.
First we have to build all containers for development. From your command line:
docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml build
And to start all services:
docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml up
As this is a lot to type, and because I like self-documenting entrypoints, I tend to define these and other essential project-wide commands in a Makefile:
dev-build:
	docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml build --no-cache

dev:
	docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml up
At some point you will probably want to run your Python unit tests without bringing up all your services. For that reason, we define python-tests.sh as follows:
#!/bin/bash
docker-compose -f docker-compose.common.yml -f docker-compose.dev.yml run --rm --entrypoint pytest python $*
This will execute pytest in the python container, running all tests. Any arguments provided to the script will be passed directly to the pytest command in the container (thanks to the $*), allowing you to run it like you would normally run pytest. Finally, we extend the Makefile with the following:
test:
	./python-tests.sh
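With that in place, day-to-day usage looks something like this (the pytest flags are just examples):

make dev-build               # build the development images
make dev                     # start all services with live reloading
make test                    # run the Python test suite once
./python-tests.sh -x --pdb   # or pass arbitrary pytest arguments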
Almost there. Let’s have a look at docker-compose.prod.yml:
version: '2'
services:
  python:
    image: $IMAGE/python:$TAG
    restart: always
  java:
    image: $IMAGE/java:$TAG
    restart: always
That’s really all there is to it. Most of the configuration should be in docker-compose.common.yml, and the commands and entrypoints are all in the Dockerfiles. You do, however, need to pass in the environment variables that have no value yet (those defined in docker-compose.common.yml and in this file), but that should be handled by your build or deploy script.
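For example, bringing the stack up on a production host might look roughly like this (all values are placeholders for illustration):

export IMAGE=registry.gitlab.com/mycompany/myproject
export TAG=master
export POSTGRES_HOST=db.example.com    # external database, e.g. Amazon RDS
export POSTGRES_DB=myproject
export POSTGRES_USER=produser
export POSTGRES_PASSWORD=changeme      # in practice, inject this from your secrets management
docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml pull
docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml up -d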
With this, we are ready to build the services for production. So let’s do exactly that, and build them with Gitlab’s CI service. Let’s have a look at the .gitlab-ci.yml file. It’s quite bare-bones and leaves room for optimization, but it gets the job done.
stages:
  - build
  - test

variables:
  TAG: $CI_BUILD_REF
  IMAGE: $CI_REGISTRY_IMAGE

services:
  - docker:dind

image: docker

before_script:
build:
  stage: build
  script:
    - docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml build
    - docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml push
test-python:
  stage: test
  script:
    - docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml pull python
    - docker-compose -f docker-compose.common.yml -f docker-compose.prod.yml run --rm --entrypoint pytest python
There are a few Gitlab CI specific things in here, such as the definition of the docker:dind service and the image within which to run the build, both required to have Docker available. Also, the before_script part is important, as it installs docker-compose (because I couldn’t find a good image with an up-to-date version). You will also notice that the $TAG and $IMAGE variables are set using environment variables that Gitlab’s CI runner passes in by default.
Furthermore, Gitlab has a concept of secret variables that are passed as environment variables into your build. Simply set the right values and docker-compose will pick them up. If you use another CI environment, I’m sure it has some mechanism for this as well. And if you prefer to be a little more low-tech, you can of course also write your own deploy script.
So there you have it; an efficient but not very complicated setup to orchestrate and develop most projects inside Docker.
If you’re interested, I’ve also written an article on setting up simple deployment with docker-compose on Gitlab. That should take you from a development environment that builds on Gitlab CI right to continuous deployment with docker-compose.