Recently I built a small website with Django and React very quickly, but deployment became a pain: I had to write a deployment script and a Jenkins job, and also deal with production configuration and secrets. So I started considering containers for this kind of small website. I began by reading a Docker book and, surprisingly, finished it in just one hour! (I had read the same book three years ago over about two weeks and still didn't understand it; it's amazing how much more efficient you are when you have an actual need.)
The next day I ran into (and googled) quite a few problems while containerizing the API service of my website. There are many Docker tutorials, but few of them cover best practices, so I thought it would help to write my problems and solutions down. Now let's begin.
The website consists of an API service written in Django and served with gunicorn, plus a front end written and built with React. The database is MySQL. An nginx instance serves the front-end static files and proxies /api/ requests to the API service.
Initially I considered putting everything into one container, but that approach has no scalability (although, as a personal service, it will probably never need to scale). So in the end I decided to containerize only the Django API service. nginx on the host machine proxies API requests to the container, and the front-end static files are served directly from a directory by nginx. This way I can move the static files to AWS S3, or move the API to EC2, when necessary.
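As an illustration, the host nginx site configuration might look roughly like this. The server name, the static file path, and the way the API container is reached on 127.0.0.1:12345 are assumptions based on the setup described later, not the exact config from my project:

server {
    listen 80;
    server_name example.com;           # assumption: your actual domain

    # Serve the built React files directly from disk
    location / {
        root /var/www/frontend/build;  # assumption: wherever the React build is deployed
        try_files $uri /index.html;
    }

    # Proxy API requests to the containerized gunicorn (port 12345 is mapped later)
    location /api/ {
        proxy_pass http://127.0.0.1:12345;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}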
My dev environment is Ubuntu 18.04, so it felt natural to use Ubuntu for production as well. However, I wanted to try Alpine because it is secure and very small, which fits containers perfectly (the latest alpine:3.8 image is only 2.1MB compressed!).
I started with python:3.6.6-alpine3.8 and everything went well until I began installing dependencies. Some of my project's dependencies rely on native libraries (such as libxslt and libmysqlclient) and contain C code, so I had to install gcc, g++ and many libraries and header files, with potentially more compatibility issues waiting in the future. This turned out to be tedious work and quickly made me give up.
In the end I decided to build my own image from ubuntu:18.04. The lesson I learned: if you are not familiar with Alpine, stick to the OS you used for development. This may result in a larger image, but it will save you tons of time.
apt installs many "recommended" but usually useless packages when called without extra options. In a container context, however, we want only the exact packages we need, to keep the image small and secure.
The first trick is to add the --no-install-recommends option, which can reduce the image size significantly. In the following example, the 3.6-1 tag was built with apt install -y --no-install-recommends python3 python3-pip wget, while the 3.6 tag was built with the same command minus the --no-install-recommends option. See how big the difference is:
REPOSITORY          TAG     IMAGE ID       CREATED          SIZE
odacharlee/python   3.6-1   ffe5c583b8c4   3 minutes ago    124MB
odacharlee/python   3.6     188132621a87   3 hours ago      405MB
The second trick is to remove the apt cache after installing. Adding rm -rf /var/lib/apt/lists/* reduces the image size by about 40MB.
The last trick is to use a single RUN statement for apt update and apt install. This produces fewer layers in the image.
To summarize, your Dockerfile should look like this:
RUN apt update \
    && apt install -y --no-install-recommends python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*
For Python applications we need pip to install dependencies. Things get trickier if any package depends on a native library, because pip needs to compile that dependency from source.
For example, my application requires libmysqlclient-dev and python3-dev to be installed before running pip, in order to install the mysqlclient package. And of course a compiler, gcc, is required as well. These packages are useless after pip install finishes and can be removed, so the Dockerfile can be written as:
RUN buildDeps='gcc libmysqlclient-dev python3-dev' \
    && apt update \
    && apt install -y --no-install-recommends $buildDeps \
    && pip install wheel setuptools \
    && pip install -r requirements.txt \
    && apt purge -y --auto-remove $buildDeps \
    && apt install -y libmysqlclient20 \
    && rm -rf /var/lib/apt/lists/* \
    && rm -rf /root/.cache
The apt purge line removes those build packages, and rm -rf /root/.cache removes the pip cache. Note that libmysqlclient20 is reinstalled after apt purge because it is required at runtime.
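Putting the apt and pip tricks together with the entrypoint described later, the whole Dockerfile might look roughly like this. The /app directory layout, the WORKDIR, and the use of pip3 (the command that python3-pip provides on ubuntu:18.04) are assumptions for this sketch, not the exact file from my project:

FROM ubuntu:18.04

WORKDIR /app

# Runtime packages only, no "recommended" extras, and clean the apt cache in the same layer
RUN apt update \
    && apt install -y --no-install-recommends python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt /app/

# Install build dependencies, use them for pip, then purge them in the same layer
RUN buildDeps='gcc libmysqlclient-dev python3-dev' \
    && apt update \
    && apt install -y --no-install-recommends $buildDeps \
    && pip3 install wheel setuptools \
    && pip3 install -r requirements.txt \
    && apt purge -y --auto-remove $buildDeps \
    && apt install -y libmysqlclient20 \
    && rm -rf /var/lib/apt/lists/* \
    && rm -rf /root/.cache

COPY . /app/

ENTRYPOINT ["/app/entrypoint.sh"]
CMD ["gunicorn", "-c", "/app/gunicorn.conf", "myapp.wsgi"]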
A registry service such as Docker Hub is used to store your Docker images, but sometimes we need our own registry for private images. Although you can start a registry with docker run registry, I would recommend using a cloud service instead of hosting it yourself. Both Google and Amazon provide container registry services; choose one based on your usage.
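Whichever registry you pick, pushing an image boils down to tagging it with the registry's repository path and pushing it. The registry URL and repository name below are placeholders, and most registries require a docker login first:

# Tag the locally built image with the registry's repository path (placeholder URL)
$ docker tag myapp:latest registry.example.com/myproject/myapp:latest

# Authenticate against the registry, then push
$ docker login registry.example.com
$ docker push registry.example.com/myproject/myapp:latest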
A MySQL container is super easy to set up if you know the correct command. It looks like this:
$ docker run -d \
    --name db_mysql5 \
    --network web-net \
    --env MYSQL_RANDOM_ROOT_PASSWORD=1 \
    mysql:5
-d detaches your terminal from the container. Without -d your terminal stays attached to the container, and the only way out is to run docker stop db_mysql5 from another terminal.
--name db_mysql5 gives the container a name, and --network web-net connects it to a network called web-net, which I had created in advance with docker network create -d bridge web-net.
--env MYSQL_RANDOM_ROOT_PASSWORD=1 tells MySQL to generate a random password for the root user during initialization. The password can be found in the container logs:
$ docker logs db_mysql5 2>/dev/null
Initializing database
Database initialized
Initializing certificates
Certificates initialized
MySQL init process in progress...
GENERATED ROOT PASSWORD: raew8pej9noomohGhaew3WoP4euch6za
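If the log is long, a quick way to pull out just that line is to pipe it through grep:

# Filter the init log for the generated root password
$ docker logs db_mysql5 2>/dev/null | grep 'GENERATED ROOT PASSWORD'
GENERATED ROOT PASSWORD: raew8pej9noomohGhaew3WoP4euch6za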
After the MySQL container is up, you can change the root password like this:
$ docker exec -i -t db_mysql5 mysql -uroot -p
Enter password: raew8pej9noomohGhaew3WoP4euch6za
...
mysql> alter user 'root'@'localhost' identified by 'mypassword';
Next, create the MySQL account for your web app. Since the app server is not running in the same container as MySQL, it cannot connect as a "localhost" user (such as web@localhost). Instead, we have to use % as the host part of the user:
mysql> create user 'web'@'%' identified by 'webpassword';
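A freshly created user has no privileges yet, so it also needs access to the app's database (called mydatabase in the settings.py shown later). A typical grant, with the privileges adjusted to your needs, looks like:

mysql> grant all privileges on mydatabase.* to 'web'@'%';
mysql> flush privileges;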
Secrets such as the MySQL username and password must not be written into source code (e.g. Django's settings.py), so we need another way to pass them to the app container. Docker secrets can do this elegantly, but they only work in Swarm mode, which feels like overkill for my service. So I decided to simply use environment variables.
I made an entrypoint script in my app container:
#!/bin/sh

# Check production environment variables
if [ -z "$EC_MYSQL_USER" -o -z "$EC_MYSQL_PASS" ]; then
    echo >&2 'error: Must specify EC_MYSQL_USER and EC_MYSQL_PASS'
    exit 1
fi

# Run database migrations before starting the app server
if [ "$1" = "gunicorn" ]; then
    ./manage.py migrate
fi

# Start process
exec "$@"
And in the Dockerfile:
ENTRYPOINT ["/app/entrypoint.sh"]
CMD ["gunicorn", "-c", "/app/gunicorn.conf", "myapp.wsgi"]
And don't forget to use these variables in Django's settings.py:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'mydatabase',
        'HOST': 'db_mysql5',
        'USER': os.environ.get('EC_MYSQL_USER'),
        'PASSWORD': os.environ.get('EC_MYSQL_PASS'),
    }
}
Now the secrets can be specified on the docker command line:
$ docker run -it --network web-net --name myapp \
    --env EC_MYSQL_USER=web \
    --env EC_MYSQL_PASS=mysecret \
    -p 12345:12345 \
    myapp:latest
In the actual deployment, the secrets are stored in Jenkins, and the Mask Passwords plugin can be used to hide them from the Jenkins logs.
gunicorn runs the app server. Without a container, we would have gunicorn listen on 127.0.0.1:12345 and have nginx proxy requests to that port. But when the app server runs inside a container while nginx runs on the host, gunicorn must listen on all interfaces:
# gunicorn.conf
bind = '0.0.0.0:12345'  # '127.0.0.1:12345' will not work!
Only then will the container port mapping work:
$ docker run -it --network web-net --name myapp \
    --env EC_MYSQL_USER=web \
    --env EC_MYSQL_PASS=mysecret \
    -p 12345:12345 \
    myapp:latest
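Once the container is running, you can check the port mapping from the host before wiring up nginx. The exact API path below is just a placeholder for whatever your Django app exposes:

# From the host: the request hits the mapped port and is answered by gunicorn inside the container
$ curl -i http://127.0.0.1:12345/api/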
The problem we have today is not a lack of information; it is that we have so much information that it's hard to figure out which piece fits our needs best. I hope this post helps beginners make the right decisions when they start their first Docker project after learning Docker.
Thanks for reading, and please clap if you found this useful!