A 10-step checklist on how to dockerize any application.

There are already many tutorials on how to dockerize applications available on the internet, so why am I writing another one? Most of the tutorials I see are focused on a specific technology (say Java or Python), which may not cover what you need. They also do not address all the relevant aspects that are necessary to establish a well-defined contract between Dev and Ops teams (which is what containerization is all about).

I compiled the steps below based on my recent experiences and lessons learned. It is a checklist of details and things that are overlooked by the other guides you will see around.

Disclaimer: This is NOT a beginner's guide. I recommend you learn the basics of how to set up and use Docker first, and come back here after you have created and launched a few containers.

Let's get started.

1. Choose a base image

There are many technology-specific base images, such as:

https://hub.docker.com/_/java/
https://hub.docker.com/_/python/
https://hub.docker.com/_/nginx/

If none of them works for you, you need to start from a base OS and install everything yourself.

Most of the tutorials out there will start with Ubuntu (e.g. ubuntu:16.04), which is not necessarily wrong.

My advice is for you to consider using Alpine images:

https://hub.docker.com/_/alpine/

They provide a much smaller base image (as small as 5 MB).

Note: "apt-get" commands will not work on those images. Alpine uses its own package repository and tool. For details, see:

https://wiki.alpinelinux.org/wiki/Alpine_Linux_package_management
https://pkgs.alpinelinux.org/packages

2. Install the necessary packages

This is usually trivial. Some details you may be missing:

a-) You need to write apt-get update and apt-get install on the same line (the same applies if you are using apk on Alpine).
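As a minimal sketch (the package names below are only illustrative):

```dockerfile
# Debian/Ubuntu based image: update and install in a single RUN,
# so a stale "apt-get update" layer can never be reused from cache.
RUN apt-get update && apt-get install -y --no-install-recommends \
        curl \
        ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Alpine equivalent: apk fetches a fresh index when --no-cache is used,
# so no separate "update" step is needed at all.
RUN apk add --no-cache curl ca-certificates
```

Cleaning up /var/lib/apt/lists in the same RUN also keeps the package index out of the final layer, which helps with image size.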
This is not only a common practice, you need to do it: otherwise the "apt-get update" temporary image (layer) can be cached and may not update the package information you need immediately after (see this discussion: https://forums.docker.com/t/dockerfile-run-apt-get-install-all-packages-at-once-or-one-by-one/17191).

b-) Double check that you are installing ONLY what you really need (assuming you will run the container in production). I have seen people installing vim and other development tools inside their images.

If necessary, create a different Dockerfile for build/debugging/development time. This is not only about image size; think about security, maintainability and so on.

3. Add your custom files

A few hints to improve your Dockerfiles:

a-) Understand the difference between COPY and ADD:

https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#add-or-copy

b-) (Try to) follow file system conventions on where to place your files:

http://www.pathname.com/fhs/

E.g. for interpreted applications (PHP, Python), use the /usr/src folder.

c-) Check the attributes of the files you are adding. If you need execution permission, there is no need to add a new layer to your image (RUN chmod +x …). Just fix the original attributes in your code repository. There is no excuse for that, even if you are using Windows; see:

How to create file execute mode permissions in Git on Windows? (stackoverflow.com): "There's no need to do this in two commits, you can add the file and mark it executable in a single commit…"
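A small sketch of these hints together (the paths and file names are made up for illustration):

```dockerfile
# COPY is preferred over ADD for plain local files and folders.
# Sources of an interpreted application go under /usr/src (FHS).
COPY app/ /usr/src/app/

# No extra "RUN chmod +x" layer needed here: the execute bit was
# fixed in the repository itself, e.g. with
#   git update-index --chmod=+x docker-entrypoint.sh
COPY docker-entrypoint.sh /usr/local/bin/
```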
4. Define which user will (or can) run your container

First, take a break and read the following great article:

Understanding how uid and gid work in Docker containers (medium.com): "Understanding how usernames, group names, user ids (uid) and group ids (gid) map between the processes running inside a…"

After reading it you will understand that:

a-) You only need to run your container with a specific (fixed ID) user if your application needs access to the user or group tables (/etc/passwd or /etc/group).

b-) Avoid running your container as root as much as possible.

Unfortunately, it is not hard to find popular applications requiring you to run them with specific ids (e.g. Elastic Search with uid:gid = 1000:1000).

Try not to be another one…

5. Define the exposed ports

This is usually a very trivial process. Please just don't create the need for your container to run as root because you want it to expose a privileged low port (80). Just expose a non-privileged port (e.g. 8080) and map it during the container execution.

This differentiation comes from a long time ago:

https://www.w3.org/Daemon/User/Installation/PrivilegedPorts.html

6. Define the entrypoint

The vanilla way: just run your executable file right away.

A better way: create a "docker-entrypoint.sh" script where you can hook things like configuration using environment variables (more about this below).

This is a very common practice; a few examples:

elastic/elasticsearch-docker (github.com): Official Elasticsearch Docker image
docker-library/postgres (github.com): Docker Official Image packaging for Postgres

7. Define a configuration method

Every application requires some kind of parametrization. There are basically two paths you can follow:

1-) Use an application-specific configuration file: then you will need to document the format, fields, location and so on (not good if you have a complex environment, with applications spanning different technologies).
2-) Use (operating system) environment variables: simple and efficient. If you think this is not a modern or recommended approach, remember it is part of The Twelve Factors:

The Twelve-Factor App (12factor.net): "A methodology for building modern, scalable, maintainable software-as-a-service apps."

This does not mean that you need to throw away your configuration files and refactor the config mechanism of your application. Just use a simple envsubst command to replace a configuration template (inside the docker-entrypoint.sh, because it needs to be performed at run time).

Example:

nginx (docs.docker.com): "Official build of Nginx. GitHub repo: https://github.com/nginxinc/docker-nginx …"

This will encapsulate the application-specific configuration file, layout and details inside the container.

8. Externalize your data

The golden rule is: do not save any persistent data inside the container.

The container file system is supposed and intended to be temporary and ephemeral. So any user-generated content, data files or process output should be saved either on a mounted volume or on a bind mount (that is, in a folder on the base OS linked inside the container).

I honestly do not have a lot of experience with mounted volumes; I have always preferred to save data on a bind mount, using a folder carefully created beforehand with a configuration management tool (such as Salt Stack).

By carefully created, I mean the following:

- I create a non-privileged user (and group) on the base OS.
- All bind folders (-v) are created with this user as owner.
- Permissions are given accordingly (only to this specific user and group; other users will have no access to them).
- The container is run with this user.

You will be in full control of that.

9. Make sure you handle the logs as well

I am aware that my previous "persistent data" is far from being a precise definition, and logs sometimes fall into the grey area. How should you handle them?
If you are creating a new app and want it to stick to docker conventions, no log files should be written at all. The application should use stdout and stderr as an event stream. Just like the environment variables recommendation, this is also one of The Twelve Factors. See:

The Twelve-Factor App (12factor.net): "A methodology for building modern, scalable, maintainable software-as-a-service apps."

Docker will automatically capture everything you are sending to stdout and make it available through the "docker logs" command:

https://docs.docker.com/engine/reference/commandline/logs/

There are some practical cases where this is particularly difficult, though. If you are running a simple nginx container, you will have at least two different types of log files:

- HTTP access logs
- Error logs

With different structures, configurations and pre-existing implementations, it may not be trivial to pipe them to the standard output. In this case, just handle the log files as described in the previous section, and make sure you rotate them.

10. Rotate logs and other append-only files

If your application is writing log files or appending to any files that can grow indefinitely, you need to worry about file rotation.

This is critical to prevent the server from running out of space, and to apply data retention policies (which is critical when it comes to GDPR and other data regulations).

If you are using bind mounts, you can count on some help from the base OS and use the same tools you would use for a local rotation configuration, that is, logrotate (manual here).

A simple yet complete example I found recently is this one:

Configure - Log Rotate (www.aerospike.com): "Manage log rotations using the Linux tool, logrotate, with the Aerospike in-memory NoSQL database."

Another good one:

How To Manage Logfiles with Logrotate on Ubuntu 16.04 (www.digitalocean.com): "Logrotate is a system utility that manages the automatic rotation and compression of log files…"
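As a sketch, a logrotate rule for a bind-mounted log folder might look like this (the path is illustrative; it would go in a file under /etc/logrotate.d/ on the base OS):

```
/srv/myapp/logs/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
```

The copytruncate directive is convenient with containers: the file is copied and truncated in place, so the containerized process can keep writing to the same open file descriptor without being signaled to reopen its logs.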
—

Let me know if you have any feedback.

Check out my other technical articles at https://hackernoon.com/@htssouza