This article describes containerization best practices throughout the full lifecycle of a containerized workload, with an emphasis on development and security. We will look at:
You will find it useful if you are a software developer starting your journey with developing in containers. Even a senior developer might pick up a few tricks here and there.
There is also something for security professionals as well as automation engineers or SREs (Ops).
A little disclaimer: if your title is DevOps Engineer, please don’t feel left out. You will surely benefit from the content of this article. It’s just that DevOps is neither a title, a role, nor a team, but rather a philosophy and a culture. Unfortunately, in most companies DevOps really means automation engineering and soft-ops (mostly configuring and dealing with Kubernetes and other complex software). So if you read “automation engineer” somewhere, that means a DevOps engineer.
This document intends to serve as a framework and guide for developing and operationalizing containerized software. This article is about containers only. If you are interested in container orchestration, check out my two previous blogs, orchestrating containers with Kubernetes and developing on Kubernetes.
There is a lot of ground to cover, so let’s get started!
Container
A container is the runtime instantiation of a Container Image. A container is a standard Linux process often isolated further through the use of cgroups and namespaces.
Container Image
A container image, in its simplest definition, is a file that is pulled down from a Registry Server and used locally as a mount point when starting Containers.
Container Host
The container host is the system that runs the containerized processes, often simply called containers.
Container Engine
A container engine is a piece of software that accepts user requests, including command-line options, pulls images, and from the end user’s perspective runs the container. There are many container engines, including docker, RKT, CRI-O, and LXD.
Images Registry
A registry server is essentially a fancy file server that is used to store docker repositories. Typically, the registry server is specified as a normal DNS name and optionally a port number to connect to.
This documentation assumes basic knowledge of Docker and Docker CLI. To learn or refresh on container-related concepts, please refer to the official documentation:
Please note that since most development activities will start on the “docker stack” (docker CLI, docker CE, docker desktop, etc.), most of the time we will refer to docker tooling. There are plenty of alternatives to every component mentioned, for example podman, buildah, buildpacks, and many other technologies that do not come from Docker the company.
The same goes for the container OS: Windows containers are largely outside the scope of this article.
For detailed information about docker architecture, please refer to Docker or Mirantis documentation. Here is a handy diagram explaining high-level docker architecture and its components.
When you start developing containerized workloads, there are a lot of similarities with developing regular software, but also a few key differences. The below diagram provides a simplified view of various stages of containerized workload lifecycle.
Docker CLI has the following syntax:
Syntax: docker <docker-object> <sub-command> <-options> <arguments/commands>
Example: docker container run -it ubuntu
By default, all docker image layers are immutable (read-only). When a container is created using the docker run command, an additional mutable (read-write) layer is created on top of them. This layer exists only for the lifetime of the container and is deleted when the container is removed (with docker rm, or automatically on exit if the container was started with --rm). When a file is modified in a running container, docker first copies it up into the container layer (copy-on-write) and applies the change to the copy. The original files in the image are never changed.
On the machine from which you want to access the docker host, set the variable:
export DOCKER_HOST="tcp://<docker-host-ip>:2375"
Docker default ports:
2375 — unencrypted traffic
2376 — encrypted traffic.
IMPORTANT: This setting is only for testing/playground purposes. It makes the docker host available on the network, and by default there is no authentication.
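sudo groupadd docker                 # create the docker group if it does not exist yet
sudo useradd -G docker <user-name>   # either create a new user that belongs to the group
sudo usermod -aG docker <user-name>  # or add an existing non-root user to it
sudo systemctl restart docker        # restart the daemon; log out and back in so the new group membership applies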
It is highly recommended to use VS Code with a Docker plugin for developing with containers.
Here is a good write-up about how to set up and use the Docker extension with VS Code.
If you are using VS Code with a Docker extension, you can quickly create a Dockerfile stub for your project.
Docker: Add Docker Files to Workspace
To build an image you can use the docker CLI:
docker build --progress=plain -t imagename:tag -f Dockerfile .
or use the VS Code Docker extension to do the same. The --progress=plain flag writes verbose build output to stdout and is enabled by default when using the Docker extension.
When building from a Dockerfile with the legacy builder, each instruction such as RUN, ADD, or COPY runs in an intermediate container and produces an intermediate image that you can start a container from and debug.
The debugging steps differ depending on whether the docker host uses the newer BuildKit build mechanism (available from Docker 18.09 onwards) or the legacy builder. Debugging BuildKit builds is relatively complex, so it is easier to fall back to the legacy builder by setting DOCKER_BUILDKIT=0 before the docker build command. This setting temporarily switches that one build to the legacy builder.
DOCKER_BUILDKIT=0 docker build --rm=false -t wrongimage -f Dockerfile.bad .
Step 17/19 : WORKDIR /app
 ---> Running in 21b793c569f4
 ---> 0d5d0c9d52a3
Step 18/19 : COPY --from=publish /app/publish1 .
COPY failed: stat app/publish1: file does not exist
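With --rm=false, intermediate containers are not removed after the build. The intermediate images can be listed with docker image ls -a, and we can start a shell in the last layer that was built successfully (0d5d0c9d52a3 in the output above):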
docker run -it 0d5d0c9d52a3 sh
Applications running in containers can be debugged directly from an IDE when a launch.json file is present and contains instructions on how to launch and debug a docker container.
It is strongly recommended to use VS Code with the Docker extension to easily add a Dockerfile and debugging settings to the project.
cd into the project directory
code . to open VS Code
Run the command docker: initialize for debugging and follow the wizard
Open the Run and Debug view (Ctrl+Shift+D)
Select Docker .NET Launch
In a multi-stage build, you create an intermediate container — or stage — with all the required tools to compile or produce your final artefacts (i.e., the final executable). Then, you copy only the resulting artefacts to the final image, without additional development dependencies, temporary build files, etc.
A well-crafted multi-stage build includes only the minimal required binaries and dependencies in the final image and does not include build tools or intermediate files. This reduces the attack surface, decreasing vulnerabilities.
It is safer, and it also reduces image size.
Consider the Dockerfile below, which builds a Go API. The use of the multi-stage build is explained in the file comments. Try it yourself!
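The Dockerfile itself is not reproduced in this text, so here is a minimal sketch of what a multi-stage build for a Go API could look like. The directory demo-multistage, the file name Dockerfile.multistage, the image tags (golang:1.22, gcr.io/distroless/static-debian12), the presence of go.mod and go.sum at the project root, and port 8080 are all assumptions made for illustration, not the original file.

```shell
# Write a sketch multi-stage Dockerfile into a hypothetical demo directory.
mkdir -p demo-multistage
cat > demo-multistage/Dockerfile.multistage <<'EOF'
# Stage 1: full Go toolchain, used only to compile the binary.
FROM golang:1.22 AS build
WORKDIR /src
# Copy module files first so dependency downloads are cached between builds
# (assumes go.mod and go.sum exist in the project root).
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a static binary that does not need libc at runtime.
RUN CGO_ENABLED=0 GOOS=linux go build -o /out/api .

# Stage 2: minimal runtime image without compiler, shell, or package manager.
FROM gcr.io/distroless/static-debian12
# Only the compiled artefact is copied over from the build stage.
COPY --from=build /out/api /api
EXPOSE 8080
USER nonroot:nonroot
ENTRYPOINT ["/api"]
EOF

# To build, run this from the root of a Go project (requires Docker):
#   docker build -t go-api:dev -f demo-multistage/Dockerfile.multistage .
echo "wrote demo-multistage/Dockerfile.multistage"
```

Only the second stage ends up in the final image, so the Go toolchain and the source code never ship with it.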
Use the minimal required base container to follow Dockerfile best practices.
Ideally, we would create containers from scratch, but only binaries that are 100% static will work.
Distroless images are a nice alternative. These are designed to contain only the minimal set of libraries required to run Go, Python, or other frameworks.
Container images should be small and contain only components/packages necessary for the containerized workload to work correctly. This is important for two main reasons:
docker-slim comes with many options. It supports slimming down images, scanning Dockerfiles etc. The best way to start with it is to follow steps in demo setup.
Use .dockerignore to exclude unnecessary files from the build context so they do not end up in the image. They might contain confidential information.
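As a rough illustration, a .dockerignore file might look like the sketch below. The directory demo-dockerignore and the specific entries are assumptions; adjust them to whatever your project actually contains.

```shell
# Write a sample .dockerignore into a hypothetical demo directory.
mkdir -p demo-dockerignore
cat > demo-dockerignore/.dockerignore <<'EOF'
# Version control metadata
.git
.gitignore
# Local secrets and environment files (the ** prefix matches at any depth)
.env
**/*.pem
# Dependencies and build output that are recreated inside the image
node_modules
bin/
obj/
# Editor settings and documentation
.vscode
**/*.md
EOF
echo "wrote demo-dockerignore/.dockerignore"
```

In a real project the file lives next to the Dockerfile, at the root of the build context.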
Docker uses BuildKit by default for building images. One of BuildKit’s features is the ability to mount secrets into build steps using RUN --mount=type=secret. This is for scenarios where you need a secret during the image build process, for example credentials for pulling from a git repository.
Here is an example of how to retrieve and use a secret:
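First, export the secret as an environment variable on the build machine:
export SUPERSECRET=secret
In the Dockerfile, mount the secret into the RUN instruction that needs it:
RUN --mount=type=secret,id=supersecret <command that reads /run/secrets/supersecret>
This makes the secret available under /run/secrets/supersecret, but only for the duration of that RUN instruction.
Then build the image and pass the secret in from the environment:
export DOCKER_BUILDKIT=1
docker build --secret id=supersecret,env=SUPERSECRET .
This safely passes the value of the environment variable SUPERSECRET into the build without storing it in the image. Examining the image history or decomposing the layers will not reveal the secret.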
Consider creating separate Dockerfiles for different purposes. For example, you can have a dedicated Dockerfile with testing and scanning tooling preinstalled and use it during the local development phase.
Remember, you can build images from different Dockerfiles by passing the -f flag, for example:
docker build -t my-docker-image:v1.0 -f Dockerfile.test .
The docker-compose specification is a developer-focused standard for defining cloud- and platform-agnostic container-based applications. Instead of running containers directly from the command line using the docker CLI, consider creating a docker-compose.yaml describing all the containers that comprise your application.
Please note that applications described with the docker-compose specification are fully portable, so you can run them locally or in Azure Container Instances.
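To make this concrete, here is a minimal sketch of a docker-compose.yaml for a two-container application. The service names (web, cache), the image tags, the demo-compose directory, and the port numbers are assumptions chosen for illustration only.

```shell
# Write a sample compose file into a hypothetical demo directory.
mkdir -p demo-compose
cat > demo-compose/docker-compose.yaml <<'EOF'
services:
  web:
    image: nginx:1.25
    ports:
      - "8080:80"          # HOST:CONTAINER
    depends_on:
      - cache              # start the cache service before web
    restart: unless-stopped
  cache:
    image: redis:7
    volumes:
      - cache-data:/data   # named volume so data survives container removal
volumes:
  cache-data:
EOF

# To start the application (requires Docker with the compose plugin):
#   docker compose -f demo-compose/docker-compose.yaml up -d
echo "wrote demo-compose/docker-compose.yaml"
```

The whole application is now described in one declarative file that can be versioned alongside the source code.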
If you already have a docker-compose file and need a kick-start with generating Kubernetes YAML files, use kompose. kompose allows for quick conversion from a docker-compose.yaml file to native Kubernetes manifest files.
You can download the Kompose binaries from the home page.
Docker run commands represent the imperative style of interacting with containers. A docker-compose file, on the other hand, is the preferred, declarative style.
Composerize is a neat little tool that can quickly turn a lengthy docker run command into a docker-compose.yaml file. Composerize can generate docker-compose files either from the CLI or from a web-based interface.
Here is an example of converting a docker run command from one of my images:
CPU
Default CPU share per container is 1024
Option 1: If the host has multiple CPUs, it is possible to assign each container a specific CPU.
Option 2: If the host has multiple CPUs, it is possible to restrict how many CPUs a given container can use.
It’s worth noting that container orchestrators (like Kubernetes) provide declarative methods to restrict resources usage per run-time unit (pod in the case of Kubernetes).
Memory
Option 1: Run the container with the --memory=limit flag to restrict its memory use. If a container tries to consume more memory than its limit, the kernel’s out-of-memory (OOM) killer terminates the process. By default the container is also allowed to consume the same amount of swap space as the memory limit, effectively doubling the limit, provided of course that swap is not disabled on the host.
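The CPU and memory options above map onto docker run flags roughly as sketched here. The run_docker helper is a hypothetical dry-run wrapper that is not part of Docker: by default it only prints each command, and it executes them when RUN_FOR_REAL=1 is set. The container names, the nginx image, and the limit values are likewise assumptions.

```shell
# Hypothetical dry-run helper: prints the command unless RUN_FOR_REAL=1 is set.
run_docker() {
  if [ "${RUN_FOR_REAL:-0}" = "1" ]; then
    docker "$@"
  else
    echo "docker $*"
  fi
}

# CPU option 1: pin the container to specific CPUs (cores 0 and 1 here).
run_docker run -d --name pinned --cpuset-cpus=0,1 nginx

# CPU option 2: cap the container at the equivalent of 1.5 CPUs.
run_docker run -d --name capped --cpus=1.5 nginx

# Relative weight under contention: half of the default share of 1024.
run_docker run -d --name low-priority --cpu-shares=512 nginx

# Memory: hard limit of 256 MiB; setting --memory-swap to the same value
# means the container gets no additional swap on top of the limit.
run_docker run -d --name mem-limited --memory=256m --memory-swap=256m nginx
```

Running the block as-is prints the four commands, which makes it easy to review the flags before executing anything.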
Port mapping always goes from HOST to CONTAINER, so -p 8080:80 maps port 8080 on the host to port 80 in the container.
Hint: Prefer using “-p” option with static port when running containers in production.
When using open-source images, it is critical to scan for security vulnerabilities. Fortunately, there are a lot of commercial as well as open-source tools to help with this task.
Using trivy is trivial ;) Running trivy image nginx reveals a list of vulnerabilities with links to CVEs.
In addition to scanning images, trivy can also search for misconfigurations and vulnerabilities in Dockerfiles and other configuration files.
Here is a result of trivy scan over a sample project:
As part of your development process, ensure good linting rules for your Dockerfiles.
A good example is a simple tool called FROM:Latest developed by Replicated.
Below is a screenshot of the tool with recommendations:
Consider installing linting plugins in your editor of choice, as well as running linting as part of your CI process.
Docker and similar tools provide an option for inspecting an image.
docker inspect [image name]
This command displays information about the image in JSON format. The optional --format flag takes a Go template to select specific fields.
You can pipe the output of the command to jq and query the result. For example, if you have an nginx image, you can query its environment variables like so:
docker inspect nginx | jq '.[].Config.Env[]'
This information, however, is rather rudimentary. To inspect the image in more depth, use dive.
Follow the installation instructions for your system. Dive shows details of the image content and the commands used to create the layers.
If you cannot install tools like dive, it is possible to decompose a container image using this simple method.
Container images are just tar files containing other files as layers.
Here is how to extract and save an Nginx image and inspect its content:
docker save nginx > nginx_image.tar
mkdir nginx_image
cd nginx_image
tar -xvf ../nginx_image.tar
tree -C
Each layer corresponds to a command in the Dockerfile. Extracting a layer.tar file will reveal the files and settings of that layer.
Supply chain attacks have recently increased in frequency. Trusted and verifiable source code and traceable software bill of materials are critical to the security and integrity of the whole ecosystem.
You can sign your images using tools from the Sigstore project.
Sigstore is part of the Linux Foundation and defines itself as “A new standard for signing, verifying and protecting software”.
There are many tools under Sigstore’s umbrella, but we are interested in Cosign. Follow the installation steps from the Cosign repo.
Here is how to sign your image and push it to the Docker hub:
cosign generate-key-pair   # generates two files: cosign.key (private key) and cosign.pub (public key)
cosign sign --key cosign.key <dockeruser/image:tag>
Shipping containerized software has become easier and more streamlined due to standardized packaging (image) and runtime (container). CI/CD and systems automation tooling benefits from this greatly.
Nowadays pipelines follow the “X-As Code” movement and are expressed as YAML files and hosted alongside source code files in a git repository.
The exact syntax of those YAML files will vary from provider to provider. Azure DevOps, GitHub, GitLab, etc will have their variations.
Nevertheless, there are a few key components. Here is a sample YAML pipeline file for Azure DevOps with the most important definitions:
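The sample pipeline file is not reproduced in this text, so the sketch below shows what a small Azure DevOps pipeline for building and pushing an image could look like. The demo-pipeline directory, the branch name main, the repository name my-app, and the service connection name my-registry-connection are placeholders that you would replace with your own values.

```shell
# Write a sample pipeline definition into a hypothetical demo directory.
mkdir -p demo-pipeline
cat > demo-pipeline/azure-pipelines.yml <<'EOF'
# Trigger: run the pipeline on every push to the main branch.
trigger:
  - main

# Pool: the Microsoft-hosted Linux agent that executes the steps.
pool:
  vmImage: ubuntu-latest

# Variables: values reused across steps (both are placeholders).
variables:
  imageRepository: my-app
  registryConnection: my-registry-connection

# Steps: build the image from the Dockerfile and push it to the registry.
steps:
  - task: Docker@2
    displayName: Build and push image
    inputs:
      command: buildAndPush
      containerRegistry: $(registryConnection)
      repository: $(imageRepository)
      Dockerfile: Dockerfile
      tags: |
        $(Build.BuildId)
EOF
echo "wrote demo-pipeline/azure-pipelines.yml"
```

In a real repository the file usually sits at the root as azure-pipelines.yml, next to the Dockerfile it builds.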
There is much more to CI/CD pipelines in general; the emphasis here is on incorporating a pipeline into your project from the start.
To increase security consider building images in pipelines using Kaniko or Buildah instead of Docker.
Neither tool depends on a Docker daemon, and both execute each command within a Dockerfile completely in userspace. This enables building container images in environments that can’t easily or securely run a Docker daemon, such as a standard Kubernetes cluster. Whereas Kaniko is more oriented towards building images in a Kubernetes cluster, Buildah works well with only docker images.
Image scanning refers to the process of analyzing the contents and the build process of a container image in order to detect security issues, vulnerabilities or bad practices.
Recommendation: there are three major image scanning tools currently available: Snyk, Sysdig, and Aqua. My recommendation is to use Snyk. For a more detailed comparison, check out this blog.
Follow these best practices when integrating image scanning with your CI/CD pipelines:
For a detailed explanation of how to integrate image scanning using Snyk with Azure Pipelines, for example, please refer to the Snyk documentation.
Nowadays, operations on raw containers (without an orchestrator) happen mostly for simpler workloads or in non-production environments. IoT and edge devices are the exception, but even there Kubernetes is rapidly taking over.
Installing docker engine on a Linux distro is pretty straightforward. Please follow the installation steps from Docker documentation.
Installing docker engine on Windows Server is a bit more difficult. Follow this tutorial to install and configure all prerequisites.
By default only Windows containers will run on Windows Server. Linux containers must be switched on additionally (covered in the documentation above).
Once the docker host is installed, you can use Portainer to interact with it, monitor it, and troubleshoot.
Choose the installation option depending on the environment you are in.
Sample Portainer dashboard
Once installed, docker creates a folder under /var/lib/docker/ where all the containers, images, volumes, and configurations are stored. Kubernetes stores cluster state and related information in etcd, which by default listens on port 2379 for client connections (port 2380 is used for peer communication). Docker Swarm mode keeps its cluster state in a built-in Raft store on the manager nodes.
Since the docker host does not update images automatically, you can use Watchtower to update running containers whenever a new image is pushed to your image registry.
docker run -d \
  --name watchtower \
  -e REPO_USER=username \
  -e REPO_PASS=password \
  -v /var/run/docker.sock:/var/run/docker.sock \
  containrrr/watchtower container_to_watch --debug
Developing containerized workloads nowadays is a primary mode of server-side software development. Whether you are working on a web app, API, batch job or service, chances are that at some point you will add “Dockerfile” to your project.
When this happens, hopefully, you’ve bookmarked this article and will find here inspiration and guidance to do things right from the start.