One of the great advantages of containers is portability. In general, the properties of a container's base image greatly affect its size. However, when we compare container images with virtual machine images, containers are the clear winner. This creates the illusion that container size is irrelevant.
Have you ever wondered how large your containers are, and whether they could be made several times smaller? You might ask what the big deal is. Let's find out.
This article discusses the advantages of reducing container size and the possible ways of doing so.
One of the main reasons containers grow large is that we use the default Docker base images. This not only makes containers large but can also duplicate dependencies, once in the container image and again in the application (if the application installs them locally).
Another reason is running the build tools inside the same container that runs the application. This increases the size of the container and installs dependencies that the application might not need at runtime. Some of these container images could be made several times smaller if you are really concerned about their size.
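Before trimming anything, it helps to know how large the local images actually are. The following is a minimal sketch, assuming the Docker CLI is installed and that it prints sizes in the form `5.58MB` or `1.02GB`. The helper names `parse_size` and `list_image_sizes` are hypothetical and chosen only for this illustration.

```python
import re
import subprocess

# Decimal units, assuming the CLI prints sizes such as "5.58MB" or "1.02GB".
_UNITS = {"B": 1, "kB": 10**3, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text):
    """Convert a human-readable size such as '5.58MB' to a number of bytes."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(round(float(match.group(1)) * _UNITS[match.group(2)]))


def list_image_sizes():
    """Return (image, size_in_bytes) pairs for local images, largest first."""
    out = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}\t{{.Size}}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    # Each non-empty output line is "<repository>:<tag><TAB><size>".
    rows = [line.split("\t") for line in out.splitlines() if line.strip()]
    return sorted(
        ((name, parse_size(size)) for name, size in rows),
        key=lambda pair: pair[1],
        reverse=True,
    )
```

Calling `list_image_sizes()` on a machine with Docker available should list the largest images first, which are usually the best candidates for slimming down.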
Using Alpine Linux as the base image leads to smaller images in general, because at roughly 5 MB it is smaller than most distribution base images. Starting with a slimmer base image like Alpine and building the image you want with only the required dependencies keeps the application container minimal in size and keeps its dependencies on your radar.
In addition, when building container images that involve compiling code, you can either run the build in a separate container or run it outside the application container, and then copy the compiled or packaged output into the required directories of the application container. This way the application container carries a minimum of tools (no compilers and the like).
In theory, you can also start from a feature-rich container image and remove the unwanted dependencies. However, this can cause many challenges, especially when updating the image to newer versions for security fixes and other updates, where it can lead to instability.
There are several advantages to reducing the size of containers. The following sections discuss them. Overall, it is important to keep these points in mind if you are building a container cluster.
Generally, smaller container images contain fewer libraries, which reduces the container's attack surface. In addition, building images this way makes it more transparent what is happening inside them.
When we look at the scan logs of container images on Docker Hub, it is evident that larger images are flagged more often, because they contain more software vulnerabilities.
Smaller containers are easier and faster to move around. This improves the performance of the build and deployment processes, since less image data needs to be pulled to the running container cluster. In general, smaller containers also use resources more efficiently, mostly disk space and memory.
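As a rough illustration of the transfer cost, the arithmetic can be sketched as below. The image sizes, link speed, and node count are made-up numbers, the function name `pull_seconds` is hypothetical, and the estimate deliberately ignores layer caching, compression, and parallel pulls.

```python
def pull_seconds(image_bytes, bandwidth_bits_per_s, nodes=1):
    """Estimate seconds to pull an image onto `nodes` hosts, one after another."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    # Bytes are converted to bits, then divided by the link speed.
    return image_bytes * 8 * nodes / bandwidth_bits_per_s


# Hypothetical scenario: 900 MB versus 60 MB image, 100 Mbit/s link, 20 nodes.
large = pull_seconds(900 * 10**6, 100 * 10**6, nodes=20)
small = pull_seconds(60 * 10**6, 100 * 10**6, nodes=20)
print(f"large: {large:.0f} s, small: {small:.0f} s")
# prints "large: 1440 s, small: 96 s"
```

Under these assumptions the smaller image saves more than twenty minutes per rollout, and the gap grows with every additional node or deployment.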
Maintainability is another advantage of small containers. When you start from an Alpine-based container image and install the dependencies yourself, you are in full control when modifying them later, since you know exactly which configurations were changed for your application. Because fewer dependencies and libraries are installed, it is also simpler to manage them and to keep them up to date with operating system patches and the like.
Container size affects the application life cycle in many ways. Keeping containers small improves the security, performance, efficiency, and maintainability of a containerized application. Although small containers are not a must, at least trying to build the application container from a minimal base image like Alpine greatly helps in understanding the details of each container's dependencies. This gives developers confidence in the software running inside, both for investigating potential issues and for optimizing it in the future.