A non-technical explanation of Docker containers, Kubernetes, and clusters.
If you work in tech, you have likely encountered the terms containers, Kubernetes, and clusters. They may be clear if you’re a software engineer, but if you’re not, they can be intimidating.
While wrapping my own head around all of this, I came up with an analogy that explains these terms and their relationship. However, before we delve into it, let’s look at an overview of a modern data center.
The Modern Data Center
Whether you operate on-premises or in the cloud, a modern data center contains similar key components, which can be represented as layers.
1) Physical layer
All data centers have different physical resources mounted on racks. These could be compute resources with different processing units such as CPUs, GPUs, or TPUs. These could also be storage resources and disks for saving data.
2) Management layer
Physical resources can be grouped into clusters, which are virtual groupings of resources. A common configuration is to have separate development and production clusters. This way, engineers can freely use the development cluster to test new ideas while users work uninterrupted on the production cluster.
The process of managing resource requests and allocations is called orchestration. This is the main function of Kubernetes. It handles incoming workloads to a cluster, assigns appropriate resources, and optimizes utilization.
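If it helps to see the idea in code, here is a minimal sketch of orchestration as a toy Python program. It is not how Kubernetes is actually implemented. The `Machine` class, the `schedule` function, the "first machine that fits" rule, and all the names and numbers are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """A physical resource in the cluster with some CPU and memory to hand out."""
    name: str
    cpu_free: float      # CPU cores not yet allocated
    mem_free_gb: float   # memory (GB) not yet allocated
    workloads: list = field(default_factory=list)


def schedule(machines, workload_name, cpu, mem_gb):
    """Place a workload on the first machine with enough free resources.

    Returns the chosen machine's name, or None if nothing fits.
    """
    for machine in machines:
        if machine.cpu_free >= cpu and machine.mem_free_gb >= mem_gb:
            machine.cpu_free -= cpu
            machine.mem_free_gb -= mem_gb
            machine.workloads.append(workload_name)
            return machine.name
    return None


cluster = [
    Machine("machine-a", cpu_free=2, mem_free_gb=4),
    Machine("machine-b", cpu_free=8, mem_free_gb=32),
]

print(schedule(cluster, "web-frontend", cpu=1, mem_gb=2))     # machine-a
print(schedule(cluster, "model-training", cpu=4, mem_gb=16))  # machine-b
print(schedule(cluster, "huge-job", cpu=16, mem_gb=64))       # None
```

The point is only the shape of the job: a request comes in, the orchestrator looks at what is free, and it either finds a place for the workload or reports that nothing fits.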
3) Application layer
These are the services or pieces of software that run on the infrastructure. They range from one-time computational jobs to continuous services that users can access anytime they want.
To allow multiple services to run simultaneously, a cluster’s resources are divided into smaller units called pods. A pod is a subset of cluster resources grouped to run an application.
Nowadays, almost any piece of software has dependencies, whether it’s a specific OS version, a library, or a utility that it calls. It can’t function if it can’t find these dependencies at runtime. This is where Docker containers save the day. Docker packages your code and all its dependencies into a single unit called a container. Now your software can run on any infrastructure that supports containers.
(Original illustration by Yehia “Yaya” Khoja)
Our analogy of the modern cluster is that it operates like a hotel.
A hotel has many rooms with different amenities. A guest makes a reservation to request a room that matches their needs. At check-in, the guest presents their reservation to the hotel reception. The reception confirms the reservation, makes sure the room is available, and then assigns the guest a room based on their reservation. Finally, the guest proceeds to the room with their luggage, and enjoys their stay.
Now, let’s map the roles in the hotel to the modern cluster:
- Hotel = Data center = Cluster (we’ll assume one cluster in the data center)
- Hotel room = Pod
- Hotel guest = Docker container (we’ll just say container)
- Guest’s luggage = Software files & dependencies (we’ll just say code)
- Reception = Kubernetes (a.k.a. K8s)
- Reservation = Kubernetes spec (we’ll just say spec)
The Grand Cluster Hotel
(Original illustration by Christine Kim)
Imagine a guest (container) wants to stay at The Grand Cluster Hotel. They’ll first need a reservation (spec) that specifies the type and amount of resources needed. The guest is now ready to take their luggage (code) and head to the hotel.
Once there, reception (Kubernetes) checks the reservation (spec), and looks for availability. It assigns an available room (pod) that matches the request. Finally, the guest (container) takes their luggage (code) and heads to the assigned room (pod) where it will be staying (hosted).
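To make the reservation a bit more concrete, here is a rough sketch of what a spec can look like, written as a Python dictionary rather than the YAML file it would normally be. The field names follow the Kubernetes Pod manifest, but the pod name, the image name, and the `check_in` helper are hypothetical, and the real reception does far more than compare two numbers.

```python
# A simplified Pod spec (the reservation). Real specs are usually YAML files;
# the pod name and image below are hypothetical.
reservation = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "grand-cluster-guest"},
    "spec": {
        "containers": [
            {
                "name": "guest",
                "image": "example.com/my-app:1.0",  # the guest plus their luggage
                "resources": {"requests": {"cpu": "500m", "memory": "256Mi"}},
            }
        ]
    },
}


def cpu_cores(quantity):
    """Convert a CPU quantity such as '500m' (millicores) or '2' to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def memory_mib(quantity):
    """Convert a memory quantity given in 'Mi' or 'Gi' to mebibytes."""
    if quantity.endswith("Gi"):
        return int(quantity[:-2]) * 1024
    if quantity.endswith("Mi"):
        return int(quantity[:-2])
    raise ValueError(f"unsupported unit in {quantity!r}")


def check_in(spec, free_cpu, free_mem_mib):
    """Toy reception desk: admit the guest only if the hotel has enough room."""
    containers = spec["spec"]["containers"]
    need_cpu = sum(cpu_cores(c["resources"]["requests"]["cpu"]) for c in containers)
    need_mem = sum(memory_mib(c["resources"]["requests"]["memory"]) for c in containers)
    return need_cpu <= free_cpu and need_mem <= free_mem_mib


print(check_in(reservation, free_cpu=2, free_mem_mib=1024))     # True
print(check_in(reservation, free_cpu=0.25, free_mem_mib=1024))  # False
```

The guest asks for half a CPU core and 256 MiB of memory. With two cores free, reception lets them in. With only a quarter of a core free, it has to turn them away.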
Like all analogies, this one breaks down if you push it too far, so we only covered the simple case of a single cluster in the data center with one application to launch on it. Here are a few things our analogy either left out or captured inaccurately.
- A pod is created on demand. Unlike a hotel, where rooms are pre-built based on certain layouts, a pod is custom created per your spec when you submit your request to the cluster.
- A pod can have multiple containers in it. This actually works with our analogy because you can have multiple guests staying in a room.
- An application can be a suite of services, with each service running in a separate pod. For example, a simple e-commerce application can have an inventory management service, a payment processing service, and a user authentication service.
- A single service can run on multiple pods for high availability. This way, if a pod fails, the service can continue to run on the redundant pod.
- Containers are a general technology. Docker is only one “brand” of containerization, but it is by far the most popular. This is similar to search, which is dominated by Google, but you also have Bing.
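The high-availability point in the list above is easy to picture with a toy example. This is only a sketch under made-up assumptions: the `Service` class, its methods, and the pod names are hypothetical and have nothing to do with the actual Kubernetes API.

```python
class Service:
    """A toy service backed by several identical pods (replicas)."""

    def __init__(self, name, replicas):
        self.name = name
        # Map each pod name to a health flag; every pod starts out healthy.
        self.pods = {f"{name}-pod-{i}": True for i in range(replicas)}

    def mark_failed(self, pod_name):
        """Simulate a pod crashing."""
        self.pods[pod_name] = False

    def handle_request(self):
        """Route a request to the first healthy pod, if there is one."""
        for pod_name, healthy in self.pods.items():
            if healthy:
                return f"served by {pod_name}"
        raise RuntimeError(f"{self.name} is down: no healthy pods")


payments = Service("payments", replicas=2)
print(payments.handle_request())        # served by payments-pod-0
payments.mark_failed("payments-pod-0")
print(payments.handle_request())        # served by payments-pod-1
```

With two pods, losing one still leaves a pod to answer requests. With a single pod, the same failure would take the whole service down.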
I hope you enjoyed this brief explanation. All feedback and suggestions are welcome!