Kubernetes can be complex to use, with many potential issues affecting the integrity of your code. Troubleshooting Kubernetes can also be challenging. For example, you might easily identify issues such as an unavailable container cluster or unresponsive pod. However, it might be harder to determine the cause and resolve the issue.
This article outlines some common troubleshooting scenarios in Kubernetes and how you can address them.
The following are the most common coding errors in Kubernetes.
Exit Code 1
This error code indicates that the termination of a container was the result of an invalid reference or application error:
Application errors: these range from simple programming errors in the code the container runs (e.g., “divide by zero”) to more advanced errors related to the runtime environment (e.g., Java or Python).
Invalid references: these occur when a file referred to by the image specification is not present in the relevant container image.
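As a minimal illustration of the application-error case, the Python sketch below runs a child process that hits an uncaught "divide by zero" and reads back its exit status. It assumes a standard CPython interpreter, which exits with status 1 on an uncaught exception. The helper name run_and_get_exit_code is made up for this example and is not part of Kubernetes or Docker.

```python
import subprocess
import sys


def run_and_get_exit_code(source: str) -> int:
    """Run a Python snippet in a child process and return its exit code."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        stderr=subprocess.DEVNULL,  # hide the traceback; only the status matters here
    )
    return completed.returncode


if __name__ == "__main__":
    # An uncaught ZeroDivisionError terminates the child with a non-zero status.
    print(run_and_get_exit_code("print(1 / 0)"))
```

A container whose main process ends this way reports the same exit status, which is why the container logs are usually the first place to look.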
Solution:
If you encounter Exit Code 1, implement the following steps:
Exit Code 125
This error code indicates that the container failed to run. It occurs when Kubernetes invokes a command in the system shell and the command fails to execute properly. For example, you might invoke the docker run command, but the command itself fails. Common causes for Exit Code 125 include:
An undefined flag passed to the command, for example docker run --abcd.
Solution:
If your container was terminated with Exit Code 125, use the following steps:
Try the docker start command instead of docker run in Docker.
Exit Code 126
This error code indicates a failure to invoke the command in your container specification. Typical causes of command invocation errors include missing dependencies and flaws in the continuous integration script running the container.
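To see what a command invocation failure looks like outside a cluster, the Python sketch below creates a script without execute permission and asks the shell to run it. It assumes a POSIX sh is available on the PATH. POSIX shells report status 126 when a command is found but cannot be executed. The function name and the entrypoint.sh file name are hypothetical.

```python
import os
import shlex
import stat
import subprocess
import tempfile


def exit_code_for_non_executable_script() -> int:
    """Create a script without execute permission, try to run it, and
    return the exit code reported by the shell."""
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "entrypoint.sh")  # hypothetical name
        with open(script, "w") as handle:
            handle.write("#!/bin/sh\necho hello\n")
        # Read/write only: the file exists but cannot be executed.
        os.chmod(script, stat.S_IRUSR | stat.S_IWUSR)
        completed = subprocess.run(
            ["sh", "-c", shlex.quote(script)],
            stderr=subprocess.DEVNULL,  # hide the "Permission denied" message
        )
        return completed.returncode


if __name__ == "__main__":
    print(exit_code_for_non_executable_script())
```

If a container shows the same status, checking the permissions of the entrypoint inside the image is a reasonable first step.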
Solution:
If your container terminates with Exit Code 126, implement the following steps:
Another category of errors affects Kubernetes PersistentVolumeClaims (PVCs), which are complex mechanisms prone to hard-to-identify errors. A PVC enables a pod to mount a persistent volume.
Different PVC issues can occur at various stages of the persistent volume lifecycle. Examples of common errors in this category:
DaemonSets are considered unhealthy when they don’t have exactly one pod per node. They are often unhealthy because of pending pods or pods stuck in crash loops. DaemonSet errors often originate in the nodes scheduled to run the pods.
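The health rule above can be sketched as a small check over the status block of a DaemonSet, such as the one returned by kubectl get daemonset with JSON output. This is a simplified reading of "healthy", not an official definition. The status field names are the ones Kubernetes reports for DaemonSets, while the function name daemonset_health_problems is hypothetical.

```python
from typing import List


def daemonset_health_problems(status: dict) -> List[str]:
    """Return a list of problems found in a DaemonSet status dict.

    An empty list means every desired pod is ready and no pod runs
    on a node where it should not.
    """
    desired = status.get("desiredNumberScheduled", 0)
    ready = status.get("numberReady", 0)
    misscheduled = status.get("numberMisscheduled", 0)

    problems = []
    if ready < desired:
        problems.append(
            f"{desired - ready} of {desired} desired pod(s) are not ready"
        )
    if misscheduled > 0:
        problems.append(
            f"{misscheduled} pod(s) run on nodes where they should not"
        )
    return problems


if __name__ == "__main__":
    # Made-up status values for illustration only.
    print(daemonset_health_problems(
        {"desiredNumberScheduled": 3, "numberReady": 2, "numberMisscheduled": 0}
    ))
```

A non-empty result only tells you that something is wrong. The pods and nodes involved still need to be inspected to find out why.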
A pod may experience a crash loop for various reasons, such as a lack of resources. Check the specification to identify resources that you can increase. For example, raising the memory or CPU requests and limits may enable pods to run for longer. You can check a pod’s logs to troubleshoot it fully. If there is no apparent issue with resource usage, check the pod’s command. If the container terminates before it is supposed to, check the image used in the specification to verify it is correct.
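When you triage a crash loop, it helps to see at a glance which containers restarted and how they last terminated. The sketch below works on a pod object already loaded as a Python dict, for example from the JSON output of kubectl get pod. The fields restartCount and lastState.terminated come from the pod status. The function name summarize_restarts and the sample data are made up for illustration.

```python
from typing import List


def summarize_restarts(pod: dict) -> List[str]:
    """List containers that restarted, with their last exit code and reason."""
    lines = []
    for container in pod.get("status", {}).get("containerStatuses", []):
        restarts = container.get("restartCount", 0)
        terminated = container.get("lastState", {}).get("terminated")
        if restarts == 0 or not terminated:
            continue  # never restarted, or no record of the last termination
        lines.append(
            f"{container.get('name', '<unknown>')}: restarts={restarts} "
            f"exit_code={terminated.get('exitCode')} "
            f"reason={terminated.get('reason')}"
        )
    return lines


if __name__ == "__main__":
    # Made-up pod status for illustration only.
    sample_pod = {
        "status": {
            "containerStatuses": [
                {
                    "name": "app",
                    "restartCount": 5,
                    "lastState": {"terminated": {"exitCode": 1, "reason": "Error"}},
                },
                {"name": "sidecar", "restartCount": 0, "lastState": {}},
            ]
        }
    }
    for line in summarize_restarts(sample_pod):
        print(line)
```

The exit code and reason point you toward the next check: resource settings, the pod’s command, or the image.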
If one or multiple pods in a DaemonSet are pending, this may indicate that there are insufficient resources for scheduling a pod on every node. You can use the following steps to resolve this issue:
You can prevent DaemonSets from running on specific nodes by modifying the taints of each node or tolerations of a DaemonSet. This approach helps prevent DaemonSets from scheduling pods to specialized nodes that might not have the required resources.
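To reason about which nodes a DaemonSet will skip, you can model how tolerations are matched against taints. The sketch below follows the matching rules as I understand them from the Kubernetes documentation: an empty effect or key on a toleration matches any effect or key, the Exists operator ignores the value, and Equal (the default) compares values. It treats only the NoSchedule and NoExecute effects as blocking. The function names are hypothetical and the code does not call the Kubernetes API.

```python
from typing import List

# Effects that keep a pod off a node unless tolerated.
# PreferNoSchedule is only a soft preference, so it is not listed.
BLOCKING_EFFECTS = {"NoSchedule", "NoExecute"}


def toleration_matches(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration tolerates a single taint."""
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False
    key = toleration.get("key", "")
    if key and key != taint.get("key"):
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        return True
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


def untolerated_taints(tolerations: List[dict], taints: List[dict]) -> List[dict]:
    """Return the blocking taints on a node that no toleration covers."""
    return [
        taint
        for taint in taints
        if taint.get("effect") in BLOCKING_EFFECTS
        and not any(toleration_matches(t, taint) for t in tolerations)
    ]
```

If untolerated_taints returns an empty list for a node, the node’s taints do not prevent the DaemonSet’s pods from being scheduled there. Any taint it returns is one you could add to a specialized node to keep the DaemonSet away from it.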
If you don’t require DaemonSet functionality (i.e., one pod per node), you might use a Deployment instead. This option offers greater flexibility to determine the number of pods and their location.
In this article, I covered the most common Kubernetes coding errors and what you can do about them:
Exit Code 1—application issues and invalid references stemming from an error in the image specification or an issue in an application running in a container.
I hope this will be useful as you improve the quality and reliability of your Kubernetes clusters.