Recently, while working on a workshop titled *Testing Your Pull Request on Kubernetes with GKE and GitHub Actions*, I faced the same issue twice: service A needs service B, but service A starts faster than service B, and the system fails. In this post, I want to describe the context of these issues and how I solved them both with the same tool.

## Waiting in Kubernetes

It might sound strange to wait in Kubernetes: the self-healing nature of the platform is one of its biggest benefits. Let's consider two pods: a Python application and a PostgreSQL database. The application starts very fast and eagerly tries to establish a connection to the database. Meanwhile, the database is still initializing itself with the provided data, so the connection fails and the pod ends up in the `Failed` state.

After a while, Kubernetes checks the application pod's state. Because it failed, Kubernetes terminates it and starts a new pod. At this point, two things can happen: either the database pod isn't ready yet, and it's back to square one, or it is ready, and the application finally connects.

To speed up the process, Kubernetes offers startup probes:

```yaml
startupProbe:
  httpGet:
    path: /health
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
```

With the above probe, Kubernetes checks the pod's status every ten seconds. If the check fails, it waits another ten seconds and tries again. Rinse and repeat up to 30 times, five minutes in total, before the pod definitively fails.

You may have noticed the HTTP `/health` endpoint above. A `Probe` accepts several mutually exclusive handlers; the two most common are `httpGet` and `exec`. The former is suitable for web applications, while the latter covers everything else. It implies we need to know which kind of container the pod contains and how to check its status, provided it even offers a way to do so. I'm no PostgreSQL expert, so I searched for a status check command. The startup probe from the Bitnami Helm Chart looks like the following once rendered:

```yaml
startupProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - -e
      - exec pg_isready -U $PG_USER -h $PG_HOST -p $PG_PORT
```

Note that the above is a simplification, as it happily ignores the database name and SSL configuration.

A properly configured startup probe speeds things up compared to the default situation: you can set a long initial delay, then check in shorter increments.
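As a minimal sketch of such a tuned probe, consider the following; the values are illustrative assumptions, not taken from any real chart:

```yaml
startupProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30  # long head start: skip the window where failure is all but certain
  periodSeconds: 2         # then poll frequently to catch readiness quickly
  failureThreshold: 60     # give up after a further two minutes
```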
Yet, the more diverse the containers, the harder the probes get to configure, as you need to be an expert in each of the underlying containers. It would be beneficial to look for alternatives.

## Wait4x

The alternatives are tools whose sole focus is waiting. A long time ago, I found the wait-for script for this. The idea is straightforward:

> `./wait-for` is a script designed to synchronize services like docker containers. It is `sh` and `alpine` compatible.

Here's how to wait for an HTTP API:

```shell
sh -c './wait-for http://my.api/health -- echo "The api is up! Let's use it"'
```

It got the job done, but at the time, you had to copy the script and manually check for updates. I've checked, and the project now provides a regular container image. wait4x plays the same role, but is available as a versioned container and supports more services to wait for: HTTP, DNS, databases, and message queues. That's my current choice.
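To give an idea of the scope, here are a few sketched invocations; hostnames and credentials are placeholders, and the exact flags should be verified against the project's documentation:

```shell
# Wait for an HTTP endpoint to return a 200 (placeholder URL)
wait4x http https://my.api/health --expect-status-code 200

# Wait for a TCP port to accept connections, for up to a minute
wait4x tcp postgres:5432 --timeout 1m

# Wait for PostgreSQL to accept actual connections (placeholder credentials)
wait4x postgresql 'postgres://user:secret@postgres:5432/app?sslmode=disable'
```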
Whatever tool you use, you can run it inside an init container:

> A Pod can have multiple containers running apps within it, but it can also have one or more init containers, which are run before the app containers are started.
>
> Init containers are regular containers, except:
>
> - Init containers always run to completion.
> - Each init container must complete successfully before the next one starts.

Imagine the following `Pod` that depends on a PostgreSQL `Deployment`:

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    type: app
    app: recommandations
spec:
  containers:
    - name: recommandations
      image: recommandations:latest
      envFrom:
        - configMapRef:
            name: postgres-config
```

The application is Python and starts quite fast. It attempts to connect to the PostgreSQL database. Unfortunately, the database hasn't finished initializing, so the connection fails, and Kubernetes restarts the pod. We can fix it with an `initContainer` that waits:

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    type: app
    app: recommandations
spec:
  initContainers:
    - name: wait-for-postgres
      image: atkrad/wait4x:3.1
      command:
        - wait4x
        - postgresql
        - postgres://$(DATABASE_URL)?sslmode=disable
      envFrom:
        - configMapRef:
            name: postgres-config
  containers:
    - name: recommandations
      image: recommandations:latest
      envFrom:
        - configMapRef:
            name: postgres-config
```

In the above setup, the `initContainer` doesn't finish until the database accepts connections. When it does, it terminates, and the `recommandations` container can start. Kubernetes doesn't need to terminate the `Pod` as in the previous setup! That means fewer logs and potentially fewer alerts.
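The manifests reference a `postgres-config` `ConfigMap` that isn't shown. For completeness, here's a minimal sketch of what it could contain; the key name comes from the manifest above, but the value is a made-up placeholder. Kubernetes expands the `$(DATABASE_URL)` reference in the init container's command with this value:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
data:
  # Everything between the postgres:// scheme and the query string;
  # the credentials and host below are placeholders
  DATABASE_URL: "postgres:secret@postgres:5432/recommandations"
```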
## When Waiting Becomes Mandatory

The above is a slight improvement, but you can do without it. In other cases, waiting becomes mandatory. I experienced it recently when preparing for the workshop mentioned above. The scenario is the following:

1. The pipeline applies a manifest on the Kubernetes side
2. In the next step, it runs the test
3. As the test starts before the application is ready, it fails

We must wait until the backend is ready before we test. Let's use wait4x to wait for the `Pod` to accept requests before we launch the tests:

```yaml
- name: Wait until the application has started
  uses: addnab/docker-run-action@v3                                      #1
  with:
    image: atkrad/wait4x:latest
    run: wait4x http ${{ env.BASE_URL }}/health --expect-status-code 200 #2
```

1. The GitHub Action allows running a container. I could have downloaded the Go binary instead.
2. Wait until the `/health` endpoint returns a `200` response code.
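For reference, the binary-based variant mentioned in callout 1 could look like the following sketch; the release URL and archive name are assumptions based on the project's release naming, so verify them against the releases page:

```yaml
- name: Wait until the application has started
  run: |
    # Download the standalone binary instead of running a container;
    # the URL and archive name are assumptions, check the releases page
    curl -fsSL -o wait4x.tar.gz \
      https://github.com/wait4x/wait4x/releases/latest/download/wait4x-linux-amd64.tar.gz
    tar -xzf wait4x.tar.gz wait4x
    ./wait4x http ${{ env.BASE_URL }}/health --expect-status-code 200
```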
## Conclusion

Kubernetes startup probes are a great way to avoid unnecessary restarts when you start services that depend on each other. The alternative is an external waiting tool configured in an `initContainer`. wait4x can be used in other contexts as well; it's now part of my toolbelt.

**To go further:**

- wait4x
- So you need to wait for some Kubernetes resources?

*Originally published at A Java Geek on April 20th, 2025*