Managing the health and well being of multiple pods with dependencies


If these are so tightly dependent on each other, I would consider these options:

a) Rearchitect your system to be more resilient to failure, so that it tolerates a pod being temporarily unavailable.

b) Put all parts into one pod as separate containers, making the atomic design more explicit.
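As a rough illustration of option b), a single pod can declare several containers. This is only a sketch: the pod name, container names, images and port below are all made up for the example and would need to be replaced with your own.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tightly-coupled          # hypothetical pod name
spec:
  containers:
  - name: api                    # hypothetical container
    image: example/api:1.0       # hypothetical image
    ports:
    - containerPort: 8080        # assumed port
  - name: worker                 # hypothetical container
    image: example/worker:1.0    # hypothetical image
    # Containers in the same pod share a network namespace,
    # so the worker can reach the api at localhost:8080.
```

The containers are scheduled onto the same node and are created and deleted together, but each container is still restarted individually if it fails.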

If these don't fit your needs, you can use the Kubernetes API to create a program that automates the task of restarting all dependent parts. There are client libraries for multiple languages and integration is quite easy. The next step would be a custom resource definition (CRD) so you can manage your own system using an extension to the Kubernetes API.
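To give an idea of what the CRD step could look like, here is a minimal sketch of a CustomResourceDefinition. The group `example.com`, the kind `ServiceGroup`, the version `v1alpha1` and the `members` field are all hypothetical names chosen for this example, not something Kubernetes provides.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: servicegroups.example.com   # must be <plural>.<group>
spec:
  group: example.com                # hypothetical API group
  scope: Namespaced
  names:
    plural: servicegroups
    singular: servicegroup
    kind: ServiceGroup              # hypothetical kind
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              members:              # hypothetical field: workloads that must restart together
                type: array
                items:
                  type: string
```

The CRD by itself only lets the API server store `ServiceGroup` objects. The restart logic would still have to live in your own program (a controller) that watches these objects and acts on them.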


The first thing to do is make sure that the pods start in the correct sequence. This can be done using initContainers, like this:

    spec:
      initContainers:
      - name: waitfor
        image: jwilder/dockerize
        args:
        - -wait
        - "http://config-srv/actuator/health"
        - -wait
        - "http://registry-srv/actuator/health"
        - -wait
        - "http://rabbitmq:15672"
        - -timeout
        - 600s

Here the pod's application containers will not start until all the services in the list respond over HTTP.

Next, you may want to define a liveness probe on your container that periodically runs curl against the same services:

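    spec:
      containers:
      - name: app    # placeholder name; livenessProbe is set per container
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            # -f makes curl exit non-zero on HTTP error responses;
            # the container image must include /bin/sh and curl
            - curl -f http://config-srv/actuator/health &&
              curl -f http://registry-srv/actuator/health &&
              curl -f http://rabbitmq:15672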

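Now, if any of those services fails, your container will fail its liveness probe and be restarted. Keep in mind that a liveness failure restarts only the container, not the whole pod, so the init container does not run again. The container will keep failing the probe and being restarted (with back-off) until the services are back online, unless the application itself waits for them on startup.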
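If a short outage of a dependency should not trigger a restart right away, the standard probe timing fields can be tuned. The fragment below is a sketch only: the container name, the image and every number are assumptions to adapt, and only one of the health URLs is checked to keep it short.

```yaml
spec:
  containers:
  - name: app                     # hypothetical container name
    image: example/app:latest     # hypothetical image; must include /bin/sh and curl
    livenessProbe:
      exec:
        command:
        - /bin/sh
        - -c
        # -f: fail on HTTP errors, -sS: quiet but still print errors,
        # --max-time: give up on a hanging request after 3 seconds
        - curl -fsS --max-time 3 http://config-srv/actuator/health
      initialDelaySeconds: 30     # wait before the first check
      periodSeconds: 15           # run the check every 15 seconds
      timeoutSeconds: 5           # a slower check counts as a failure
      failureThreshold: 4         # restart only after 4 consecutive failures
```

With these example values the container is restarted only after the dependency has been failing the check for roughly a minute.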

That's just an example of how it can be done. In your case the checks may of course be different.