
How to achieve high availability and load balancing in Kubernetes cluster


If any node fails, how can we achieve high availability?

Creating a Pod directly is not a recommended approach. If the node on which a bare Pod is running crashes, the Pod is not rescheduled and the service becomes inaccessible.

For HA (High Availability), higher-level abstractions like Deployments should be used. A Deployment creates a ReplicaSet, which keeps the desired number of Pod replicas running. If a node on which one of the Pods is running crashes, the ReplicaSet creates a replacement Pod, the scheduler places it on a healthy node, and you get HA. Run more than one replica so that the remaining Pods keep serving requests while the replacement starts.
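As a minimal sketch of such a Deployment, the Python snippet below builds the manifest as a dict and prints it as JSON, which kubectl accepts alongside YAML. The name `web`, the image `nginx:1.25`, the container port 80, and the replica count of 3 are all placeholders chosen for illustration, not values from the question.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 3) -> dict:
    """Build an apps/v1 Deployment manifest as a plain dict.

    All arguments are illustrative placeholders; substitute your own values.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # More than one replica, so losing a node does not take down every Pod.
            "replicas": replicas,
            # The selector must match the labels in the Pod template below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Assumes the container listens on port 80.
                            "ports": [{"containerPort": 80}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(deployment_manifest("web", "nginx:1.25"), indent=2))
```

Saved as, say, `deployment.py`, the output can be piped into the cluster with `python deployment.py | kubectl apply -f -`.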

Also, there should be load balancing so that requests are distributed across the other nodes in the cluster.

Create a Service of type LoadBalancer for the Deployment, and incoming requests will be automatically distributed across the Pods on the different nodes. On a supported cloud provider, an external load balancer is created automatically, and there is typically a charge associated with it.
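A Service of this kind might look roughly like the sketch below, again written as a Python dict printed as JSON for kubectl. It assumes the Deployment's Pods carry the label `app: web` and listen on port 80; the Service name `web-lb` and both port numbers are hypothetical.

```python
import json


def loadbalancer_service(name: str, app_label: str,
                         port: int = 80, target_port: int = 80) -> dict:
    """Build a v1 Service manifest of type LoadBalancer as a plain dict.

    app_label is assumed to match the `app` label on the Deployment's Pods.
    """
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            # Asks the cloud provider to provision an external load balancer.
            "type": "LoadBalancer",
            # Traffic is spread over every ready Pod matching this selector.
            "selector": {"app": app_label},
            "ports": [
                {"protocol": "TCP", "port": port, "targetPort": target_port}
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(loadbalancer_service("web-lb", "web"), indent=2))
```

After applying it, `kubectl get service web-lb` shows the external address once the provider has finished provisioning the load balancer.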

If you don't want to use a Load Balancer, another approach, which is a bit more complicated but also more powerful, is to use Ingress. This will also load balance the requests across multiple nodes.
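For illustration, an Ingress that routes all HTTP traffic for one host to a backend Service could be sketched as follows. The host `example.local`, the Ingress name `web-ingress`, and the backend Service `web` on port 80 are assumptions made up for this example. An Ingress only takes effect if an Ingress controller is running in the cluster, and some clusters also require `spec.ingressClassName`, which is left out here.

```python
import json


def ingress_manifest(name: str, host: str,
                     service_name: str, service_port: int = 80) -> dict:
    """Build a networking.k8s.io/v1 Ingress manifest as a plain dict.

    Routes every path under `host` to the given Service (hypothetical names).
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                # Prefix "/" matches every request path.
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(ingress_manifest("web-ingress", "example.local", "web"), indent=2))
```

The backend Service here can be a plain ClusterIP Service, since the Ingress controller is what receives the external traffic.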

Here is a nice article explaining the difference between a Load Balancer and Ingress.

All of the above questions are addressed, directly or indirectly, in the Kubernetes documentation here.