
Kubernetes single point of failure and load balancing


  1. Is a Kubernetes service a single point of failure, or, since it is backed by multiple pods (with kube-proxy configured on the nodes) and the service is just a virtual layer, can it not be considered a single point of failure?

I think your latter interpretation is correct.
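To make the "virtual layer" concrete, here is a minimal sketch using current API versions (the names, labels, and image tag are assumptions for illustration). A Service is just a cluster IP that kube-proxy programs into the packet rules on every node, forwarding to whichever pods match the selector, so there is no single Service process that can fail:

```yaml
# Hypothetical example: a Service fronting several Tomcat pods.
# The Service is only a virtual IP; kube-proxy on each node forwards
# traffic to any healthy pod matching the selector.
apiVersion: v1
kind: Service
metadata:
  name: tomcat-svc          # name is an assumption for illustration
spec:
  selector:
    app: tomcat             # matches the pod label below
  ports:
    - port: 80              # virtual port on the cluster IP
      targetPort: 8080      # container port on each pod
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat
spec:
  replicas: 3               # multiple pods back the one virtual Service
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
        - name: tomcat
          image: tomcat:8   # image tag is illustrative
          ports:
            - containerPort: 8080
```

If one pod or node dies, kube-proxy on the remaining nodes keeps forwarding to the surviving endpoints.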

  1. The above diagram shows a single Kubernetes cluster. Is this a single point of failure, or should I plan for multiple Kubernetes clusters for a system where I need to support HA with zero downtime?

The k8s cluster is not HA, because the master node is a single point of failure. Important components on the master node include the apiserver and the controller manager; without them you cannot create more pods or services. That said, your already-deployed services should continue to work even if the master node is down.

There is a guide on how to set up the k8s cluster in HA mode, though I haven't tried it personally: http://kubernetes.io/docs/admin/high-availability/. Also there is Ubernetes (WIP), which allows you to federate multiple k8s clusters across cloud providers.
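For reference, with today's tooling the same idea is simpler to set up via kubeadm; a minimal sketch, assuming a TCP load balancer at lb.example.com:6443 in front of the master nodes (the address is an assumption):

```yaml
# Sketch: kubeadm config for an HA control plane. All kubelets and
# clients talk to the load balancer, never to any single master.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "lb.example.com:6443"   # assumed LB address
```

Additional masters then join with `kubeadm join ... --control-plane`, giving you redundant apiservers and controller managers behind the one endpoint.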

  1. The above diagram leverages Kubernetes services, which by default support only L4 load balancing (round robin only).

This is not true; Kubernetes has a beta feature called Ingress, which supports L7 load balancing. See if it helps: http://kubernetes.io/docs/user-guide/ingress/ :)
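A minimal sketch of such an Ingress (the hostname and backend Service name are assumptions; note the resource has been promoted since the beta docs above, so the stable API group shown here is networking.k8s.io/v1):

```yaml
# Hypothetical Ingress: L7 routing by host/path to backend Services.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tomcat-ingress
spec:
  rules:
    - host: app.example.com        # assumed hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: tomcat-svc   # assumed backend Service name
                port:
                  number: 80
```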


Having successfully gone live with an on-premise deployment of 50+ Kubernetes pods (planning to scale to 250+ in the near future), below are the observations my team and I gathered from experience:

  1. Is a Kubernetes service a single point of failure, or, since it is backed by multiple pods (with kube-proxy configured on the nodes) and the service is just a virtual layer, can it not be considered a single point of failure?

It will be a single point of failure if the load balancer (LB) is mapped to a single node IP, since failure of that VM / physical server will bring down the entire application. Hence, point the LB to at least 2 different node IPs.
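For example, exposing the Service as a NodePort gives the external LB a stable port on every node, so it can target two or more node IPs (all names and ports below are assumptions):

```yaml
# Sketch: a NodePort Service opens the same port on every node;
# the external LB lists several node IPs as backends, e.g.
# <node1-ip>:30080 and <node2-ip>:30080, so no single node is
# in the data path.
apiVersion: v1
kind: Service
metadata:
  name: tomcat-svc
spec:
  type: NodePort
  selector:
    app: tomcat
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080   # must fall in the default 30000-32767 range
```

kube-proxy on whichever node receives the traffic then forwards it to a healthy pod, so losing one node only removes one LB backend.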

  1. The above diagram shows a single Kubernetes cluster. Is this a single point of failure, or should I plan for multiple Kubernetes clusters for a system where I need to support HA with zero downtime?

Configure Kubernetes in HA mode, with a load balancer in front of the redundant master nodes.
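The key point is that every client and kubelet should address the control plane through the load balancer's virtual IP rather than any single master; a kubeconfig sketch (the VIP and file paths are assumptions):

```yaml
# Sketch: clients point at the LB VIP that fronts all apiservers,
# so the failure of one master is transparent to them.
apiVersion: v1
kind: Config
clusters:
  - name: ha-cluster
    cluster:
      server: https://10.0.0.100:6443   # LB VIP, not a single master IP
      certificate-authority: /etc/kubernetes/pki/ca.crt
contexts:
  - name: admin@ha-cluster
    context:
      cluster: ha-cluster
      user: admin
current-context: admin@ha-cluster
users:
  - name: admin
    user:
      client-certificate: /etc/kubernetes/pki/admin.crt
      client-key: /etc/kubernetes/pki/admin.key
```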

  1. The above diagram leverages Kubernetes services, which by default support only L4 load balancing (round robin only). Hence, if a Tomcat server is heavily loaded, round robin will not distribute load evenly based on usage. How do I achieve load distribution based on system resource consumption, usage, or number of open connections in the above topology?

Yes, only round-robin load balancing is supported as of now. Ingress is in beta and was not ready for production when I last checked. NGINX Plus can be used to load balance, bypassing the Kubernetes load balancing; using the Kubernetes APIs it can be configured so that the addition or removal of Tomcat pods is reflected in NGINX Plus at runtime, without any downtime. (I have not tried this yet, but might consider it in the future if the current setup throws up any challenges; a sketch of the headless-Service piece follows the reference below.)

Refer: https://www.nginx.com/blog/load-balancing-kubernetes-services-nginx-plus/
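The blog's approach starts from a headless Service; a minimal sketch (names are assumptions). Because clusterIP is None, cluster DNS returns the individual pod IPs rather than a virtual IP, so NGINX Plus can re-resolve that name and apply its own algorithms (least connections, least time, etc.) instead of kube-proxy's round robin:

```yaml
# Sketch: a headless Service publishes the Tomcat pod IPs directly
# in DNS, letting an external L7 balancer pick endpoints itself.
apiVersion: v1
kind: Service
metadata:
  name: tomcat-headless
spec:
  clusterIP: None      # headless: DNS returns all pod IPs, no virtual IP
  selector:
    app: tomcat
  ports:
    - port: 8080
```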