How do StatefulSet and headless Service work in Kubernetes (K8s)?


Before answering your questions, I must add a disclaimer: there is more than one way to skin a cat. Since we are discussing StatefulSets here, note that not every approach suits every stateful application. If you need a single database pod with a single PV, one approach fits; if your API pod needs some shared and some separate PVs, another one fits; and so on.

Persistent storage in apps - Why should I consider deploying Postgres (for example) as a StatefulSet? I can define PVs and a PVC in a Deployment to store the data in a PV.

This holds true if all your pods use the same persistent volume claim across all replicas (and the provisioner allows the volume to be shared that way). If you increase the number of replicas of a Deployment, all of its pods will use the very same PVC. A StatefulSet, on the other hand, has volumeClaimTemplates (as defined in the API documentation), which generate a separate PVC for each replica and so secure a separately provisioned PV for each pod in the set.
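
To make the difference concrete, here is a minimal sketch of a StatefulSet that uses volumeClaimTemplates. The names (postgres, data), the image tag, the password value, and the 1Gi size are assumptions chosen for illustration, not taken from the question. It also assumes that the cluster has a default StorageClass able to provision volumes dynamically and that a headless Service named postgres exists.

```yaml
# Sketch only: all names, the image tag, and the sizes are illustrative assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres          # governing (headless) Service, assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres            # must match spec.selector.matchLabels
    spec:
      containers:
        - name: postgres
          image: postgres:16     # assumed tag
          env:
            - name: POSTGRES_PASSWORD
              value: example     # placeholder; use a Secret in practice
            - name: PGDATA       # keep data in a subdirectory of the mount
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data         # refers to the claim template below
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:          # one PVC is generated per replica from this
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

With replicas: 3, this should produce three claims named data-postgres-0, data-postgres-1 and data-postgres-2, one per pod. Note that this only gives each pod its own storage; the three Postgres instances remain independent databases unless replication is configured separately.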

Now why should I consider Headless Service for StatefulSet apps?

Because of ease of discovery. You don't need to know how many replicas sit behind a headless Service: querying the Service's DNS name returns ALL replicas (caveat: only those that are up and running at that moment). You can do this manually, but then you rely on a different mechanism for counting and keeping tabs on replicas (for example, replicas registering themselves with a master). The pod discovery example with nslookup further down sheds some light on why headless can be a nice idea.

Why is it important to deploy those (or some others) as a StatefulSet?

To my understanding, the very Operators you listed are themselves deployed using a Deployment. They do manage StatefulSets, though, so let's consider ElasticSearch as an example. If it were not deployed as a StatefulSet, you would end up with two pods targeting the same PV (if the provisioner allows it), and that would heavily mess things up. With a StatefulSet, each pod gets its very own persistent volume claim (from the template) and consequently a persistent volume separate from those of the other ElasticSearch pods in the same StatefulSet. This is just the tip of the iceberg, since ElasticSearch is more complex to set up and handle, and operators help with that.

Why/when should I care about StatefulSet and headless Service?

  • Use a StatefulSet in any case where replicated pods need PVs separate from each other (created from a PVC template and automatically provisioned).

  • Use a headless Service in any case where you want to automatically discover all pods under the Service, as opposed to a regular Service, where you get the ClusterIP instead. As an illustration from the above-mentioned example, here is the difference between the DNS entries for a Service (with ClusterIP) and a headless Service (without ClusterIP):

    • Standard service - you will get the clusterIP value:

      kubectl exec zookeeper-0 -- nslookup zookeeper
      Server:  10.0.0.10
      Address: 10.0.0.10#53
      Name:    zookeeper.default.svc.cluster.local
      Address: 10.0.0.213

    • Headless service - you will get the IP of each Pod:

      kubectl exec zookeeper-0 -- nslookup zookeeper
      Server:  10.0.0.10
      Address: 10.0.0.10#53
      Name:    zookeeper.default.svc.cluster.local
      Address: 172.17.0.6
      Name:    zookeeper.default.svc.cluster.local
      Address: 172.17.0.7
      Name:    zookeeper.default.svc.cluster.local
      Address: 172.17.0.8
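
The two lookups above can come from Service definitions that differ in a single field. Here is a minimal sketch of both variants. The app: zookeeper label and the port name are assumptions for illustration, and 2181 is ZooKeeper's default client port. The two manifests share the same name, so they are alternatives: apply one or the other, not both.

```yaml
# Variant A: standard Service. Kubernetes assigns a ClusterIP,
# and DNS returns that single virtual IP.
apiVersion: v1
kind: Service
metadata:
  name: zookeeper
spec:
  selector:
    app: zookeeper        # assumed pod label
  ports:
    - name: client
      port: 2181
---
# Variant B: headless Service. No ClusterIP is allocated,
# and DNS returns the IPs of the matching pods directly.
apiVersion: v1
kind: Service
metadata:
  name: zookeeper
spec:
  clusterIP: None         # this is what makes the Service headless
  selector:
    app: zookeeper        # assumed pod label
  ports:
    - name: client
      port: 2181
```

A StatefulSet would typically reference the headless variant through its spec.serviceName field.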