Unable to deploy EFK stack on Kubernetes (using Kubespray)
Is my configuration correct? Must I use a PV + StorageClass + volumeClaimTemplates? Thank you in advance.
Apart from what @Arghya Sadhu already suggested in his answer, I'd like to highlight one more thing in your current setup.
If you're ok with the fact that your Elasticsearch Pods will be scheduled only on one particular node (in your case your master node), you can still use the `local` volume type. Don't confuse it, however, with `hostPath`. I noticed that your PV definition uses the `hostPath` key, so chances are you're not completely aware of the differences between these two concepts. Although they are quite similar, the `local` type has greater capabilities and some undeniable advantages over `hostPath`.
As you can read in the documentation:

> A local volume represents a mounted local storage device such as a disk, partition or directory.
So it means that apart from a specific directory you're also able to mount a local disk or partition (`/dev/sdb`, `/dev/sdb5` etc.). It can be e.g. an LVM partition with a strictly defined capacity. Keep in mind that when it comes to mounting a local directory you are not able to enforce the capacity that can actually be used, so even if you define, let's say, `5Gi`, logs can still be written to the directory after this value is exceeded. That's not the case with a logical volume, as you're able to define its capacity and make sure it won't use more disk space than you gave it.
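As an illustration, such a PV could point at a mount point where a logical volume of fixed size is mounted, instead of at an ordinary directory. This is just a sketch — the device path, mount point and node name below are hypothetical:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-data-pv
spec:
  capacity:
    storage: 5Gi            # matches the real size of the logical volume
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  storageClassName: local-storage
  local:
    path: /mnt/disks/es-data   # hypothetical mount point of e.g. /dev/sdb5
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - your-master-node-name
```

Because the underlying filesystem is only 5Gi, a Pod writing to this volume physically cannot exceed that capacity, unlike a plain directory on the root filesystem.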
The second difference is that:

> Compared to `hostPath` volumes, `local` volumes can be used in a durable and portable manner without manually scheduling Pods to nodes, as the system is aware of the volume's node constraints by looking at the node affinity on the PersistentVolume.
In this case it is the PersistentVolume where you define your node affinity, so any Pod (it can be a Pod managed by your StatefulSet) which subsequently uses the `local-storage` storage class and the corresponding PersistentVolume will be automatically scheduled on the right node.
As you can read further, `nodeAffinity` is actually a required field in such a PV:

> PersistentVolume `nodeAffinity` is required when using local volumes. It enables the Kubernetes scheduler to correctly schedule Pods using local volumes to the correct node.
As far as I understand, your Kubernetes cluster is set up locally/on-premise. In this case NFS could be the right choice. If you used some cloud environment, you could use persistent storage offered by your particular cloud provider, e.g. `GCEPersistentDisk` or `AWSElasticBlockStore`. The full list of persistent volume types currently supported by Kubernetes can be found here.
So again, if you're concerned about node-level redundancy in your StatefulSet and you would like your 2 Elasticsearch Pods to always be scheduled on different nodes, then, as @Arghya Sadhu already suggested, use NFS or some other non-local storage. However, if you're not concerned about node-level redundancy and you're totally ok with the fact that both your Elasticsearch Pods run on the same node (the master node in your case), please follow me :)
As @Arghya Sadhu rightly pointed out:

> Even if a PV which is already bound to a PVC have spare capacity it can not be again bound to another PVC because it's one to one mapping between PV and PVC.

Although it's always a one-to-one mapping between a PV and a PVC, that doesn't mean you cannot use a single PVC in many Pods.
Note that in your StatefulSet example you used volumeClaimTemplates, which basically means that each time a new Pod managed by your StatefulSet is created, a new corresponding PersistentVolumeClaim is also created based on this template. So if you have e.g. a `10Gi` PersistentVolume defined, then no matter whether your claims request all `10Gi` or only half of it, only the first PVC will be successfully bound to your PV.
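To illustrate, a volumeClaimTemplates fragment like the sketch below (names are hypothetical) makes the StatefulSet create one claim per replica, named `<template-name>-<statefulset-name>-<ordinal>`, e.g. `data-web-0` and `data-web-1`. With only a single pre-created PV, the second claim stays `Pending` forever:

```yaml
# Hypothetical fragment of a StatefulSet spec:
# each replica gets its own PVC (data-web-0, data-web-1, ...),
# so a single pre-created PV can satisfy only one of them.
volumeClaimTemplates:
- metadata:
    name: data
  spec:
    accessModes:
    - ReadWriteOnce
    storageClassName: local-storage
    resources:
      requests:
        storage: 5Gi
```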
But instead of using volumeClaimTemplates and creating a separate PVC for every stateful Pod, you can make them all use a single, manually defined PVC. Please take a look at the following example:
The first thing we need is a storage class. It looks quite similar to the one in your example:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
```
The first difference between this setup and yours is in the PV definition. Instead of `hostPath` we're using a `local` volume here:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /var/tmp/test ### path on your master node
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - your-master-node-name
```
Note that apart from defining the `local` path, we also defined a `nodeAffinity` rule that makes sure all Pods which get this particular PV will be automatically scheduled on our master node.
Then we have our manually applied PVC:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi
  storageClassName: local-storage
```
This PVC can now be used by all (in your example 2) Pods managed by the StatefulSet:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 2 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: mypd
          mountPath: /usr/share/nginx/html
      volumes:
      - name: mypd
        persistentVolumeClaim:
          claimName: myclaim
```
Note that in the above example we no longer use volumeClaimTemplates but a single PersistentVolumeClaim which can be used by all our Pods. The Pods are still unique, as they are managed by a StatefulSet, but instead of using unique PVCs they share a common one. Thanks to this approach both Pods can write logs to a single volume at the same time.
In my example I used an nginx server to make it as easy as possible for anyone who wants to try it out quickly, but I believe you can easily adjust it to your needs.