
Monitoring pod resource usage on Kubernetes nodes


This is a known issue: there is still an open GitHub issue in which the community is asking the developers to add a command that would show total pod/container CPU and memory usage. Please check this link, as the community has provided some ideas and workarounds there that look like they could be useful for your case.

Did you use the proper metrics and still weren't able to see the required information? Here is a list of pod metrics; I think some of them would be useful for your use case.
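For instance, if raw per-pod usage numbers are enough, you can read them straight from the kubelet's Summary API (a minimal sketch; $NODE is an illustrative placeholder for one of your node names, and the jq filter just picks out a few fields):

# per-pod CPU/memory usage reported by the kubelet Summary API
kubectl get --raw "/api/v1/nodes/$NODE/proxy/stats/summary" | jq '.pods[] | {name: .podRef.name, cpuNanoCores: .cpu.usageNanoCores, memoryWorkingSetBytes: .memory.workingSetBytes}'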

Even though there is no fully functional solution to this issue, thanks to the community and some other resources there are a couple of ways of achieving your goal. As advised in this article:

kubectl get nodes --no-headers | awk '{print $1}' | xargs -I {} sh -c 'echo {}; kubectl describe node {} | grep Allocated -A 5 | grep -ve Event -ve Allocated -ve percent -ve -- ; echo'
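Note that this command shows the resources allocated (requested/limited) on each node, not actual consumption. If the metrics-server add-on is installed in your cluster, you can check actual usage with kubectl top:

# actual CPU/memory usage per node (requires metrics-server)
kubectl top nodes
# actual CPU/memory usage per pod
kubectl top pods --all-namespaces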

The author of this article also recommends CoScale. I haven't used it, but it seems worth a try if other solutions fail.

Another point worth considering: you may never be fully in control if your developers keep requesting far more resources than they need. The solution recommended by Nicola Ben below would help you mitigate issues like this.


I ended up writing my own Prometheus exporter for this purpose. While node-exporter provides usage statistics and kube-state-metrics exposes metrics about your Kubernetes resource objects, it is not easy to combine and aggregate these metrics so that they provide valuable information for the described use case.
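To illustrate the kind of aggregation involved, here is a minimal sketch that sums the memory requested by all pods per node through Prometheus's HTTP API (the prometheus:9090 address and the metric/label names are assumptions based on kube-state-metrics v2.x and may differ in your setup):

# sum of per-pod memory requests, grouped by node, as exposed by kube-state-metrics
curl -s 'http://prometheus:9090/api/v1/query' --data-urlencode 'query=sum by (node) (kube_pod_container_resource_requests{resource="memory"})'

Joining the result with node-exporter's node_memory_* series additionally requires matching the instance label to a node name, which is the kind of glue you otherwise have to write yourself.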

With Kube Eagle (https://github.com/google-cloud-tools/kube-eagle/) you can easily create such a dashboard (https://grafana.com/dashboards/9871):

[Screenshot: Grafana dashboard for Kubernetes resource monitoring]

I also wrote a Medium article about how this has helped me save lots of hardware resources: https://medium.com/@martin.schneppenheim/utilizing-and-monitoring-kubernetes-cluster-resources-more-effectively-using-this-tool-df4c68ec2053


If you can, I suggest you use LimitRange and ResourceQuota resources, for example:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: happy-developer-quota
spec:
  hard:
    requests.cpu: 400m
    requests.memory: 200Mi
    limits.cpu: 600m
    limits.memory: 500Mi

For LimitRange:

apiVersion: v1
kind: LimitRange
metadata:
  name: happy-developer-limit
spec:
  limits:
  - default:
      cpu: 600m
      memory: 200Mi
    defaultRequest:
      cpu: 100m
      memory: 100Mi
    max:
      cpu: 1000m
      memory: 500Mi
    type: Container

This prevents people from creating super tiny or super large containers inside the default namespace.
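
To apply and verify both objects (the file names here are illustrative):

# apply the quota and limit range to the default namespace
kubectl apply -f quota.yaml -f limits.yaml --namespace default
# confirm the enforced values
kubectl describe quota happy-developer-quota --namespace default
kubectl describe limitrange happy-developer-limit --namespace default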