
How to get metrics of bunches of short-lived Kubernetes jobs


As you wrote, you have already used Prometheus, Pushgateway, metrics-server, and queries against /api/v1/nodes/{nodeName}/proxy/metrics/cadvisor. If those don't satisfy you, a newer approach that I recommend for monitoring and saving cluster performance metrics is Litmus.
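For reference, the cadvisor endpoint mentioned above returns metrics in the Prometheus text format, which is easy to post-process yourself. Below is a minimal sketch in Python, assuming you have saved the output of `kubectl get --raw /api/v1/nodes/<nodeName>/proxy/metrics/cadvisor` as text. The `parse_samples` helper and the `SAMPLE` data are made up for illustration, and label names such as `pod` can differ between Kubernetes versions.

```python
import re

# One sample line looks like: name{label="value",...} 123.0 [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?'
    r'\s+(?P<value>\S+)(?:\s+(?P<ts>\d+))?$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text, metric):
    """Return (labels, value) pairs for one metric name (hypothetical helper)."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m is None or m.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        samples.append((labels, float(m.group("value"))))
    return samples


# Illustrative input only, not real cluster output.
SAMPLE = """\
# TYPE container_memory_usage_bytes gauge
container_memory_usage_bytes{namespace="default",pod="job-a"} 1.048576e+06 1700000000000
container_memory_usage_bytes{namespace="default",pod="job-b"} 2097152 1700000000000
container_cpu_usage_seconds_total{namespace="default",pod="job-a"} 0.5 1700000000000
"""

for labels, value in parse_samples(SAMPLE, "container_memory_usage_bytes"):
    print(labels.get("pod"), value)
```

For short-lived jobs you would have to poll this endpoint often enough to catch each pod while it is still running, which is the main limitation of this approach.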

Prometheus is the most common tool, and a fairly complex one, that most engineers are likely to use. Litmus is a relatively new tool focused on workload testing: the metrics are saved, and you can store them for as long as you want.

You can find more information here: litmus.

Useful article: litmus-openebs. It describes how to get metrics, and not only ones like memory usage.

Then you can generate charts with, e.g., gnuplot.
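As a rough sketch of the charting step, assuming you have already collected (seconds, value) pairs from whichever tool you use: gnuplot reads plain whitespace-separated columns and ignores lines starting with `#`. The `format_gnuplot_data` helper, the file name `mem.dat`, and the sample points are hypothetical.

```python
def format_gnuplot_data(points):
    """Render (seconds, value) pairs as a two-column gnuplot data file."""
    lines = ["# seconds memory_bytes"]  # gnuplot skips '#' comment lines
    for seconds, value in sorted(points):
        lines.append(f"{seconds} {value}")
    return "\n".join(lines) + "\n"


# Made-up points for illustration only.
points = [(10, 2.0), (0, 1.0), (5, 1.5)]

with open("mem.dat", "w") as fh:
    fh.write(format_gnuplot_data(points))
```

The resulting file can then be plotted in gnuplot with `plot "mem.dat" using 1:2 with lines`.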