
Kubernetes HPA based on JVM Heap memory


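Scaling Java applications in Kubernetes is a bit tricky. The HPA only looks at the memory the container uses as seen by the system, not at the heap usage inside the JVM, and, as pointed out, the JVM generally does not release committed heap space back to the operating system (at least not immediately).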

There are two main approaches one could take to solve this:

1. Tune JVM parameters so that the committed heap follows the used heap more closely

Depending on which JVM and GC are in use, the tuning options may differ slightly, but the most important ones are:

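  • MaxHeapFreeRatio - The maximum percentage of the committed heap that is allowed to be free after a GC before the heap is shrunk
  • GCTimeRatio - The target ratio between application time and GC time; with a value of N the goal is to spend at most 1/(1+N) of the time in GC (impacts performance)
  • AdaptiveSizePolicyWeight - How to weigh older vs newer GC runs when calculating the new heap size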

Giving exact values for these is not easy; it is a compromise between releasing memory quickly and application performance. The best settings will depend on the load characteristics of the application.
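
To get a feel for what the first two options mean in numbers, here is a small back-of-the-envelope sketch. It assumes a simplified model in which the JVM shrinks the heap after a GC until at most MaxHeapFreeRatio percent of it is free, and it ignores the -Xms/-Xmx bounds and any collector-specific behaviour. The function names are made up for illustration.

```python
def max_committed_mib(used_mib: float, max_heap_free_ratio: int) -> float:
    """Upper bound on committed heap after a GC in this simplified model.

    If at most `max_heap_free_ratio` percent of the committed heap may be
    free, then used >= committed * (100 - ratio) / 100, which gives
    committed <= used * 100 / (100 - ratio).
    """
    if not 0 <= max_heap_free_ratio < 100:
        raise ValueError("ratio must be in the range 0..99 for this model")
    return used_mib * 100 / (100 - max_heap_free_ratio)


def gc_time_goal(gc_time_ratio: int) -> float:
    """Fraction of total time the JVM aims to spend in GC: 1 / (1 + N)."""
    return 1 / (1 + gc_time_ratio)


if __name__ == "__main__":
    # 400 MiB of live data: a lower ratio keeps committed heap closer to used heap.
    for ratio in (70, 40, 20):
        print(ratio, round(max_committed_mib(400, ratio), 1))
    # A lower GCTimeRatio allows more time in GC, so memory can be reclaimed sooner.
    for n in (99, 19, 4):
        print(n, gc_time_goal(n))
```

In this model, 400 MiB of used heap with MaxHeapFreeRatio=20 caps the committed heap at 500 MiB, while a ratio of 70 allows roughly 1333 MiB. That gap is what the HPA would see as "memory in use" even though the application does not need it.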

Patrick Dillon has written an article, published by Red Hat, called "Scaling Java containers" that dives deep into this subject.

2. Custom scaling logic

Instead of using the HPA, you could create your own scaling logic and deploy it into Kubernetes as a job that runs periodically and does the following:

  1. Check the heap usage in all pods (for example by running jstat inside the pod)
  2. Scale out new pods if the max threshold is reached
  3. Scale in pods if the min threshold is reached

This approach has the benefit of looking at the actual heap usage, but requires a custom component.
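
The three steps above could be sketched roughly as follows. This is a minimal illustration, not a finished component: the function names, the 0.8/0.4 thresholds, the replica limits, and the choice to average heap usage across pods are all assumptions. It only covers parsing the text printed by `jstat -gc` and deciding on a replica count; collecting the output from each pod and applying the result are left out.

```python
# Columns of `jstat -gc` (values in KB): survivor, eden and old generation.
USED_COLUMNS = ("S0U", "S1U", "EU", "OU")
COMMITTED_COLUMNS = ("S0C", "S1C", "EC", "OC")


def heap_usage_from_jstat(output: str) -> float:
    """Return used / committed heap from the output of `jstat -gc <pid>`."""
    rows = [line.split() for line in output.strip().splitlines() if line.strip()]
    header, values = rows[0], rows[1]
    row = dict(zip(header, values))
    # Only the heap columns are converted; other columns may hold "-".
    used = sum(float(row[name]) for name in USED_COLUMNS)
    committed = sum(float(row[name]) for name in COMMITTED_COLUMNS)
    return used / committed


def desired_replicas(usages, current, min_replicas=1, max_replicas=10,
                     scale_out_at=0.8, scale_in_at=0.4):
    """Add a pod above the max threshold, remove one below the min threshold."""
    if not usages:
        raise ValueError("no heap usage samples")
    average = sum(usages) / len(usages)  # assumption: average across pods
    if average >= scale_out_at:
        target = current + 1
    elif average <= scale_in_at:
        target = current - 1
    else:
        target = current
    # Never leave the configured replica range.
    return max(min_replicas, min(max_replicas, target))


# Hand-written sample in the shape of `jstat -gc` output (not real measurements).
SAMPLE = """\
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       MC     MU    CCSC   CCSU   YGC     YGCT    FGC    FGCT     GCT
1024.0 1024.0  0.0   512.0  8192.0   4096.0   20480.0    10752.0   4864.0 2562.3 512.0  266.0    5    0.030   0      0.000    0.030
"""

if __name__ == "__main__":
    usage = heap_usage_from_jstat(SAMPLE)
    print(usage)                              # 0.5
    print(desired_replicas([usage, 0.95], 2)) # average 0.725, stays at 2
```

In a real job the jstat output would come from something like `kubectl exec <pod> -- jstat -gc <pid>` (which requires a JDK in the container image), and the resulting number would be applied with `kubectl scale deployment <name> --replicas=<n>` or the equivalent API call. Stepping by one pod per run is a deliberately cautious choice to avoid oscillation; other step sizes are equally possible.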

An example of this can be found in the article "Autoscaling based on CPU/Memory in Kubernetes — Part II" by powercloudup.