
Long GC Pause on Apache Spark Structured Streaming on Kubernetes


I should have replied sooner while the solution was still fresh in my mind, but I ended up doing a few things that together reduced the garbage collection time. I don't remember every documentation source that helped me resolve this, but I spent a lot of time researching on SO, reading the GCeasy recommendations, and going through general Java GC literature. Anyway, here's what ended up helping:

  • Limited the number of cores that participate in a full GC event: I believe this was the biggest contributor to the improved performance. I noticed that certain executors would have large GC times during a given micro-batch, while other executors on the same Kubernetes VM would have large computation times that were close to (if not exactly) the duration of the GC pause. That correlation led me down a research path where I eventually discovered that the JVM (at least for Java 8) derives its GC defaults from the underlying Kubernetes VM rather than from the limited resources dedicated to the container the JVM runs in. Since each container ran its own JVM instance, each executor got default GC parameters that assumed it was the only JVM on the underlying VM. The GC parameter that controls the number of threads available for a full GC event is ParallelGCThreads. The JVM sets it by default as a fraction of the total number of cores on the VM; for a 32-core Kubernetes VM it ended up being 23, if I recall correctly. So when a full GC event occurred, the GC would cause contention on the CPUs being used by the other executors doing normal computation, and my theory is that this contention was pushing up the GC/computation runtimes on the same underlying VM (you can check what defaults your JVM picks with the quick command sketched after this list). For my particular test, I ended up overriding the defaults for ConcGCThreads (to 1) and ParallelGCThreads (to 5), since I was running 6 executors per 32-core Kubernetes VM.
  • Increased the memory on each executor: The GCeasy graphs never really showed memory plateauing; it only kept increasing as the pipeline ran. I ended up increasing the memory dedicated to each executor from 8 GB to ~15 GB and was seeing plateaus around ~10 GB after that. The actual amount of memory you need will probably depend on your code.
  • Enabled string de-duplication: Most of my dataset was strings, so this helped decrease the overall memory footprint of my application.
  • Lowered the initiating heap occupancy percentage (InitiatingHeapOccupancyPercent): This was recommended by GCeasy as well as some SO threads.
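
If you want to verify what GC defaults your executor JVMs are actually picking up inside a container, a quick check like the one below (run it with whatever Java is on your executor image; the exact output format varies by JVM build) will print the ergonomics-derived values:

# Print the GC thread defaults the JVM derives from the machine it can see.
# On Java 8 inside a container this typically reflects the node's core count,
# not the container's CPU limit, which is why ParallelGCThreads defaulted so high for me.
java -XX:+PrintFlagsFinal -version | grep -E "ParallelGCThreads|ConcGCThreads"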

So here is the final set of JVM parameters I am using after all that. I hope it helps.

-XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=35 -XX:+UseStringDeduplication -XX:ConcGCThreads=1 -XX:ParallelGCThreads=5
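
In case it helps, this is roughly how those options get attached to the executors. This is only a sketch: the master URL, container image, jar, and resource values below are placeholders rather than my exact setup; the key part is passing the GC flags through spark.executor.extraJavaOptions.

# Minimal spark-submit sketch for Kubernetes; swap in your own API server, image, jar, and sizes.
spark-submit \
  --master k8s://https://<api-server-host>:<port> \
  --deploy-mode cluster \
  --conf spark.executor.memory=15g \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:InitiatingHeapOccupancyPercent=35 -XX:+UseStringDeduplication -XX:ConcGCThreads=1 -XX:ParallelGCThreads=5" \
  <your-application-jar>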