
Kubernetes liveness - Reserve threads/memory for a specific endpoint with Spring Boot


The actuator health endpoint is very convenient with Spring Boot - almost too convenient in this context, because it does deeper health checks than you necessarily want in a liveness probe. Deep checks belong in readiness, not liveness. The idea is that if a Pod is overwhelmed for a while and fails readiness, it is withdrawn from load balancing and gets a breather; but if it fails liveness, it gets restarted. So you want only minimal checks in liveness (see "Should Health Checks call other App Health Checks"). If you use the actuator health endpoint for both, there is no way for your busy Pods to get a breather, as they get killed first. And Kubernetes performs both probes by periodically calling the HTTP endpoint, which contributes further to your thread-usage problem (do consider the periodSeconds setting on the probes).

For your case you could define a liveness command instead of an HTTP probe - https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-a-liveness-command. The command could just check that the Java process is running (so similar in spirit to your Go-based probe suggestion).
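A minimal sketch of that split, with an exec-based liveness probe and an HTTP readiness probe (the pgrep pattern, port, and timings are assumptions - adjust to your container):

```yaml
# liveness: cheap process check, no HTTP, no threads consumed in the app
livenessProbe:
  exec:
    command:
    - sh
    - -c
    - pgrep -f 'java.*my-app' > /dev/null   # hypothetical process pattern
  initialDelaySeconds: 30
  periodSeconds: 10
# readiness: the deeper actuator check, safe here because failing it
# only withdraws the Pod from load balancing rather than restarting it
readinessProbe:
  httpGet:
    path: /actuator/health
    port: 8080
  periodSeconds: 5
```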

For many apps, using the actuator for liveness would be fine (think of apps that hit a different constraint before threads - which would be your case if you went async/non-blocking with the reactive stack). Yours is one case where it can cause problems; the actuator probing the availability of dependencies like message brokers is another where you get excessive restarts (in that case, on first deploy).


I have a prototype just wrapping up for this same problem: Spring Boot permits 100% of the available threads to be filled up with public network requests, leaving the /health endpoint inaccessible to the AWS load balancer, which knocks the service offline thinking it's unhealthy. There's a difference between unhealthy and busy... and health is more than just a process running, a port listening, or some other superficial check - it needs to be a "deep ping" which verifies that the service and all its dependencies are operable, in order to give a confident health-check response back.

My approach to solving the problem is to produce two new auto-wired components: the first configures Jetty with a fixed, configurable maximum number of threads (make sure your JVM is allocated enough memory to match), and the second keeps a counter of each request as it starts and completes, throwing an exception which maps to an HTTP 429 TOO MANY REQUESTS response when the count reaches a ceiling of maxThreads - reserveThreads. Then I can set reserveThreads to whatever I want, and the /health endpoint is not bound by the request counter, ensuring that it's always able to get in.
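A sketch of the counting half of that approach (the class and its names are my own invention, not from any particular library): a small gate that tracks in-flight requests and rejects new ones once only the reserved headroom is left, while exempt paths such as /health bypass the limit. In a Spring Boot app this would be called from a servlet Filter, but the core logic is plain Java:

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Tracks in-flight requests and sheds load once only the headroom
 * reserved for /health remains. Intended use from a servlet Filter:
 * call tryEnter() before chain.doFilter(), exit() in a finally block,
 * and send 429 TOO MANY REQUESTS when tryEnter() returns false.
 */
public class RequestGate {
    private final int maxThreads;      // should match Jetty's configured max
    private final int reserveThreads;  // headroom kept free for /health
    private final AtomicInteger inFlight = new AtomicInteger();

    public RequestGate(int maxThreads, int reserveThreads) {
        this.maxThreads = maxThreads;
        this.reserveThreads = reserveThreads;
    }

    /** Returns true if the request may proceed; false means reject with 429. */
    public boolean tryEnter(String path) {
        if (path.startsWith("/health")) {
            inFlight.incrementAndGet(); // exempt paths always get in
            return true;
        }
        while (true) {
            int current = inFlight.get();
            if (current >= maxThreads - reserveThreads) {
                return false; // only the reserve is left: shed load
            }
            if (inFlight.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    /** Call from a finally block once the request completes. */
    public void exit() {
        inFlight.decrementAndGet();
    }
}
```

The compare-and-set loop keeps the check-then-increment atomic without locks, so the gate itself never blocks a serving thread.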

I was just searching around to figure out how others are solving this problem and found your question with the same issue, so far haven't seen anything else solid.

To configure Jetty thread settings via the application properties file: http://jdpgrailsdev.github.io/blog/2014/10/07/spring_boot_jetty_thread_pool.html
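Current Spring Boot versions also expose this directly as configuration properties, so the first component may not need custom code at all (the server.jetty.threads.* keys below are from Spring Boot 2.3+; older versions used server.jetty.max-threads, and the reserve property is a hypothetical app-specific key read by the request counter):

```properties
# application.properties - cap Jetty's pool so maxThreads is a known number
server.jetty.threads.max=200
server.jetty.threads.min=8
# hypothetical property for the custom request-counting component
app.request-limit.reserve-threads=4
```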


Sounds like your microservice should still respond to health checks on /health whilst returning results from that 3rd-party service it's calling.

I'd build an async HTTP server with Vert.x-Web and try a test before modifying your good code. Create two endpoints: a /health check and a /slow call that just sleeps for around 5 minutes before replying with "hello". Deploy that in minikube or your cluster and see whether it's able to respond to health checks while sleeping on the other HTTP request.