
How to prevent scale down of newly scaled up pod for specific period of time which was created by HPA in Kubernetes?


Yes, you can do this. I am currently running experiments closely related to your question.

While autoscaling, try to measure the following:

  1. Time taken for HPA to calculate the required replicas
  2. Time taken for a pod to spin up
  3. Time taken for a Droplet to spin up
  4. Time taken for pods to spin down
  5. Time taken for a Droplet to spin down

Case 1: Time taken for HPA to calculate the required replicas (HPA)

HPA detects changes as soon as it gets metrics, usually within 15 seconds at most. This depends on --horizontal-pod-autoscaler-sync-period, which defaults to 15 seconds. As soon as HPA gets the metric, it calculates the replicas needed.
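For reference, the sync period is a flag on the kube-controller-manager (this is only adjustable if you run your own control plane; on managed offerings such as DOKS you can usually assume the 15-second default):

```shell
# Hypothetical control-plane invocation; managed clusters fix this value.
kube-controller-manager --horizontal-pod-autoscaler-sync-period=15s
```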

Case 2: Time taken for a pod to spin up (HPA)

As soon as HPA calculates the desired replicas, pods start to spin up. How fast depends on the scaleUp policy, which you can set per your use case. It also depends on Droplet availability and the cluster autoscaler.

For example, you can tell HPA: "Hey, please spin up 4 pods every 15 seconds" OR "Spin up 100% of the currently available pods every 20 seconds."

HPA then picks whichever policy makes the bigger impact (the larger change in replica count). If 100% of the current pods > 4 pods, the second policy takes over; otherwise the first one does. The process repeats until the desired replica count is reached.

If you need the scaled-up pod count immediately, set the policy to spin up 100% of pods every 1 second; HPA will then keep doubling the current replica count every second until it matches the desired replica count.
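As a sketch, the two example policies above (4 pods per 15 seconds vs. 100% per 20 seconds, with the bigger change winning) would look like this in the HPA behavior section:

```yaml
behavior:
  scaleUp:
    selectPolicy: Max        # pick the policy that allows the bigger change
    policies:
    - type: Pods             # add at most 4 pods per 15-second period
      value: 4
      periodSeconds: 15
    - type: Percent          # or grow by 100% per 20-second period
      value: 100
      periodSeconds: 20
```

With selectPolicy: Max, whichever policy permits the larger replica change in a given period is the one HPA applies.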

Case 3: Time taken for a Droplet to spin up (Cluster Autoscaler)

Time Taken For:

  • Cluster autoscaler detects pending pods and starts spinning up a Droplet: 1 min 05 secs (approx.)
  • Droplet spins up, but in NotReady state: 1 min 20 secs
  • Droplet reaches Ready state: 10 - 20 secs

Total time for the Droplet to become available: 2 min 40 secs (approx.)

Case 4: Time taken for pods to spin down (HPA)

It depends on the scaleDown policy, just as in Case 2.

Case 5: Time taken for a Droplet to spin down (Cluster Autoscaler)

After all the target pods have terminated on the Droplet (the time this takes depends on Case 4), DigitalOcean sets a taint on the node like DeletionCandidate...=<timestamp>:PreferNoSchedule.

Ten minutes after the taint is set, the Droplet starts to spin down.
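To see when that taint was applied (the node name below is a placeholder for one of your nodes), you can inspect the node:

```shell
# Show the taints on a node; the DeletionCandidate taint carries the timestamp
kubectl describe node <your-node-name> | grep -A 3 Taints
```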

Conclusion:

If you need a node to stay alive for one hour (utilize it as much as possible, since billing is hourly) and not cross one hour (above 1 hr it is billed as 2 hr):

You can set stabilizationWindowSeconds = 1 hr - DigitalOcean's interval to delete the Droplet.

Theoretically, stabilizationWindowSeconds = 1 hr - 10 mins = 50 mins (3000 secs).

Practically, the time taken for all pods to terminate may vary depending on the scale-down policy, your application, etc.

So I set, approximately (for my case), stabilizationWindowSeconds = 1 hr - 20 mins = 40 mins (2400 secs).

Thus, your scaled-up pods stay alive for 40 mins and start terminating after that (in my case all pods terminated within a maximum of 5 mins). That leaves 15 mins for DigitalOcean to destroy the Droplet before the hour is up.

CAUTION: The times above depend on my use case, environment, etc.

HPA behavior config for reference:

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 2400
    selectPolicy: Max
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    selectPolicy: Max
    policies:
    - type: Percent
      value: 100
      periodSeconds: 1
```