Autoscaling workloads without running out of memory



To catch scheduling problems early, monitor pending pods; to influence scheduling itself, define resource requests, which the Scheduler uses when placing pods.

The Scheduler uses resource request information when scheduling a pod to a node. Each node has a certain amount of CPU and memory it can allocate to pods. When scheduling a pod, the Scheduler will only consider nodes with enough unallocated resources to meet the pod's resource requirements. If the amount of unallocated CPU or memory is less than what the pod requests, Kubernetes will not schedule the pod to that node, because the node can't provide the minimum amount required by the pod. The new pods will remain in the Pending state until new nodes join the cluster.
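As a quick way to see this in practice, you can list pods stuck in Pending and inspect a node's allocatable capacity. This is a sketch using standard kubectl commands; the node name is a placeholder you would substitute with one from your own cluster.

```shell
# List all pods that the Scheduler has not been able to place yet
kubectl get pods --all-namespaces --field-selector=status.phase=Pending

# Show why a specific pending pod is unscheduled (see the Events section
# for FailedScheduling messages); "requests-pod" matches the example below
kubectl describe pod requests-pod

# Inspect a node's Allocatable CPU/memory and what is already requested
# ("my-node" is a placeholder for an actual node name)
kubectl describe node my-node
```

The `Allocatable` figures in the node description, minus the summed requests of the pods already on the node, are what the Scheduler compares against a new pod's requests.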

Example:

apiVersion: v1
kind: Pod
metadata:
  name: requests-pod
spec:
  containers:
  - image: busybox
    command: ["dd", "if=/dev/zero", "of=/dev/null"]
    name: main
    resources:
      requests:
        cpu: 200m
        memory: 10Mi

When you don’t specify a request for CPU, you’re saying you don’t care how much CPU time the process running in your container is allotted. In the worst case, it may not get any CPU time at all (this happens when heavy demand by other processes exists on the CPU). Although this may be fine for low-priority batch jobs, which aren’t time-critical, it obviously isn’t appropriate for containers handling user requests.
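To make the relative weighting concrete, here is a sketch of a pod with two containers whose CPU requests differ (the pod and container names are made up for illustration). When both containers compete for CPU, spare CPU time is divided between them in proportion to their requests, so with 200m and 1000m the second container gets roughly five times the CPU of the first:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-shares-demo       # hypothetical name for this example
spec:
  containers:
  - name: light
    image: busybox
    command: ["dd", "if=/dev/zero", "of=/dev/null"]
    resources:
      requests:
        cpu: 200m             # gets ~1/6 of contended CPU time
  - name: heavy
    image: busybox
    command: ["dd", "if=/dev/zero", "of=/dev/null"]
    resources:
      requests:
        cpu: 1000m            # gets ~5/6 of contended CPU time
```

If one container is idle, the other may use the unclaimed CPU; the ratio only matters under contention, which is why requests alone (without limits) don't waste capacity.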


Short answer: add resource requests, but don't add CPU limits. Otherwise, you risk CPU throttling of your containers even when the node has spare capacity.
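Since the title is about autoscaling: once CPU requests are in place, a HorizontalPodAutoscaler can scale on CPU utilization, because utilization is measured relative to the requested amount. A minimal sketch, assuming a Deployment named `requests-deploy` exists (the name and the 70% target are illustrative choices, not values from the original text):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: requests-hpa          # hypothetical name for this example
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: requests-deploy     # assumed Deployment with CPU requests set
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percent of the containers' CPU *requests*
```

If the new replicas can't fit on existing nodes, they sit in Pending until the cluster autoscaler (or an operator) adds nodes, which is exactly the condition described above.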