
K8S - HPA without Deployment Object


Logically, a horizontal pod autoscaler (HPA) is a control loop that checks its specified target (such as a Deployment) to see whether metric thresholds have been met. (In practice, this happens through the controller manager and the metrics APIs.) The HPA is its own independent resource, and it interacts with the Deployment through the Scale subresource. That subresource is an abstraction: it acts as a common interface for any resource that Kubernetes may want to scale automatically in the future, not just Deployments.

Specification of a Horizontal Pod Autoscaler (taken from k8s design documentation):

    type HorizontalPodAutoscalerSpec struct {
        // reference to Scale subresource; horizontal pod autoscaler will learn the current resource
        // consumption from its status, and will set the desired number of pods by modifying its spec.
        ScaleRef SubresourceReference
        // lower limit for the number of pods that can be set by the autoscaler, default 1.
        MinReplicas *int
        // upper limit for the number of pods that can be set by the autoscaler.
        // It cannot be smaller than MinReplicas.
        MaxReplicas int
        // target average CPU utilization (represented as a percentage of requested CPU) over all the pods;
        // if not specified it defaults to the target CPU utilization at 80% of the requested resources.
        CPUUtilization *CPUTargetUtilization
    }

Note that the HPA uses the scale ref to target the desired resource, while the Deployment encompasses multiple resources through its selector. HPAs are decoupled from specific Deployments for flexibility. When you delete the Deployment, k8s can delete everything the Deployment was managing through its selector. The HPA is not among those resources: it is not managed by the Deployment, and is connected to it only through the HPA's own specification. After the Deployment is gone, the HPA can remain in place and wait for a new Deployment to take the original's place, or it can be reconfigured or deleted.
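The decoupling described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `cluster` and `hpa` types are hypothetical, the "cluster" is a plain map, and the autoscaler does nothing except enforce a minimum replica count. The one thing it is meant to show is that the HPA holds a name-based reference and survives the deletion of its target.

```go
package main

import "fmt"

// cluster is a toy object store; it is hypothetical and not a Kubernetes API.
type cluster struct {
	deployments map[string]int // deployment name -> replica count
}

// hpa refers to its target only by name, mirroring a scale reference.
type hpa struct {
	targetName  string
	minReplicas int
}

// reconcile enforces the minimum replica count if the target exists
// and reports whether the target was found.
func (h hpa) reconcile(c *cluster) bool {
	replicas, ok := c.deployments[h.targetName]
	if !ok {
		return false // target missing: the HPA stays idle but still exists
	}
	if replicas < h.minReplicas {
		c.deployments[h.targetName] = h.minReplicas
	}
	return true
}

func main() {
	c := &cluster{deployments: map[string]int{"web": 1}}
	h := hpa{targetName: "web", minReplicas: 2}

	fmt.Println(h.reconcile(c), c.deployments["web"]) // prints: true 2

	// Deleting the Deployment removes it from the store; h is untouched.
	delete(c.deployments, "web")
	fmt.Println(h.reconcile(c)) // prints: false

	// A replacement Deployment with the same name is picked up again.
	c.deployments["web"] = 1
	fmt.Println(h.reconcile(c), c.deployments["web"]) // prints: true 2
}
```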