How does the Kubernetes API server start a newly scheduled pod on a node?


Alibaba had a really insightful blog post on the inner workings of the scheduler. From the blog:


The scheduler basically works like this:

  • The scheduler maintains a scheduled podQueue and listens to the APIServer.
  • When we create a Pod, we first write Pod metadata to etcd through the APIServer.
  • The scheduler listens to the Pod status through Informer. When a new Pod is added, the Pod is added to the podQueue.
  • The main process continuously extracts Pods from the podQueue and assigns nodes to Pods.
  • The scheduling process consists of two steps: Filter matching nodes and prioritize these nodes based on Pod configuration (for example, by metrics like resource usage and affinity) to score nodes and select the node with the highest score.
  • After a node is assigned successfully, invoke the binding pod interface of the apiServer and set pod.Spec.NodeName to the assigned node.
  • The kubelet on the node also listens to the ApiServer. If it finds that a new Pod is scheduled to that node, the local dockerDaemon is invoked to run the container.
  • If the scheduler fails to schedule a Pod and priority and preemption are enabled, a preemption attempt is made first: low-priority Pods on the node are deleted so that the Pod being scheduled can be placed there. If preemption is not enabled or the attempt fails, the failure is recorded in the logs and the Pod is added to the end of the podQueue.
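To make the filter, score, and bind steps from the list above concrete, here is a toy sketch in Python. It is not the scheduler's actual code: the `Node` and `Pod` classes, the resource fields, and the scoring rule are all hypothetical, and setting `node_name` merely stands in for the bind call that sets pod.Spec.NodeName.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu: int  # free CPU in millicores (illustrative unit)
    free_mem: int  # free memory in MiB (illustrative unit)


@dataclass
class Pod:
    name: str
    cpu: int
    mem: int
    node_name: Optional[str] = None  # stands in for pod.Spec.NodeName


def filter_nodes(pod, nodes):
    """Filter step: keep only nodes with enough free resources for the pod."""
    return [n for n in nodes if n.free_cpu >= pod.cpu and n.free_mem >= pod.mem]


def score(pod, node):
    """Score step (toy rule): prefer the node with the most resources left over."""
    return (node.free_cpu - pod.cpu) + (node.free_mem - pod.mem)


def schedule_one(pod, nodes):
    """Pick the highest-scoring feasible node, or return None if none fits."""
    feasible = filter_nodes(pod, nodes)
    if not feasible:
        return None  # a real scheduler would try preemption or requeue here
    best = max(feasible, key=lambda n: score(pod, n))
    pod.node_name = best.name  # stands in for the bind call to the API server
    best.free_cpu -= pod.cpu
    best.free_mem -= pod.mem
    return best.name


nodes = [Node("node-a", 500, 1024), Node("node-b", 2000, 4096)]
pod = Pod("web", cpu=1000, mem=512)
print(schedule_one(pod, nodes))  # node-a is filtered out, so this prints node-b
```

The point of the sketch is only the shape of the loop: filtering removes infeasible nodes, scoring ranks the rest, and the result is written back onto the Pod.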

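On the Kubelet polling: the API server actually supports a "watch" mode. By default a watch is a long-lived HTTP request whose response is streamed back as events occur; WebSocket is also supported as a transport. The Kubelet watches Pods whose spec.nodeName equals its own node name, so it is notified of any change to the Pods assigned to it instead of having to poll for them.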
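As a rough illustration of what a watcher sees, the sketch below parses a simulated watch stream, where each line is one JSON event with a `type` and an `object`. It assumes the per-node filter is expressed on the Pod's spec.nodeName field; in a real cluster that filtering is done by the API server, and the `pods_for_node` helper and the sample events here are made up for the example.

```python
import json


def pods_for_node(event_lines, node_name):
    """Yield (event_type, pod_name) for events whose pod is bound to node_name."""
    for line in event_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        pod = event["object"]
        if pod.get("spec", {}).get("nodeName") == node_name:
            yield event["type"], pod["metadata"]["name"]


# Simulated watch stream: one JSON document per line (hypothetical data).
stream = [
    '{"type": "ADDED", "object": {"metadata": {"name": "web"}, "spec": {"nodeName": "node-b"}}}',
    '{"type": "ADDED", "object": {"metadata": {"name": "db"}, "spec": {"nodeName": "node-a"}}}',
    '{"type": "MODIFIED", "object": {"metadata": {"name": "web"}, "spec": {"nodeName": "node-b"}}}',
]

events = list(pods_for_node(stream, "node-b"))
print(events)
```

Only the two events for the Pod bound to node-b come through; the event for the Pod on node-a is ignored.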


Answering without a link to the source code, but I'm sure kubelet works like this:

Query Parameters...watch   Watch for changes to the described resources and return them as a stream of add, update, and remove notifications. Specify resourceVersion.
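For illustration, a watch request of this kind could be built as follows. This is a sketch only: the host name, node name, and resourceVersion value are placeholders, and `watch_url` is a hypothetical helper. The `watch`, `fieldSelector`, and `resourceVersion` query parameters themselves are part of the Kubernetes list/watch API.

```python
from urllib.parse import urlencode


def watch_url(base, node_name, resource_version):
    """Build a URL that watches Pods bound to one node, starting at a version."""
    params = {
        "watch": "true",
        "fieldSelector": f"spec.nodeName={node_name}",
        "resourceVersion": resource_version,
    }
    return f"{base}/api/v1/pods?{urlencode(params)}"


# Placeholder values; a real client would also need authentication.
url = watch_url("https://apiserver.example:6443", "node-b", "12345")
print(url)
```

The response to such a request stays open, and the server writes one event to it each time a matching Pod is added, modified, or deleted.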

Watch functionality is inherited from etcd (the database behind the API Server): https://etcd.io/docs/v3.2.17/learning/api/. See Watch streams:

Watches are long running requests and use gRPC streams to stream event data.

So it's a kind of long polling.