Pod stays in pending state due to failed scheduling


The pod stays in the Pending state.

The pod has an event saying it cannot be scheduled because no nodes are available.
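
To see the scheduler's exact message, describe the pod and read its Events section:

kubectl describe pod <pod_name>

The FailedScheduling event usually looks something like this (node counts and reason are illustrative):

Warning  FailedScheduling  default-scheduler  0/3 nodes are available: 3 Insufficient cpu.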

This is expected if you have reached your cluster's capacity. You can check the capacity of any node with:

kubectl describe node <node_name>

And to get a node name, use:

kubectl get nodes
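
In the describe output, compare the Capacity and Allocatable sections with the Allocated resources summary at the bottom. An illustrative excerpt:

Capacity:
  cpu:     4
  memory:  16Gi
  pods:    110
Allocatable:
  cpu:     3920m
  memory:  15Gi
  pods:    110
...
Allocated resources:
  Resource  Requests     Limits
  --------  --------     ------
  cpu       3800m (96%)  4 (102%)
  memory    14Gi (93%)   16Gi (106%)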

To mitigate this, add more nodes, run fewer pods, or configure the cluster to autoscale when this happens.
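
A pod that is Pending for capacity reasons has resource requests that don't fit in any node's remaining allocatable resources, so lowering the requests (where the workload tolerates it) is another option. A minimal sketch; the names and values are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: app
    image: example/app:1.0
    resources:
      requests:
        cpu: "250m"       # what the scheduler reserves; must fit on some node
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"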


Things to try:

  • Remove all pods created by the job. With kubectl delete --all pods --namespace=foo you can delete all the pods in the specified namespace. Also consider removing the job itself, which evidently lacks configuration: a Job can be configured to stop spawning pods after a defined number of failures or successes. Check backoffLimit and restartPolicy in the Kubernetes Job documentation (see the Job sketch after this list).

  • Check taints and tolerations. Describe your nodes with kubectl describe node <node_name> and look at the Taints: section. If there are taints, you will have to reflect them in your job's tolerations (see the toleration sketch after this list). Also check for node conditions such as MemoryPressure or DiskPressure; they are listed in the node description as well.

  • Check available resources with kubectl top nodes. Compare the available RAM and CPU against what the pod requests.

  • Check whether the container image can be pulled. Try pulling it with Docker to make sure it works and isn't timing out (see the pull example after this list).

  • Check all NetworkPolicies with kubectl get netpol -A to ensure that no policy blocks communication with pods in kube-system.

  • You can also check the RBAC configuration, but that would be far-fetched.
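
For the first item, a minimal Job sketch showing where backoffLimit and restartPolicy live; all names and values are illustrative:

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  backoffLimit: 4              # give up after 4 failed pods instead of retrying forever
  template:
    spec:
      restartPolicy: Never     # let the Job create fresh pods rather than restarting containers in place
      containers:
      - name: worker
        image: example/worker:1.0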
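
For the taints item, if kubectl describe node showed, say, a taint dedicated=batch:NoSchedule, the job's pod template would need a matching toleration. A sketch with an illustrative key and value:

tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "batch"
  effect: "NoSchedule"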
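
For the image check, pulling the image manually reproduces authentication or timeout problems outside the cluster (the image name is illustrative):

docker pull registry.example.com/app:1.0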


What you can do first is check your cluster autoscaler, RBAC, and the role attached to your nodes (you might be missing some permissions there). For the cron jobs, check restartPolicy.