Ansible playbook wait until all pods running
The kubectl wait command
Kubernetes introduced kubectl wait in v1.11. From the release notes:
- kubectl wait is a new command that allows waiting for one or more resources to be deleted or to reach a specific condition. It adds a `kubectl wait --for=[delete|condition=condition-name] resource/string` command.
- kubectl wait now supports condition value checks other than true using `--for condition=available=false`.
- Expanded kubectl wait to work with more types of selectors.
- The kubectl wait command now supports the `--all` flag to select all resources in the namespace of the specified resource types.
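For illustration, here is what those flag forms look like on the command line (the resource names, such as deployment/my-app, are placeholders):

```shell
# Wait until a Deployment reports the Available condition
kubectl wait --for=condition=Available deployment/my-app --timeout=120s

# Wait for a condition to become false rather than true
kubectl wait --for=condition=Available=false deployment/my-app --timeout=120s

# Wait for a resource to be deleted
kubectl wait --for=delete pod/my-pod --timeout=60s

# Use --all to wait for every pod in the namespace
kubectl wait --for=condition=Ready pods --all --namespace=default --timeout=300s
```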
kubectl wait is not intended to wait for phases, but for conditions, and I think that waiting for conditions is much more assertive than waiting for phases. Pods expose the following conditions (the sketch after this list shows how to inspect them):
- PodScheduled: the Pod has been scheduled to a node;
- Ready: the Pod is able to serve requests and should be added to the load balancing pools of all matching Services;
- Initialized: all init containers have started successfully;
- ContainersReady: all containers in the Pod are ready.
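A minimal sketch of how to see these conditions on a live pod, assuming a pod named my-pod (a placeholder):

```shell
# List each condition type and its current status for the pod
kubectl get pod my-pod --output=jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'

# Block until one specific condition is met
kubectl wait --for=condition=ContainersReady pod/my-pod --timeout=120s
```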
Using kubectl wait with Ansible
Suppose that you are automating a Kubernetes install with kubeadm + Ansible and need to wait for the installation to complete:
```yaml
- name: Wait for all control-plane pods to be created
  shell: "kubectl get po --namespace=kube-system --selector tier=control-plane --output=jsonpath='{.items[*].metadata.name}'"
  register: control_plane_pods_created
  until: item in control_plane_pods_created.stdout
  retries: 10
  delay: 30
  with_items:
    - etcd
    - kube-apiserver
    - kube-controller-manager
    - kube-scheduler

- name: Wait for control-plane pods to become ready
  shell: "kubectl wait --namespace=kube-system --for=condition=Ready pods --selector tier=control-plane --timeout=600s"
  register: control_plane_pods_ready

- debug: var=control_plane_pods_ready.stdout_lines
```
Example result:

```
TASK [Wait for all control-plane pods to be created] ***************************
FAILED - RETRYING: Wait for all control-plane pods to be created (10 retries left).
FAILED - RETRYING: Wait for all control-plane pods to be created (9 retries left).
FAILED - RETRYING: Wait for all control-plane pods to be created (8 retries left).
changed: [localhost -> localhost] => (item=etcd)
changed: [localhost -> localhost] => (item=kube-apiserver)
changed: [localhost -> localhost] => (item=kube-controller-manager)
changed: [localhost -> localhost] => (item=kube-scheduler)

TASK [Wait for control-plane pods to become ready] *****************************
changed: [localhost -> localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "control_plane_pods_ready.stdout_lines": [
        "pod/etcd-localhost.localdomain condition met",
        "pod/kube-apiserver-localhost.localdomain condition met",
        "pod/kube-controller-manager-localhost.localdomain condition met",
        "pod/kube-scheduler-localhost.localdomain condition met"
    ]
}
```
I would try something like this (works for me):
```yaml
tasks:
  - name: Wait for pods to come up
    shell: kubectl get pods -o json
    register: kubectl_get_pods
    # json_query requires the jmespath Python package on the control node
    until: kubectl_get_pods.stdout | from_json | json_query('items[*].status.phase') | unique == ["Running"]
    # "until" retries only 3 times by default; raise it for a cluster that is still booting
    retries: 20
    delay: 15
```
You are basically getting all the statuses for all the pods and combining them into a unique list, and the task will not complete until that list is `["Running"]`. So, for example, if not all of your pods are running yet, you will get something like `["Running", "Pending"]`.
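If the loop never converges, it helps to see what the filter chain actually produces. A minimal sketch, reusing the `kubectl_get_pods` variable registered above (and, like the task itself, assuming the jmespath Python package is installed for `json_query`):

```yaml
- name: Show the current unique list of pod phases
  debug:
    msg: "{{ kubectl_get_pods.stdout | from_json | json_query('items[*].status.phase') | unique }}"
```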
The community.kubernetes.k8s module for Ansible has built-in wait functionality! However, the problem is that different resources have different wait_condition types. If you are applying a deployment, then, as seen below, `type: Complete` works well as long as you set the correct timeout bounds, but if the YAML contains other resource types, such as serviceaccounts, it will most likely hang.
```yaml
- name: Deploy the stack
  community.kubernetes.k8s:
    state: present
    src: "{{ dir }}my.yaml"
    wait: yes
    wait_sleep: 10
    wait_timeout: 600
    wait_condition:
      type: Complete
      status: "True"
```
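One workaround, sketched below under the assumption that the manifest can be split so the Deployment lives in its own file (the file names and the `{{ dir }}` variable are placeholders): apply the whole stack without waiting, then wait only on the resource kind that exposes a usable condition, such as a Deployment's `Available` condition.

```yaml
- name: Apply the full stack without waiting
  community.kubernetes.k8s:
    state: present
    src: "{{ dir }}my.yaml"

- name: Wait only for the Deployment to become available
  community.kubernetes.k8s:
    state: present
    src: "{{ dir }}my-deployment.yaml"
    wait: yes
    wait_sleep: 10
    wait_timeout: 600
    wait_condition:
      type: Available
      status: "True"
```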