Kubernetes too old resource version


I'm from the Fabric8 Kubernetes Client team. I think it's standard Kubernetes behavior to return 410 after some time during a watch, and it's usually the client's responsibility to handle it. In the context of a watch, the API server returns HTTP_GONE (410) when you ask for changes since a resourceVersion that is too old, i.e. when so many things have changed that it can no longer tell you what has happened since that version. In that case you need to start again without specifying a resourceVersion; the watch then sends you the current state of the thing you are watching, followed by updates from that point.
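To make the recovery path concrete, here is a minimal, self-contained sketch of the "resume, and re-list on 410" logic. It does not use the Fabric8 API: FakeApiServer, GoneException and resume are hypothetical names invented for this illustration, and the server is only simulated in memory.

```java
import java.util.ArrayList;
import java.util.List;

class RelistOnGone {

    /** Hypothetical exception standing in for an HTTP 410 (HTTP_GONE) response. */
    static class GoneException extends RuntimeException {
        GoneException(String message) {
            super(message);
        }
    }

    /** Hypothetical in-memory stand-in for the API server; it only remembers recent versions. */
    static class FakeApiServer {
        private final long oldestVersion;
        private final long currentVersion;

        FakeApiServer(long oldestVersion, long currentVersion) {
            this.oldestVersion = oldestVersion;
            this.currentVersion = currentVersion;
        }

        /** Fresh start without a resourceVersion: report the current state's version. */
        long list() {
            return currentVersion;
        }

        /** Watch from a version: return the versions of all later events, or fail with 410. */
        List<Long> watch(long fromVersion) {
            if (fromVersion < oldestVersion) {
                throw new GoneException("too old resource version: " + fromVersion
                        + " (" + oldestVersion + ")");
            }
            List<Long> events = new ArrayList<>();
            for (long v = fromVersion + 1; v <= currentVersion; v++) {
                events.add(v);
            }
            return events;
        }
    }

    /**
     * Tries to resume the watch from lastSeenVersion. If the server answers 410,
     * falls back to a fresh list. Returns the version the client ends up at.
     */
    static long resume(FakeApiServer server, long lastSeenVersion, List<String> log) {
        try {
            List<Long> events = server.watch(lastSeenVersion);
            log.add("resumed, replayed " + events.size() + " events");
            return events.isEmpty() ? lastSeenVersion : events.get(events.size() - 1);
        } catch (GoneException e) {
            // The version is too old: start again from the current state.
            log.add("410 Gone, re-listing");
            return server.list();
        }
    }

    public static void main(String[] args) {
        FakeApiServer server = new FakeApiServer(100, 120);
        List<String> log = new ArrayList<>();
        System.out.println(resume(server, 110, log)); // version still known to the server
        System.out.println(resume(server, 50, log));  // version too old, triggers a re-list
        System.out.println(log);
    }
}
```

In a real client the fallback branch would re-list the resources and open a new watch from the version returned by that list.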

Fabric8 does not handle this for a plain watch, but it does handle it in the SharedInformer API (see ReflectorWatcher). I would recommend using the informer API when writing operators, since it's better than a plain list and watch. Here is a simple example using the SharedInformer API:

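import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.api.model.PodList;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.informers.ResourceEventHandler;
import io.fabric8.kubernetes.client.informers.SharedIndexInformer;
import io.fabric8.kubernetes.client.informers.SharedInformerFactory;

try (KubernetesClient client = new DefaultKubernetesClient()) {
  SharedInformerFactory sharedInformerFactory = client.informers();
  // Informer for Pods with a resync period of 30 seconds
  SharedIndexInformer<Pod> podInformer =
      sharedInformerFactory.sharedIndexInformerFor(Pod.class, PodList.class, 30 * 1000L);

  podInformer.addEventHandler(new ResourceEventHandler<Pod>() {
    @Override
    public void onAdd(Pod pod) {
      // Handle creation
    }

    @Override
    public void onUpdate(Pod oldPod, Pod newPod) {
      // Handle update
    }

    @Override
    public void onDelete(Pod pod, boolean deletedFinalStateUnknown) {
      // Handle deletion
    }
  });

  sharedInformerFactory.startAllRegisteredInformers();
  // Note: leaving this try block closes the client, so keep the application running inside it.
}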

You can find a full demo of a simple operator using the Fabric8 SharedInformer API here: PodSet Operator In Java


This workaround worked for me; I hope it helps others. Every time my pod got this “too old resource” error, it halted and restarted itself. I found that if I create the resources manually (in the case of a CRD, even a dummy one), there are almost no “too old resource” exceptions, so the operator stays up and running and listening. So, here is what I did:

  1. At the moment this specific error is happening:
     a. System error (which will restart the pod)
     b. Exception with text "too old resource version"
  2. Created a new dummy CRD object on the platform (before the restart of the pod):
     a. Programmatically (fabric8), check if the dummy CRD exists. If so, delete it.
     b. Programmatically (fabric8), create the dummy CRD again with
  3. Then the pod restarted itself (this restart also happened before my code changes; it is not caused by my code).
  4. When the pod starts up, it creates a secret out of the dummy CRD.
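The detection and recreate steps above can be sketched roughly as follows. This is only an illustration under assumptions: ResourceStore, InMemoryStore and recreateDummyOnError are hypothetical names, and the in-memory store stands in for the Fabric8 client calls, which are not shown in the original post.

```java
import java.util.HashSet;
import java.util.Set;

class DummyResourceWorkaround {

    /** Hypothetical abstraction over the cluster calls that would be made through the client. */
    interface ResourceStore {
        boolean exists(String name);
        void delete(String name);
        void create(String name);
    }

    /** In-memory store so the sketch can run without a cluster. */
    static class InMemoryStore implements ResourceStore {
        private final Set<String> names = new HashSet<>();
        int creations = 0;

        @Override
        public boolean exists(String name) {
            return names.contains(name);
        }

        @Override
        public void delete(String name) {
            names.remove(name);
        }

        @Override
        public void create(String name) {
            names.add(name);
            creations++;
        }
    }

    /** Step 1b: look for the error text anywhere in the exception's cause chain. */
    static boolean isTooOldResourceVersion(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            String message = t.getMessage();
            if (message != null && message.contains("too old resource version")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Step 2: on the specific error, delete the dummy object if it exists and create it again.
     * Returns true if the dummy object was recreated, false for any other error.
     */
    static boolean recreateDummyOnError(Throwable error, ResourceStore store, String dummyName) {
        if (!isTooOldResourceVersion(error)) {
            return false;
        }
        if (store.exists(dummyName)) {
            store.delete(dummyName);
        }
        store.create(dummyName);
        return true;
    }
}
```

A watcher's error callback could call recreateDummyOnError before the pod exits, with a ResourceStore implementation backed by the real client.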

From that point on there were almost no restarts, and the operator was up and running and listening. Just don’t forget to give the operator’s service account permission to create and delete those resources.