
How to achieve JobManager High Availability in a Kubernetes Flink Cluster?


The official doc says that high availability for the JobManager is meant to handle the case where the JobManager crashes. Only a single JobManager is needed, but you want to handle the case where it goes down. On Kubernetes, if the JobManager pod goes down, Kubernetes should detect this and automatically restart it, so you don't need to run multiple replicas of it.

(The doc states this explicitly for YARN-based HA. It doesn't seem to say so for Kubernetes, but restarting failed Pods is standard behaviour for Kubernetes.)

In the official Kubernetes resources, the TaskManager is configured by default to run with multiple replicas (see the `replicas` entries in those resources), but the JobManager is not, and the same is true in the Helm chart. So I believe multiple replicas aren't needed for the JobManager: I'd suggest running with a single JobManager unless you hit specific problems with that.
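As a rough sketch of what the setup above looks like in practice (names, labels, and replica counts here are illustrative, not copied from the official manifests):

```yaml
# jobmanager-deployment.yaml (sketch): a single replica is enough,
# because the Deployment controller recreates the pod if it crashes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 1          # one JobManager; Kubernetes restarts it on failure
  selector:
    matchLabels:
      app: flink
      component: jobmanager
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      containers:
        - name: jobmanager
          image: flink:latest
          args: ["jobmanager"]
---
# taskmanager-deployment.yaml (sketch): scaled out via `replicas`.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-taskmanager
spec:
  replicas: 2          # multiple TaskManagers for parallelism, not for HA
  selector:
    matchLabels:
      app: flink
      component: taskmanager
  template:
    metadata:
      labels:
        app: flink
        component: taskmanager
    spec:
      containers:
        - name: taskmanager
          image: flink:latest
          args: ["taskmanager"]
```

The asymmetry is the point: TaskManagers scale horizontally for throughput, while the JobManager relies on the Deployment controller's restart behaviour rather than on extra replicas.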


I agree with Ryan's answer that Kubernetes fulfils most of the requirements for an HA deployment.

With Kubernetes, you can run a Flink job cluster with file-based HA, instead of using ZooKeeper or its equivalents.
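As a minimal sketch of what file-based HA looks like in `flink-conf.yaml`: the `high-availability` key accepts either a shortcut like `zookeeper` or the fully qualified name of a `HighAvailabilityServicesFactory` implementation, and `high-availability.storageDir` points at the filesystem location where JobManager metadata is persisted. The factory class name and bucket path below are placeholders, not the actual values from the linked repo:

```yaml
# flink-conf.yaml (sketch; class name and path are placeholders)
# Use a file-based HA services factory instead of ZooKeeper.
high-availability: com.example.flink.FileSystemHAFactory
# Directory where JobManager metadata (submitted job graphs, pointers to
# completed checkpoints) is stored, so a restarted JobManager can recover
# running jobs. Any filesystem Flink supports (S3, GCS, HDFS, ...) works.
high-availability.storageDir: s3://my-bucket/flink/ha
```

The storage directory is what makes the restart useful: without persisted metadata, a freshly restarted JobManager would come up empty instead of resuming the jobs that were running.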

You can find an example of how to set it up correctly (with or without HA) in this GitHub repo.

You might also find these blog posts useful; they explain how to correctly deploy a Flink job cluster on Kubernetes and how to achieve file-based high availability without ZooKeeper.

(Disclosure: I wrote the posts and set up the repo)