Kafka in Kubernetes Cluster- How to publish/consume messages from outside of Kubernetes Cluster Kafka in Kubernetes Cluster- How to publish/consume messages from outside of Kubernetes Cluster docker docker

Kafka in Kubernetes Cluster- How to publish/consume messages from outside of Kubernetes Cluster


I had the same problem with accessing kafka from outside of k8s cluster on AWS. I manage to solve this issue by using kafka listeners feature which from version 0.10.2 supports multiple interfaces.

here is how I configured kafka container.

    ports:    - containerPort: 9092    - containerPort: 9093    env:    - name: KAFKA_ZOOKEEPER_CONNECT      value: "zookeeper:2181"    - name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP      value: "INTERNAL_PLAINTEXT:PLAINTEXT,EXTERNAL_PLAINTEXT:PLAINTEXT"    - name: KAFKA_ADVERTISED_LISTENERS      value: "INTERNAL_PLAINTEXT://kafka-internal-service:9092,EXTERNAL_PLAINTEXT://123.us-east-2.elb.amazonaws.com:9093"    - name: KAFKA_LISTENERS      value: "INTERNAL_PLAINTEXT://0.0.0.0:9092,EXTERNAL_PLAINTEXT://0.0.0.0:9093"    - name: KAFKA_INTER_BROKER_LISTENER_NAME      value: "INTERNAL_PLAINTEXT"

Apart from that I configured two Services. One for internal(Headless) & one for external(LoadBalancer) communication.

Hopefully this will save people's time.


I was able to solve my problem by doing the following changes -

  1. Using NodeSelector in YML to make kafka pod run on a particular node of kube cluster.

  2. Set KAFKA_ADVERTISED_HOST_NAME to Kube hostName where this Kafka POD has been configured to run on ( as configured in step 1 )

  3. Expose Kafka Service using NodePort and set POD port same as that of exposed NodePort as shown below -

    spec:  ports:    - name: broker-2      port: **30031**      targetPort: 9092      nodePort: **30031**      protocol: TCP  selector:    app: kafka-2    broker_id: "2"  type: NodePort

Now, you can access Kafka brokers from outside of kube cluster using host:exposedPort


I solved this problem by using Confluent's Kafka REST proxy image.

https://hub.docker.com/r/confluentinc/cp-kafka-rest/

Documentation of the REST Proxy is here:

http://docs.confluent.io/3.1.2/kafka-rest/docs/index.html

Step A: Build a Kafka broker docker image using latest Kafka version

I used a custom built Kafka broker image based on the same image you used. You basically just need to update cloudtrackinc's image to use Kafka version 0.10.1.0 or otherwise it won't work. Just update the Dockerfile from cloudertrackinc's image to use the latest wurstmeister kafka image and rebuild the docker image.

- FROM wurstmeister/kafka:0.10.1.0

I set the ADVERTISED_HOST_NAME for each Kafka broker to POD's IP so each broker gets an unique URL.

- name: ADVERTISED_HOST_NAME  valueFrom:    fieldRef:      fieldPath: status.podIP

Step B: Setup cp-kafka-rest proxy to use your Kafka broker cluster

Kafka Rest Proxy must be running within the same cluster as your Kafka broker cluster.

You need to provide two environment variables to the cp-kafka-rest image at the minimum for it to run. KAFKA_REST_HOST_NAME and KAFKA_REST_ZOOKEEPER_CONNECT. You can set KAFKA_REST_HOST_NAME to use POD's IP.

- name: KAFKA_REST_HOST_NAME  valueFrom:    fieldRef:      fieldPath: status.podIP- name: KAFKA_REST_ZOOKEEPER_CONNECT  value: "zookeeper-svc-1:2181,zookeeper-svc-2:2181,zookeeper-svc-3:2181"

Step C: Expose the Kafka REST proxy as a service

spec: type: NodePort or LoadBalancer ports: - name: kafka-rest-port port: 8082 protocol: TCP

You can use NodePort or LoadBalancer to utilize single or multiple Kafka REST Proxy pods.

Pros and Cons of using Kafka REST proxy

Pros:

  1. You can scale Kafka broker cluster easily
  2. You do not have to expose Kakfa brokers outside of the cluster
  3. You can use the loadbalancer with the Proxy.
  4. You can use any type of client to access the Kafka cluster (i.e. curl). Very light weight.

Cons:

  1. Another component/layer on top of the Kakfa cluster.
  2. Consumers are created within the proxy pod. This will need to be keep track of by your REST client.
  3. Performance is not ideal: REST instead of native Kafka protocol. Although if you deploy multiple proxies this might help a little bit. I would not use this setup for high volume traffic. For low volume message traffic this might be okay.

So if you can live with the issues above, then give Kafka Rest Proxy a try.