I can't get through to Hadoop server from Hadoop client

Tags: kubernetes


You can check my other answer. HDFS is not production-ready on K8s yet (as of this writing).

The namenode gives the client the IP addresses of the datanodes, which it learns when they join the cluster, as shown below:

[Image: datanodes]

The issue in K8s is that you would have to expose each datanode as a service or external IP, but the namenode sees the datanodes by their pod IP addresses, which are not reachable from outside the cluster. Also, HDFS doesn't provide a per-datanode "publish IP" setting that would let you force it to advertise a service IP, so you'll have to do fancy custom networking, or your client has to be inside the podCidr (which kind of defeats the purpose of HDFS being a distributed filesystem).
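To make the reachability problem concrete, here is a minimal sketch that checks whether the datanode addresses handed out by the namenode fall inside a network the client can route to. The pod CIDR, the client network, and the datanode IPs are all made-up values for illustration, and `reachable_from_client` and `unreachable` are hypothetical helpers, not part of HDFS or Kubernetes.

```python
import ipaddress
from typing import List


def reachable_from_client(datanode_ip: str, client_networks: List[str]) -> bool:
    """Return True if datanode_ip lies in a network the client can route to."""
    addr = ipaddress.ip_address(datanode_ip)
    return any(addr in ipaddress.ip_network(net) for net in client_networks)


def unreachable(datanodes: List[str], client_networks: List[str]) -> List[str]:
    """List the datanode IPs the client cannot reach."""
    return [ip for ip in datanodes if not reachable_from_client(ip, client_networks)]


# Assumed values, for illustration only.
POD_CIDR = "10.244.0.0/16"                          # hypothetical cluster pod network
reported_datanodes = ["10.244.1.17", "10.244.2.23"]  # pod IPs the namenode would report

external_client = ["192.168.1.0/24"]                # client outside the cluster
in_cluster_client = ["192.168.1.0/24", POD_CIDR]    # client with a route into the pod network

if __name__ == "__main__":
    print(unreachable(reported_datanodes, external_client))    # both datanodes listed
    print(unreachable(reported_datanodes, in_cluster_client))  # empty list
```

With these assumed values, the external client cannot reach either datanode, while a client that can route into the pod network reaches both.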


If you need the IP of the node where the pod is running, you can expose it to the container through an environment variable:

    apiVersion: v1
    kind: Pod
    metadata:
      name: get-host-ip
    spec:
      containers:
        - name: test-container
          image: k8s.gcr.io/busybox
          command: [ "sh", "-c"]
          args:
          - while true; do
              printenv HOST_IP;
            done;
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
      restartPolicy: Never

API docs: PodStatus v1 core
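As a rough sketch of how an application inside that container might consume the value, the snippet below reads `HOST_IP` (the variable name from the pod spec above) and validates it. `get_host_ip` is a hypothetical helper introduced here for illustration; it assumes the variable, when set, holds a single IP address.

```python
import ipaddress
import os


def get_host_ip(env=os.environ):
    """Return the node IP injected as HOST_IP, or None if the variable is unset.

    Raises ValueError if the value is not a valid IP address.
    """
    raw = env.get("HOST_IP")
    if raw is None:
        return None
    return ipaddress.ip_address(raw.strip())


if __name__ == "__main__":
    host_ip = get_host_ip()
    if host_ip is None:
        print("HOST_IP is not set; is the fieldRef present in the pod spec?")
    else:
        print(f"Running on node {host_ip}")
```

Passing the environment as a parameter keeps the helper easy to exercise outside a cluster, for example `get_host_ip({"HOST_IP": "192.168.1.10"})`.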