how is cluster IP in kubernetes-aws configured? how is cluster IP in kubernetes-aws configured? kubernetes kubernetes

how is cluster IP in kubernetes-aws configured?


Mark Betz (SRE at Olark) presents Kubernetes networking in three articles:

For a pod, you are looking at:

pod network

You find:

  • etho0: a "physical network interface"
  • docker0/cbr0: a bridge for connecting two ethernet segments no matter their protocol.
  • veth0, 1, 2: Virtual Network Interface, one per container.
    docker0 is the default Gateway of veth0. It is named cbr0 for "custom bridge".
    Kubernetes starts containers by sharing the same veth0, which means each container must expose different ports.
  • pause: a special container started in "pause", to detect SIGTERM sent to a pod, and forward it to the containers.
  • node: an host
  • cluster: a group of nodes
  • router/gateway

The last element is where things start to be more complex:

Kubernetes assigns an overall address space for the bridges on each node, and then assigns the bridges addresses within that space, based on the node the bridge is built on.
Secondly, it adds routing rules to the gateway at 10.100.0.1 telling it how packets destined for each bridge should be routed, i.e. which node’s eth0 the bridge can be reached through.

Such a combination of virtual network interfaces, bridges, and routing rules is usually called an overlay network.

When a pod contacts another pod, it goes through a service.
Why?

Pod networking in a cluster is neat stuff, but by itself it is insufficient to enable the creation of durable systems. That’s because pods in Kubernetes are ephemeral.
You can use a pod IP address as an endpoint but there is no guarantee that the address won’t change the next time the pod is recreated, which might happen for any number of reasons.

That means: you need a reverse-proxy/dynamic load-balancer. And it better be resilient.

A service is a type of kubernetes resource that causes a proxy to be configured to forward requests to a set of pods.
The set of pods that will receive traffic is determined by the selector, which matches labels assigned to the pods when they were created

That service uses its own network. By default, its type is "ClusterIP"; it has its own IP.

Here is the communication path between two pods:

two pods network

It uses a kube-proxy.
This proxy uses itself a netfilter.

netfilter is a rules-based packet processing engine.
It runs in kernel space and gets a look at every packet at various points in its life cycle.
It matches packets against rules and when it finds a rule that matches it takes the specified action.
Among the many actions it can take is redirecting the packet to another destination.

kube-proxy and netfilter

In this mode, kube-proxy:

  • opens a port (10400 in the example above) on the local host interface to listen for requests to the test-service,
  • inserts netfilter rules to reroute packets destined for the service IP to its own port, and
  • forwards those requests to a pod on port 8080.

That is how a request to 10.3.241.152:80 magically becomes a request to 10.0.2.2:8080.
Given the capabilities of netfilter all that’s required to make this all work for any service is for kube-proxy to open a port and insert the correct netfilter rules for that service, which it does in response to notifications from the master api server of changes in the cluster.

But:

There’s one more little twist to the tale.
I mentioned above that user space proxying is expensive due to marshaling packets. In kubernetes 1.2, kube-proxy gained the ability to run in iptables mode.

In this mode, kube-proxy mostly ceases to be a proxy for inter-cluster connections, and instead delegates to netfilter the work of detecting packets bound for service IPs and redirecting them to pods, all of which happens in kernel space.
In this mode kube-proxy’s job is more or less limited to keeping netfilter rules in sync.

The network schema becomes:

netfilter in action

However, this is not a good fit for external (public facing) communication, which needs an external fixed IP.

You have dedicated services for that: nodePort and LoadBalancer:

A service of type NodePort is a ClusterIP service with an additional capability: it is reachable at the IP address of the node as well as at the assigned cluster IP on the services network.
The way this is accomplished is pretty straightforward:

When kubernetes creates a NodePort service, kube-proxy allocates a port in the range 30000–32767 and opens this port on the eth0 interface of every node (thus the name “NodePort”).

Connections to this port are forwarded to the service’s cluster IP.

You get:

load-balancer / nodeport

A Loadalancer is more advancer, and allows to expose services using stand ports.
See the mapping here:

$ kubectl get svc service-testNAME      CLUSTER-IP      EXTERNAL-IP     PORT(S)          AGEopenvpn   10.3.241.52     35.184.97.156   80:32213/TCP     5m

However:

Services of type LoadBalancer have some limitations.

  • You cannot configure the lb to terminate https traffic.
  • You can’t do virtual hosts or path-based routing, so you can’t use a single load balancer to proxy to multiple services in any practically useful way.

These limitations led to the addition in version 1.2 of a separate kubernetes resource for configuring load balancers, called an Ingress.

The Ingress API supports TLS termination, virtual hosts, and path-based routing. It can easily set up a load balancer to handle multiple backend services.
The implementation follows a basic kubernetes pattern: a resource type and a controller to manage that type.
The resource in this case is an Ingress, which comprises a request for networking resources

For instance:

apiVersion: extensions/v1beta1kind: Ingressmetadata:  name: test-ingress  annotations:    kubernetes.io/ingress.class: "gce"spec:  tls:    - secretName: my-ssl-secret  rules:  - host: testhost.com    http:      paths:      - path: /*        backend:          serviceName: service-test          servicePort: 80

The ingress controller is responsible for satisfying this request by driving resources in the environment to the necessary state.
When using an Ingress you create your services as type NodePort and let the ingress controller figure out how to get traffic to the nodes.

There are ingress controller implementations for GCE load balancers, AWS elastic load balancers, and for popular proxies such as NGiNX and HAproxy.