In a Kubernetes cluster, is there a way to migrate etcd from external to internal?
As I understand it, you have a 3-member etcd cluster that is external from the Kubernetes cluster's perspective, and the expected outcome is to have all three members running on the Kubernetes master nodes. Some information is left undisclosed, so I'll try to explain several possible options.
First of all, there are several reasonable ways to run the etcd process as the Kubernetes control-plane key-value storage:

- etcd run as a static pod, with the startup configuration in the `/etc/kubernetes/manifests/etcd.yaml` file
- etcd run as a system service defined in `/etc/systemd/system/etcd.service` or a similar file
- etcd run as a Docker container configured using command line options (this solution is not really safe, unless you can make the container restart after a failure or host reboot)
For experimental purposes, you can also run etcd:

- as a simple process in Linux userspace
- as a StatefulSet in the Kubernetes cluster
- as an etcd cluster managed by etcd-operator.
My personal recommendation is a 5-member etcd cluster: 3 members run as static pods on the 3 Kubernetes master nodes, and two more run as static pods on external (Kubernetes-cluster-independent) hosts. In this case you will still have quorum if at least one master node is running, or if you lose both external nodes for any reason.
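As a quick sanity check, the quorum of an n-member etcd cluster is floor(n/2) + 1, which is easy to compute:

```shell
# Quorum (majority) of an n-member etcd cluster is floor(n/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

echo "3 members: quorum $(quorum 3), tolerates $(( 3 - $(quorum 3) )) failed member(s)"
echo "5 members: quorum $(quorum 5), tolerates $(( 5 - $(quorum 5) )) failed member(s)"
# prints:
# 3 members: quorum 2, tolerates 1 failed member(s)
# 5 members: quorum 3, tolerates 2 failed member(s)
```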
There are at least two ways to migrate an etcd cluster from external instances to the Kubernetes master nodes. The process works in the opposite direction too.
Migration
This is a fairly straightforward way to migrate the cluster. During this procedure, members are turned off (one at a time), moved to another host, and started again. Your cluster shouldn't have any problems as long as you keep quorum in the etcd cluster. My recommendation is to have at least a 3-member, or better a 5-member, etcd cluster to make the migration safer. For bigger clusters it may be more convenient to use the other solution from my second answer.
The process of moving etcd member to another IP address is described in the official documentation:
To migrate a member:
- Stop the member process.
- Copy the data directory of the now-idle member to the new machine.
- Update the peer URLs for the replaced member to reflect the new machine according to the runtime reconfiguration instructions.
- Start etcd on the new machine, using the same configuration and the copy of the data directory.
Now let's look closer on each step:
0.1 Ensure your etcd cluster is healthy and all members are in good condition. I would also recommend checking the logs of all etcd members, just in case.
(To successfully run the following commands, please refer to step 3 for the auth variables and aliases.)
```
# the following commands are supposed to be run with root privileges,
# because the certificates are not accessible by a regular user;
# the last two commands only show the members specified by the --endpoints option
e2 cluster-health
e3 endpoint health
e3 endpoint status
```
0.2 Check each etcd member's configuration and find out where the etcd data-dir is located, then ensure that it will remain accessible after the etcd process terminates. In most cases it's located under /var/lib/etcd on the host machine and is either used directly or mounted as a volume into the etcd pod or Docker container.
0.3 Create a snapshot of each etcd cluster member; it's better to have one and not need it than the other way around.
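For example, with the `e3` alias from step 3, a snapshot could be taken like this (the backup directory and the `snapshot_path` helper are made up for illustration):

```shell
# Hypothetical helper: build a dated snapshot path for a member
# (the /var/backups/etcd directory is an assumption, not from the original answer).
snapshot_path() { printf '/var/backups/etcd/%s-%s.db' "$1" "$(date +%Y%m%d)"; }

echo "$(snapshot_path kube-etcd1)"

# With the e3 alias from step 3, the snapshot itself would then be taken
# against a single member endpoint:
#   e3 --endpoints https://10.128.0.13:2379 snapshot save "$(snapshot_path kube-etcd1)"
```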
1. Stop etcd member process.
If you use `kubelet` to start etcd, as recommended here, move the `etcd.yaml` file out of `/etc/kubernetes/manifests/`. Right after that, the etcd Pod will be terminated by `kubelet`:

```
sudo mv /etc/kubernetes/manifests/etcd.yaml ~/
sudo chmod 644 ~/etcd.yaml
```
If you start the etcd process as a systemd service, you can stop it using the following command:
sudo systemctl stop etcd-service-name.service
In case of docker container you can stop it using the following command:
```
docker ps -a
docker stop <etcd_container_id>
docker rm <etcd_container_id>
```
If you run the etcd process from the command line, you can kill it using the following command:
kill `pgrep etcd`
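To summarize, the right stop action depends on how the member was started; a tiny dispatch sketch (the mode names are made up here, and the commands are only printed, not executed):

```shell
# Print (not run) the stop action for a given etcd run mode.
# The mode names and the service/container placeholders are illustrative.
stop_cmd() {
  case "$1" in
    static-pod) echo 'sudo mv /etc/kubernetes/manifests/etcd.yaml ~/' ;;
    systemd)    echo 'sudo systemctl stop etcd-service-name.service' ;;
    docker)     echo 'docker stop <etcd_container_id>' ;;
    process)    echo 'kill $(pgrep etcd)' ;;
    *)          echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

stop_cmd systemd    # prints: sudo systemctl stop etcd-service-name.service
```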
2. Copy the data directory of the now-idle member to the new machine.
Not much complexity here. Archive the etcd data-dir into a file and copy it to the destination instance. I also recommend copying the etcd manifest or systemd service configuration if you plan to run etcd on the new instance in the same way.
```
tar -C /var/lib -czf etcd-member-name-data.tar.gz etcd
tar -czf etcd-member-name-conf.tar.gz [etcd.yaml] [/etc/systemd/system/etcd.service] [/etc/kubernetes/manifests/etcd.conf ...]
scp etcd-member-name-data.tar.gz destination_host:~/
scp etcd-member-name-conf.tar.gz destination_host:~/
```
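Before shipping the archive, it's worth verifying that it round-trips intact; a self-contained sketch using a temporary directory as a stand-in for the real /var/lib/etcd:

```shell
# Simulate the data-dir layout with temp directories (stand-in for /var/lib/etcd).
src=$(mktemp -d); dst=$(mktemp -d)
mkdir -p "$src/etcd/member/snap"
echo "dummy-db" > "$src/etcd/member/snap/db"

# Same tar invocation as above, pointed at the temp copy.
tar -C "$src" -czf "$src/etcd-member-name-data.tar.gz" etcd
tar -xzf "$src/etcd-member-name-data.tar.gz" -C "$dst"

# The extracted tree must match the original byte for byte.
diff -r "$src/etcd" "$dst/etcd" && echo "archive OK"
# prints: archive OK
```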
3. Update the peer URLs for the replaced member to reflect the new member IP address according to the runtime reconfiguration instructions.
There are two ways to do it: by using the etcd API or by running the `etcdctl` utility.

That's how the `etcdctl` way may look (replace the etcd endpoint variables with the correct etcd cluster member IP addresses):
```
# all etcd cluster members should be specified
export ETCDSRV="--endpoints https://etcd.ip.addr.one:2379,https://etcd.ip.addr.two:2379,https://etcd.ip.addr.three:2379"

# authentication parameters for v2 and v3 etcdctl APIs
export ETCDAUTH2="--ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key"
export ETCDAUTH3="--cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/peer.crt --key=/etc/kubernetes/pki/etcd/peer.key"

# etcdctl API v3 alias
alias e3="ETCDCTL_API=3 etcdctl $ETCDAUTH3 $ETCDSRV"
# etcdctl API v2 alias
alias e2="ETCDCTL_API=2 etcdctl $ETCDAUTH2 $ETCDSRV"

# list all etcd cluster members and their IDs
e2 member list

e2 member update member_id http://new.etcd.member.ip:2380
# or
e3 member update member_id --peer-urls="https://new.etcd.member.ip:2380"
```
That's how the etcd API way may look:
```
export CURL_ETCD_AUTH="--cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt"

curl https://healthy.etcd.instance.ip:2379/v2/members/member_id -XPUT \
  -H "Content-Type: application/json" \
  -d '{"peerURLs":["http://new.etcd.member.ip:2380"]}' ${CURL_ETCD_AUTH}
```
4. Start etcd on the new machine, using the adjusted configuration and the copy of the data directory.
Unpack etcd data-dir on the new host:
tar -xzf etcd-member-name-data.tar.gz -C /var/lib/
Adjust the etcd startup configuration according to your needs. At this point it's easy to choose another way to run etcd. Depending on your choice, prepare a manifest or service definition file and replace the old IP address in it with the new one. E.g.:
sed -i 's/\/10.128.0.12:/\/10.128.0.99:/g' etcd.yaml
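The substitution can be checked on a sample manifest line before touching the real file (the addresses are the example ones used in this answer):

```shell
# One line of an etcd.yaml manifest, as a sample input for the sed expression.
line='    - --advertise-client-urls=https://10.128.0.12:2379'

# Same substitution as above, applied to the sample line instead of the file.
echo "$line" | sed 's/\/10.128.0.12:/\/10.128.0.99:/g'
# prints:     - --advertise-client-urls=https://10.128.0.99:2379
```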
Now it's time to start etcd by moving `etcd.yaml` to `/etc/kubernetes/manifests/`, or by running the following command (if you run etcd as a `systemd` service):
sudo systemctl start etcd-service-name.service
5. Check the updated etcd process logs and the etcd cluster health to ensure that the member is healthy.
To do that you can use the following commands:
```
$ e2 cluster-health
$ kubectl logs etcd_pod_name -n kube-system
$ docker logs etcd_container_id 2>&1 | less
$ journalctl -e -u etcd_service_name
```
The second solution I've mentioned in another answer is
Growing and then shrinking etcd cluster
The downside of this method is that the etcd quorum size is temporarily increased, and in case of failure of several nodes the etcd cluster may break. To avoid that, you may want to remove one existing etcd cluster member before adding another one.
Here is the brief overview of the process:
1. Generate certificates for all additional members using the etcd `ca.crt` and `ca.key` from an existing etcd node folder (`/etc/kubernetes/pki/etcd/`).
2. Add the new member to the cluster using the `etcdctl` command.
3. Create the etcd config for the new member.
4. Start the new etcd member using the new keys and config.
5. Check cluster health.
6. Repeat steps 2-5 until all required etcd nodes are added.
7. Remove one excess etcd cluster member using the `etcdctl` command.
8. Check cluster health.
9. Repeat steps 7-8 until the desired size of the etcd cluster is achieved.
10. Adjust all `etcd.yaml` files for all etcd cluster members.
11. Adjust the etcd endpoints in all `kube-apiserver.yaml` manifests.
Another possible sequence:
1. Generate certificates for all additional members using the etcd `ca.crt` and `ca.key` from an existing etcd node folder (`/etc/kubernetes/pki/etcd/`).
2. Remove one etcd cluster member using the `etcdctl` command.
3. Add a new member to the cluster using the `etcdctl` command.
4. Create the etcd config for the new member.
5. Start the new etcd member using the new keys and config.
6. Check cluster health.
7. Repeat steps 2-6 until the required etcd configuration is achieved.
8. Adjust all `etcd.yaml` files for all etcd cluster members.
9. Adjust the etcd endpoints in all `kube-apiserver.yaml` manifests.
How to generate certificates:
- using kubeadm command (manual)
- using cfssl tool (Kubernetes the hard way guide)
- using openssl (link1, link2)
Note: If you have an etcd cluster, you likely have the etcd CA certificate somewhere. Consider using it along with the etcd CA key to generate certificates for all additional etcd members.
Note: In case you choose to generate certificates manually, the usual Kubernetes certificate parameters are:
- Signature Algorithm: sha256WithRSAEncryption
- Public Key Algorithm: rsaEncryption
- RSA Public-Key: (2048 bit)
- CA certs age: 10 years
- other certs age: 1 year
You can check the content of the certificates using the following command:
find /etc/kubernetes/pki/ -name '*.crt' | xargs -l bash -c 'echo $0 ; openssl x509 -in $0 -text -noout'
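The same inspection loop can be tried safely against a throwaway self-signed certificate (generated here in a temp dir, so /etc/kubernetes/pki is not touched; the CN is made up):

```shell
# Generate a disposable self-signed cert to demonstrate the inspection pipeline.
dir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 365 -subj "/CN=etcd-test" \
  -keyout "$dir/test.key" -out "$dir/test.crt" 2>/dev/null

# Same find | xargs pattern as above, printing the path and the cert subject.
find "$dir" -name '*.crt' | xargs -I{} bash -c 'echo {} ; openssl x509 -in {} -noout -subject'
```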
How to remove a member from the etcd cluster
(Please refer to my other answer, step 3, for the variable and alias definitions.)
```
e3 member list
b67816d38b8e9d2, started, kube-ha-m3, https://10.128.0.12:2380, https://10.128.0.12:2379
3de72bd56f654b1c, started, kube-ha-m1, https://10.128.0.10:2380, https://10.128.0.10:2379
ac98ece88e3519b5, started, kube-etcd2, https://10.128.0.14:2380, https://10.128.0.14:2379
cfb0839e8cad4c8f, started, kube-ha-m2, https://10.128.0.11:2380, https://10.128.0.11:2379
eb9b83c725146b96, started, kube-etcd1, https://10.128.0.13:2380, https://10.128.0.13:2379
401a166c949e9584, started, kube-etcd3, https://10.128.0.15:2380, https://10.128.0.15:2379  # Let's remove this one

e2 member remove 401a166c949e9584
```
The member will shut down instantly. To prevent further attempts to join the cluster, move/delete etcd.yaml from /etc/kubernetes/manifests/, or shut down the etcd service on the etcd member's node.
How to add a member to the etcd cluster
e3 member add kube-etcd3 --peer-urls="https://10.128.0.16:2380"
The output shows the parameters required to start the new etcd cluster member, e.g.:
```
ETCD_NAME="kube-etcd3"
ETCD_INITIAL_CLUSTER="kube-ha-m3=https://10.128.0.15:2380,kube-ha-m1=https://10.128.0.10:2380,kube-etcd2=https://10.128.0.14:2380,kube-ha-m2=https://10.128.0.11:2380,kube-etcd1=https://10.128.0.13:2380,kube-etcd3=https://10.128.0.16:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.128.0.16:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
```
Note: The `ETCD_INITIAL_CLUSTER` variable contains all existing etcd cluster members plus the new node. If you need to add several nodes, it should be done one node at a time.
Note: All `ETCD_INITIAL_*` variables and the corresponding command line parameters are only required for the first start of the etcd Pod. After the node has been successfully added to the etcd cluster, these parameters are ignored and can be removed from the startup configuration. All required information is stored in the etcd database file in the `/var/lib/etcd` folder.
The default `etcd.yaml` manifest can be generated using the following kubeadm command:
kubeadm init phase etcd local
It's better to move the `etcd.yaml` file from `/etc/kubernetes/manifests/` somewhere else to make the adjustments.
Also, delete the content of the `/var/lib/etcd` folder: it contains the data of a new etcd cluster, so it can't be used to add the member to the existing cluster.
Then adjust the manifest according to the `member add` command output (`--advertise-client-urls`, `--initial-advertise-peer-urls`, `--initial-cluster`, `--initial-cluster-state`, `--listen-client-urls`, `--listen-peer-urls`). E.g.:
```
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://10.128.0.16:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://10.128.0.16:2380
    - --initial-cluster=kube-ha-m3=https://10.128.0.15:2380,kube-ha-m1=https://10.128.0.10:2380,kube-etcd2=https://10.128.0.14:2380,kube-ha-m2=https://10.128.0.11:2380,kube-etcd1=https://10.128.0.13:2380,kube-etcd3=https://10.128.0.16:2380
    - --initial-cluster-state=existing
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://10.128.0.16:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://10.128.0.16:2380
    - --name=kube-etcd3
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.3.10
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
```
After saving the file, kubelet will restart the etcd pod. Check the etcd container logs to ensure it has joined the cluster.
How to check cluster health
```
$ e2 cluster-health
member b67816d38b8e9d2 is healthy: got healthy result from https://10.128.0.15:2379
member 3de72bd56f654b1c is healthy: got healthy result from https://10.128.0.10:2379
member ac98ece88e3519b5 is healthy: got healthy result from https://10.128.0.14:2379
member cfb0839e8cad4c8f is healthy: got healthy result from https://10.128.0.11:2379
member eb9b83c725146b96 is healthy: got healthy result from https://10.128.0.13:2379
cluster is healthy

$ e2 member list
b67816d38b8e9d2: name=kube-ha-m3 peerURLs=https://10.128.0.15:2380 clientURLs=https://10.128.0.15:2379 isLeader=true
3de72bd56f654b1c: name=kube-ha-m1 peerURLs=https://10.128.0.10:2380 clientURLs=https://10.128.0.10:2379 isLeader=false
ac98ece88e3519b5: name=kube-etcd2 peerURLs=https://10.128.0.14:2380 clientURLs=https://10.128.0.14:2379 isLeader=false
cfb0839e8cad4c8f: name=kube-ha-m2 peerURLs=https://10.128.0.11:2380 clientURLs=https://10.128.0.11:2379 isLeader=false
eb9b83c725146b96: name=kube-etcd1 peerURLs=https://10.128.0.13:2380 clientURLs=https://10.128.0.13:2379 isLeader=false

$ e3 endpoint health
# the output includes only the etcd members specified in the --endpoints cli option
# or the corresponding environment variable; I've included only three out of five members
https://10.128.0.13:2379 is healthy: successfully committed proposal: took = 2.310436ms
https://10.128.0.15:2379 is healthy: successfully committed proposal: took = 1.795723ms
https://10.128.0.14:2379 is healthy: successfully committed proposal: took = 2.41462ms

$ e3 endpoint status
# the output includes only the etcd members specified in the --endpoints cli option
# or the corresponding environment variable; I've included only three out of five members
https://10.128.0.13:2379 is healthy: successfully committed proposal: took = 2.531676ms
https://10.128.0.15:2379 is healthy: successfully committed proposal: took = 1.285312ms
https://10.128.0.14:2379 is healthy: successfully committed proposal: took = 2.266932ms
```
How to check etcd Pod logs without using kubectl?
If you run the etcd member using kubelet only, you can check its logs using the following command:
docker logs `docker ps -a | grep etcd | grep -v pause | awk '{print $1}' | head -n1` 2>&1 | less
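The grep/awk pipeline can be sanity-checked against canned `docker ps` output (the sample lines below are made up to mimic the etcd container and its pause sandbox on a kubelet-managed node):

```shell
# Two fake `docker ps -a` lines: an etcd container and its pause sandbox.
sample='abc123  k8s.gcr.io/etcd:3.3.10   "etcd"    Up 2 hours  k8s_etcd_etcd-node1
def456  k8s.gcr.io/pause:3.1     "/pause"  Up 2 hours  k8s_POD_etcd-node1'

# Same filter chain as above: keep etcd lines, drop the pause sandbox, take the first ID.
echo "$sample" | grep etcd | grep -v pause | awk '{print $1}' | head -n1
# prints: abc123
```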
Note: Usually only one etcd Pod can run on a node at a time, because it uses the database in the host directory /var/lib/etcd/, which cannot be shared between two pods. The etcd Pod also uses the node's network interface to communicate with the etcd cluster.
Of course, you can configure an etcd Pod to use a different host directory and different host ports as a workaround, but the command above assumes that only one etcd Pod is present on the node.