ETCD pods enter CrashLoopBackOff and log the following error message.
# kubectl -n kube-system logs etcd-${CLUSTER_NAME}-control-plane-xxxx-xxxxx
{"level":"fatal","ts":"2024-10-29T05:31:33.06993Z","caller":"etcdmain/etcd.go:204","msg":"discovery failed","error":"listen tcp ##.##.##.##:2380: bind: cannot assign requested address","stacktrace":"go.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:204\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\tgo.etcd.io/etcd/server/v3/etcdmain/main.go:40\nmain.main\n\tgo.etcd.io/etcd/server/v3/main.go:31\nruntime.main\n\truntime/proc.go:267"}
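The address that ETCD failed to bind can be extracted from this log line with a small filter. The following is a minimal sketch, not part of the original procedure. The function name `failed_bind_address` is hypothetical, and the filter assumes the message has exactly the form shown above.

```shell
# Print the address:port that could not be bound, reading ETCD log lines on stdin.
# Assumes the message text "listen tcp <addr>:<port>: bind: cannot assign requested address".
failed_bind_address() {
  sed -n 's/.*listen tcp \([^ ]*\): bind: cannot assign requested address.*/\1/p'
}

# Usage (pod name as in the command above):
#   kubectl -n kube-system logs etcd-${CLUSTER_NAME}-control-plane-xxxx-xxxxx | failed_bind_address
```

The printed address is the one ETCD expects to find on the node, so it can be compared against the node's current address.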
If kubectl is unavailable, you can read the ETCD container log directly on the control-plane node.
# SSH to the control-plane node
kubectl get nodes -owide
ssh capv@${CONTROL_PLANE_NODE_IPADDRESS}
sudo -i
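# Check ETCD pod log (-a includes exited containers, because a crash-looping container is often not running)
crictl logs $(crictl ps -a -q --name etcd | head -n 1)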
This issue occurs in Tanzu Kubernetes Grid.
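The control-plane node's IP address changed unexpectedly from its original value. ETCD is still configured to listen on the original address, which is no longer assigned to the node, so the bind fails. ETCD doesn't support changing the IP address.
Revert the control-plane node's IP address to the original. If the node obtains its address through DHCP, review the DHCP configuration.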
1. Check for an IP address mismatch
Confirm the original IP address on each of the 3 control-plane nodes.
# Check the original IP address
sudo grep -E initial-advertise /etc/kubernetes/manifests/etcd.yaml
#> - --initial-advertise-peer-urls=https://<ORIGINAL_IP_ADDRESS>:2380
# Check the current IP address
ip addr show dev eth0
# If the current IP address and the original IP address do not match, go to the next step.
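The comparison in step 1 can also be scripted. The following is a minimal sketch under stated assumptions: the function names `original_etcd_ip` and `ip_is_assigned` are hypothetical, the manifest is assumed to contain an IPv4 peer URL on port 2380 as shown above, and the address list is assumed to come from `ip -o -4 addr show`.

```shell
# Print the IPv4 address from the --initial-advertise-peer-urls flag.
# $1: path to the ETCD static pod manifest.
original_etcd_ip() {
  sed -n 's|.*--initial-advertise-peer-urls=https://\([0-9.]*\):2380.*|\1|p' "$1" | head -n 1
}

# Succeed if address $1 appears in `ip -o -4 addr show` style output on stdin.
ip_is_assigned() {
  grep -qF "inet $1/"
}

# Usage on a control-plane node (as root):
#   orig=$(original_etcd_ip /etc/kubernetes/manifests/etcd.yaml)
#   if ip -o -4 addr show | ip_is_assigned "$orig"; then
#     echo "OK: $orig is assigned to this node"
#   else
#     echo "MISMATCH: $orig is not assigned to this node"
#   fi
```

Checking all interfaces rather than only eth0 avoids a false mismatch if the interface has a different name on a given node.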
2. Check DHCP server configuration
Fix the DHCP configuration so that each control-plane node is assigned its original IP address.
3. Reboot the control-plane nodes
After about 5 to 8 minutes, all ETCD pods should be up.
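Recovery can be confirmed once kubectl responds again. The following is a minimal sketch: the function name `count_not_running` is hypothetical, and the label selector `component=etcd` is an assumption based on the labels kubeadm normally places on its ETCD static pods.

```shell
# Count pods whose STATUS column is not "Running".
# Reads `kubectl get pods --no-headers` output on stdin (NAME READY STATUS ...).
# Prints 0 for empty input, so first confirm that the selector returns pods.
count_not_running() {
  awk '$3 != "Running" { n++ } END { print n + 0 }'
}

# Usage:
#   kubectl -n kube-system get pods -l component=etcd --no-headers | count_not_running
```

A result of 0 with three pods listed suggests that all ETCD members have started. The sketch checks only the STATUS column, not the READY column.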
If the situation is not resolved, please raise a new support case.
If the TKG version is v2.5.x, consider using the Node IPAM solution, which can prevent ETCD failures caused by DHCP address changes.