kubectl commands are not working, and the Kubernetes cluster is in a stalled state. The Kubernetes API server container is in an "exited" state, and attempts to connect to the KubeVIP are unsuccessful.
Tanzu Kubernetes Grid 2.x
A mismatch was identified between the IPs in the etcd.yaml file and the IPs assigned to the control plane VMs. The IPs originally assigned to the control plane nodes were changed, which caused a loss of the etcd quorum. This disruption led to the Kubernetes API server becoming unavailable, resulting in failed kubectl commands.
Error -- couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused
find /var -name etcdctl
/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/13/fs/usr/local/bin/etcdctl
Create an alias for etcdctl:
alias etcdctl=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/13/fs/usr/local/bin/etcdctl
alias etcdctl="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/*/fs/usr/local/bin/etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt"
etcdctl member list
The output of "etcdctl member list" does not show any members, and the "crictl ps -a" command shows the kube-apiserver container in an exited state.
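The wildcard in the snapshot path of the alias above can match more than one snapshot directory, in which case the alias would not resolve to a single binary. The following is a minimal sketch of one way to pick a single etcdctl path before running the member list. The helper name find_etcdctl is hypothetical and not part of any Tanzu or etcd tooling; the certificate paths are the ones used in the alias above.

```shell
# Hypothetical helper: print the first executable file named etcdctl found
# under the given directory (defaults to the containerd data directory).
find_etcdctl() {
  local root="${1:-/var/lib/containerd}"
  find "$root" -type f -name etcdctl -perm -u+x 2>/dev/null | sort | head -n 1
}

# Example use on a control plane node (left commented out on purpose):
# ETCDCTL="$(find_etcdctl)"
# "$ETCDCTL" \
#   --cert /etc/kubernetes/pki/etcd/peer.crt \
#   --key /etc/kubernetes/pki/etcd/peer.key \
#   --cacert /etc/kubernetes/pki/etcd/ca.crt \
#   member list
```

If find_etcdctl prints nothing, no executable etcdctl was found under the search directory, and the find command shown earlier can be used to search a wider path such as /var.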
We checked the cloud-init output log (cat /var/log/cloud-init-output.log) to confirm the original IPs assigned to the control plane VMs. In this case, the initial IPs were 10.10.10.21, 10.10.10.22, and 10.10.10.23.
However, the output of kubectl get nodes -o wide showed the IPs as 10.10.10.21, 10.10.10.24, and 10.10.10.25, and the "--listen-peer-urls" field in the etcd.yaml file located at "/etc/kubernetes/manifests/etcd.yaml" reflected these same mismatched IPs.
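The comparison described above can be scripted on each control plane node. The following is a minimal sketch, assuming the manifest contains a single "--listen-peer-urls" line with IPv4 addresses. The function names manifest_peer_ips and ip_matches_manifest are hypothetical helpers introduced for illustration.

```shell
# Hypothetical helper: print the unique IPv4 addresses found on the
# --listen-peer-urls line of an etcd static pod manifest.
manifest_peer_ips() {
  local manifest="${1:-/etc/kubernetes/manifests/etcd.yaml}"
  grep -- '--listen-peer-urls' "$manifest" \
    | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' \
    | sort -u
}

# Hypothetical helper: return 0 if the given IP appears among the
# manifest's peer URL addresses, 1 otherwise.
ip_matches_manifest() {
  local ip="$1" manifest="$2"
  manifest_peer_ips "$manifest" | grep -qxF -- "$ip"
}

# Example use on a control plane node (left commented out on purpose).
# Taking the first address from "hostname -I" is an assumption; pick the
# address of the node's primary interface if it has several.
# node_ip="$(hostname -I | awk '{print $1}')"
# ip_matches_manifest "$node_ip" /etc/kubernetes/manifests/etcd.yaml \
#   || echo "IP $node_ip does not match the etcd manifest"
```

The same address can then be compared by hand against the original IP recorded in /var/log/cloud-init-output.log for that node.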
The mismatch between the assigned IPs and the original IPs of the VMs caused the etcd quorum to be lost.
To resolve this, we reassigned the original IPs to the control plane VMs and updated the DHCP server to reflect the correct IP assignments.
Restoring the original IPs re-established the etcd quorum, bringing the Kubernetes API server back online and allowing kubectl commands to function properly.