While connected to a workload cluster through kubectl vsphere login, or while logged in to a control plane node directly over SSH, kubectl commands frequently fail with the below error message:
etcdserver: leader changed
vSphere Supervisor
This issue can occur regardless of whether the cluster is managed by Tanzu Mission Control (TMC).
ETCD functions in a quorum with one of its instances elected as leader.
Networking or resource issues in the affected cluster are causing ETCD to detect that the current leader is not responding within the expected timeframe, which triggers a leader election in an attempt to recover. ETCD uses port 2379 for client requests and port 2380 for peer communication between members.
Changing the ETCD leader can lead to a brief window in which kubectl commands return the error "etcdserver: leader changed", but when the new ETCD leader is healthy, the issue is not expected to recur.
In this scenario, the affected cluster is experiencing networking or resource issues across all control plane nodes. As a result, whichever member currently holds leadership is considered unhealthy, and the leader role continues to switch between quorum members as described above.
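To help distinguish a networking problem from a resource problem, basic reachability on both ETCD ports can be checked from one control plane node to the others. The below is a minimal sketch assuming a bash shell on the node; 192.168.10.11 is a placeholder for the IP address of another control plane node:
timeout 3 bash -c '</dev/tcp/192.168.10.11/2380' && echo "peer port 2380 open" || echo "peer port 2380 unreachable"
timeout 3 bash -c '</dev/tcp/192.168.10.11/2379' && echo "client port 2379 open" || echo "client port 2379 unreachable"
If the peer port is unreachable or intermittently slow between control plane nodes, the repeated leader elections are more likely network-driven than resource-driven.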
Connect to one of the affected control plane nodes in the cluster through SSH:
Check the logs of the ETCD container on each affected control plane node:
crictl ps --name etcd
crictl logs <etcd container id>
If time-outs and slow responses are logged, this may be an indication that high resource usage is slowing down response times.
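One way to surface such entries is to filter the ETCD logs for slow-request warnings. This is a sketch; the exact wording varies between ETCD versions, but warnings containing "took too long" are a common indicator of slow disk or network I/O:
crictl logs $(crictl ps --name etcd -q) 2>&1 | grep -i "took too long"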
Check for high resource usage on the control plane node.
kubectl top pods --all-namespaces --sort-by=memory
kubectl top nodes --sort-by=memory
If the high resource usage is attributed to kube-apiserver, this may be caused by a large number of resource objects and/or a high number of requests being made to the kube-apiserver.
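To get a rough view of request volume, the kube-apiserver's own request counters can be inspected. The below is a sketch assuming a Kubernetes version that exposes the apiserver_request_total metric; older versions use a different metric name:
kubectl get --raw=/metrics | grep '^apiserver_request_total' | sort -k2 -rn | head -20
The highest counters show which resource types and verbs have received the most requests since the kube-apiserver last restarted.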
Check for a large count of Kubernetes objects stored in the cluster. The below command searches for any object counts higher than 100:
kubectl get --raw=/metrics | grep apiserver_storage_objects | awk '$2>100' | sort -n -k 2
Large numbers of Kubernetes objects can not only cause high resource usage, but can also fill up the ETCD database. By default, ETCD's storage quota is 2GB; once the database reaches that size, ETCD raises an alarm and rejects further writes until space is reclaimed.
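The current database size can be checked from a control plane node with etcdctl. The below is a minimal sketch assuming a kubeadm-style layout where the ETCD certificates are stored under /etc/kubernetes/pki/etcd; the paths may differ in your environment, and if etcdctl is not installed on the node, the same command can be run inside the ETCD container with crictl exec:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key endpoint status --write-out=table
The DB SIZE column in the resulting table can be compared against the 2GB default quota.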
Use caution when cleaning up Kubernetes objects. Reach out to VMware by Broadcom Technical Support, referencing this KB, for assistance.
Check for and clean up any pods in Error, Evicted or ContainerStatusUnknown state; the below command lists them, and a cleanup sketch follows it.
Kubernetes by default does not clean up pods in these states, which means failed pods can easily accumulate over time.
kubectl get pods -A -o wide | egrep -v "Run|Completed"
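If support confirms the failed pods are safe to remove, they can be deleted in bulk. The below is a sketch; it assumes the pods to be removed all report the Failed phase, which typically covers Error, Evicted and ContainerStatusUnknown pods. Review the output of the previous command before deleting anything:
kubectl delete pods -A --field-selector=status.phase=Failed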