kubectl commands fail in the affected vSphere Kubernetes cluster context.
When connected to the Supervisor cluster context, the following symptoms are present:
kubectl describe cluster -n <affected cluster namespace> <affected cluster>
"Following machines are reporting unknown etcd member status"
When connected to the affected cluster context, the following symptoms are present:
kubectl logs -n kube-system <kube-apiserver pod name>
"etcdserver: mvcc: database space exceeded"
"alarm:NOSPACE"
When connected over SSH to one of the affected cluster's control plane nodes, the following symptoms are present:
ls -ltrh /var/lib/etcd/member/snap
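The log and disk checks above can be sketched as a small shell script. This is a minimal sketch, not a supported tool: the function names (has_etcd_space_error, file_size_bytes, at_or_over_quota) and the DB_FILE variable are made up for illustration, and the script assumes the etcd database file is named db inside the snap directory listed above and that it runs with enough privileges (typically root) to read that file.

```shell
#!/bin/sh
# Default etcd backend quota: 2 GB, expressed here as 2 * 1024^3 bytes.
QUOTA_BYTES=$((2 * 1024 * 1024 * 1024))

# Return 0 when the text on stdin contains either etcd space message
# quoted in the symptoms above.
has_etcd_space_error() {
    grep -E -q 'mvcc: database space exceeded|alarm:NOSPACE'
}

# Print the size of a file in bytes.
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Return 0 when a size in bytes meets or exceeds the quota.
# The second argument is optional and defaults to QUOTA_BYTES.
at_or_over_quota() {
    [ "$1" -ge "${2:-$QUOTA_BYTES}" ]
}

# Assumed location of the etcd database file; override DB_FILE if it differs.
DB_FILE="${DB_FILE:-/var/lib/etcd/member/snap/db}"
if [ -r "$DB_FILE" ]; then
    size=$(file_size_bytes "$DB_FILE")
    if at_or_over_quota "$size"; then
        echo "etcd db is ${size} bytes: at or over the default quota of ${QUOTA_BYTES} bytes"
    else
        echo "etcd db is ${size} bytes: under the default quota of ${QUOTA_BYTES} bytes"
    fi
fi
```

To scan the kube-apiserver logs, pipe the output of the kubectl logs command shown above into has_etcd_space_error and check its exit status. If the cluster was configured with a non-default quota, pass that value as the second argument to at_or_over_quota.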
vSphere with Tanzu 7.0
vSphere with Tanzu 8.0
This can occur on a vSphere Kubernetes cluster regardless of whether it is managed by Tanzu Mission Control (TMC).
The etcd keyspace has reached or exceeded its storage quota.
The default etcd database storage size limit is 2 GB.
Once this limit is reached or exceeded, etcd raises a NOSPACE alarm and rejects further writes, which is why the "database space exceeded" error appears in the kube-apiserver logs.
kube-apiserver relies on etcd being healthy.
Without kube-apiserver in a healthy state, kubectl commands will fail.
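The state described above can be confirmed read-only from a control plane node with etcdctl. The following is a minimal sketch under assumptions: it assumes etcdctl is available on the node's PATH (it may only be available inside the etcd container), that etcd listens on https://127.0.0.1:2379, and that certificates sit in the kubeadm-style directory /etc/kubernetes/pki/etcd. The wrapper name etcdctl_local and the ETCD_PKI variable are hypothetical. The script only reads status and does not modify the database.

```shell
#!/bin/sh
# Assumed kubeadm-style certificate directory; override ETCD_PKI if it differs.
ETCD_PKI="${ETCD_PKI:-/etc/kubernetes/pki/etcd}"

# Hypothetical wrapper: run etcdctl against the local member with TLS flags.
etcdctl_local() {
    ETCDCTL_API=3 etcdctl \
        --endpoints=https://127.0.0.1:2379 \
        --cacert="$ETCD_PKI/ca.crt" \
        --cert="$ETCD_PKI/server.crt" \
        --key="$ETCD_PKI/server.key" \
        "$@"
}

if command -v etcdctl >/dev/null 2>&1; then
    # Show the database size reported by the local member.
    etcdctl_local endpoint status --write-out=table
    # List active alarms; a NOSPACE alarm matches the symptom above.
    etcdctl_local alarm list
else
    echo "etcdctl not found on PATH; skipping etcd status check"
fi
```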
Please open a ticket with VMware by Broadcom Technical Support, referencing this KB, for assistance in cleaning up the etcd database and restoring it to an operational state.
Once etcd is operational again, investigate the root cause of what is filling up the etcd database.
Otherwise, the issue may recur, at a rate that depends on how quickly the database is being filled.
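One common starting point for that investigation is to count etcd keys per Kubernetes resource type. The sketch below is illustrative only: the function name count_keys_by_resource is hypothetical, the sample keys are made up, and key counts are only a rough proxy for space consumed because object sizes vary. It relies on Kubernetes storing objects under the /registry prefix, so the third path segment is the resource type (or the API group for some resources).

```shell
#!/bin/sh
# Read etcd key names on stdin (one per line, blank lines ignored) and
# print "<count> <resource>" sorted with the largest count first.
count_keys_by_resource() {
    awk -F/ '$2 == "registry" && $3 != "" { n[$3]++ }
             END { for (r in n) print n[r], r }' | sort -rn
}

# Example with made-up sample keys. On a real node, the input would come from
# "etcdctl get /registry --prefix --keys-only" run with the endpoint and
# certificate flags your environment requires.
printf '%s\n' \
    '/registry/pods/default/a' \
    '' \
    '/registry/pods/default/b' \
    '' \
    '/registry/events/default/e1' | count_keys_by_resource
# prints:
# 2 pods
# 1 events
```

A resource type with an unusually high count is a candidate for closer inspection with Technical Support.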