kubectl get pods -n kube-system -l component=etcd -o name | xargs -I '{}' kubectl -n kube-system exec '{}' -- etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key endpoint status -w json
In the JSON output, compare the values of revision and dbSize across the etcd members. If the cluster is more than one hour old and either value differs between members by more than 10%, the cluster may be affected.
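The comparison described above can be automated. The following is a minimal sketch, not an official VMware tool. It assumes the command prints one JSON array per etcd pod, each entry carrying Status.header.revision and Status.dbSize, as etcdctl v3.5 does for "endpoint status -w json". The function names and the way the 10% difference is measured (largest value against smallest, relative to the largest) are illustrative assumptions.

```python
import json


def parse_status_lines(text):
    """Collect (revision, dbSize) pairs from one-JSON-array-per-line output."""
    members = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue  # skip blank lines and any non-JSON kubectl notices
        for entry in json.loads(line):
            status = entry["Status"]
            members.append((status["header"]["revision"], status["dbSize"]))
    return members


def spread(values):
    """Relative gap between the largest and smallest value (assumed metric)."""
    lo, hi = min(values), max(values)
    return 0.0 if hi == 0 else (hi - lo) / hi


def may_be_affected(members, threshold=0.10):
    """True if revision or dbSize differs across members by more than threshold."""
    if len(members) < 2:
        return False  # nothing to compare against with a single member
    revisions = [revision for revision, _ in members]
    sizes = [db_size for _, db_size in members]
    return spread(revisions) > threshold or spread(sizes) > threshold
```

To use it, save the output of the diagnostic command to a file, pass the file contents to parse_status_lines, and pass the result to may_be_affected. A True result only flags the cluster for closer inspection. The check is meaningful only for clusters more than one hour old.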
TKG versions v1.5.0 through v1.5.3 use etcd versions v3.5.0 through v3.5.2, which have a known data inconsistency issue. The etcd issue is fixed in TKG v1.5.4.
Run the diagnosis under Symptoms above on all of your running clusters. If any of them seem to have been affected, then:
Ensuring that control plane nodes have ample memory can help mitigate this issue.
VMware recommends: