An excessive number of ConfigMap resources in the tkg-system namespace can lead to significant disk space consumption, resulting in slow disk errors in the etcd logs. This issue is indicated by warning messages such as:
{"level":"warn","ts":"2024-08-17T06:55:10.056412Z","caller":"etcdserver/raft.go:416","msg":"leader failed to send out heartbeat on time; took too long, leader is overloaded likely from slow disk","to":"5844d54fe5c10877","heartbeat-interval":"100ms","expected-duration":"200ms","exceeded-duration":"9.09639ms"}
{"level":"warn","ts":"2024-09-04T06:45:12.456037Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2024-09-04T06:45:05.455285Z","time spent":"7.000747853s","remote":"127.0.0.1:51748","response type":"/etcdserverpb.KV/Txn","request count":0,"request size":0,"response count":0,"response size":0,"request content":""}
{"level":"info","ts":"2024-09-04T06:45:12.522883Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8fdaa22df8a9b8be [term: 27] ignored a MsgHeartbeatResp message with lower term from 5844d54fe5c10877 [term: 26]"}
{"level":"info","ts":"2024-09-
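To gauge how often the slow-disk warning occurs, the matching lines can be counted in a saved copy of the etcd log. The following is a minimal sketch: the function name count_slow_disk_warnings and the file name etcd.log are hypothetical, and the match string is taken from the first warning shown above.

```shell
# Count etcd log lines (read from stdin) that carry the slow-disk heartbeat warning.
# The function name is illustrative; the match string comes from the log excerpt.
count_slow_disk_warnings() {
  # grep -c prints 0 but exits non-zero when nothing matches; "|| true" keeps the status clean.
  grep -c 'leader is overloaded likely from slow disk' || true
}

# Example (etcd.log is a hypothetical file holding the etcd container log):
#   count_slow_disk_warnings < etcd.log
```

A count that keeps growing between checks suggests the condition is ongoing rather than a one-off spike.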
Tanzu Kubernetes Grid 2.x
The accumulation occurs because numerous "tanzu-featuregates-ctrl" and "tanzu-core-management-plugins-ctrl" ConfigMaps, along with secrets, build up in the system. You can verify the extent of this accumulation by running the following commands on a control plane node of the management cluster:
ssh capv@<control-plane-node-IP>
alias etcdctl="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/*/fs/usr/local/bin/etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt"
etcdctl get /registry --prefix=true --keys-only | grep -v ^$ | awk -F'/' '{ if ($3 ~ /cattle.io/) {h[$3"/"$4]++} else { h[$3]++ }} END { for(k in h) print h[k], k }' | sort -nr
This command may reveal a staggering number of ConfigMaps (e.g., 40,000) and secrets (e.g., 300).
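The pipeline above groups keys by resource type. To see which namespace holds the ConfigMaps, the same key listing can be grouped by namespace instead. This is a sketch under the assumption that ConfigMap keys follow the usual /registry/configmaps/<namespace>/<name> layout; the function name configmaps_per_namespace is hypothetical.

```shell
# Read etcd keys from stdin and print "<count> <namespace>" for ConfigMap keys,
# highest count first. Assumes the key layout
# /registry/configmaps/<namespace>/<name>; other keys and blank lines are ignored.
configmaps_per_namespace() {
  awk -F'/' '$3 == "configmaps" { n[$4]++ } END { for (ns in n) print n[ns], ns }' | sort -nr
}

# Example, run in the same interactive shell where the etcdctl alias is defined:
#   etcdctl get /registry/configmaps --prefix --keys-only | configmaps_per_namespace
```

If tkg-system dominates the output, the accumulation described in this article is the likely cause.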
Slow disk performance can lead to inconsistent leader elections between the two control plane nodes.
To mitigate this issue, follow these steps:
kubectl get cm -A
kapp list -A
If the app names do not end in "-ctrl" and instead appear as "tanzu-featuregates.app" and "tanzu-core-management-plugins.app", the parent apps for the "tanzu-featuregates-ctrl" and "tanzu-core-management-plugins-ctrl" ConfigMaps may no longer be present in the system. This can be expected when TKG has been upgraded several times from old TKG releases.
In these cases, please contact Tanzu Support to assist with the cleanup of the "tanzu-featuregates-ctrl" and "tanzu-core-management-plugins-ctrl" ConfigMaps.
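The check on the app names can also be scripted against the output of kapp list -A. The following is a sketch only: the helper name app_listed is hypothetical, and it assumes each app name appears as its own whitespace-separated column in that output.

```shell
# Succeed if any whitespace-separated field on stdin equals the given app name.
# Exact matching avoids treating "tanzu-featuregates.app" as "tanzu-featuregates-ctrl".
app_listed() {
  awk -v name="$1" '{ for (i = 1; i <= NF; i++) if ($i == name) found = 1 } END { exit !found }'
}

# Example (requires kapp access to the management cluster):
#   kapp list -A | app_listed tanzu-featuregates-ctrl && echo "parent app present" || echo "parent app missing"
```

A "missing" result for both "-ctrl" names corresponds to the case where the cleanup should be handled with Tanzu Support.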
kapp app-change gc -a tanzu-featuregates-ctrl -n tkg-system --max 10
This command deletes older objects for the "tanzu-featuregates-ctrl" ConfigMap.
kapp app-change gc -a tanzu-core-management-plugins-ctrl -n tkg-system --max 10
This command deletes older objects for the "tanzu-core-management-plugins-ctrl" ConfigMap.
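The two garbage-collection commands can be wrapped in one helper that prints the commands first and runs them only when asked. This is a sketch: the function name gc_app_changes and the DRY_RUN and KEEP variables are hypothetical conveniences, while the kapp invocation itself is the one shown above.

```shell
# Run (or, with DRY_RUN=1, only print) the garbage-collection command for each app.
# KEEP is the number of app changes to retain; 10 matches the commands above.
gc_app_changes() {
  keep="${KEEP:-10}"
  for app in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Dry run: show the command without touching the cluster.
      echo "kapp app-change gc -a $app -n tkg-system --max $keep"
    else
      kapp app-change gc -a "$app" -n tkg-system --max "$keep"
    fi
  done
}

# Example:
#   DRY_RUN=1 gc_app_changes tanzu-featuregates-ctrl tanzu-core-management-plugins-ctrl
```

Reviewing the dry-run output before the real run helps confirm that only the two intended apps are targeted.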