Managing Excessive ConfigMap("tanzu-featuregates-ctrl" and "tanzu-core-management-plugins-ctrl") Accumulation in tkg-system Namespace

Article ID: 378005

Products

Tanzu Kubernetes Grid, VMware Tanzu Kubernetes Grid 1.x

Issue/Introduction

An excessive number of ConfigMap resources in the tkg-system namespace can lead to significant disk space consumption, resulting in slow disk errors in the etcd logs. This issue is indicated by warning messages such as:

{"level":"warn","ts":"2024-08-17T06:55:10.056412Z","caller":"etcdserver/raft.go:416","msg":"leader failed to send out heartbeat on time; took too long, leader is overloaded likely from slow disk","to":"5844d54fe5c10877","heartbeat-interval":"100ms","expected-duration":"200ms","exceeded-duration":"9.09639ms"}
{"level":"warn","ts":"2024-09-04T06:45:12.456037Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2024-09-04T06:45:05.455285Z","time spent":"7.000747853s","remote":"127.0.0.1:51748","response type":"/etcdserverpb.KV/Txn","request count":0,"request size":0,"response count":0,"response size":0,"request content":""}
{"level":"info","ts":"2024-09-04T06:45:12.522883Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"8fdaa22df8a9b8be [term: 27] ignored a MsgHeartbeatResp message with lower term from 5844d54fe5c10877 [term: 26]"}
{"level":"info","ts":"2024-09-
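To gauge how often the heartbeat warning occurs, the etcd log can be summarized per day. The following is a minimal sketch, not an official procedure: the function name slow_disk_warnings_per_day is hypothetical, and it assumes the JSON log format shown above, with the date at the start of the "ts" field.

```shell
# Read etcd log lines on stdin and print "<count> <date>" for every day
# on which the slow-disk heartbeat warning appears.
slow_disk_warnings_per_day() {
  grep 'leader failed to send out heartbeat on time' \
    | grep -o '"ts":"[0-9-]*' \
    | cut -d'"' -f4 \
    | sort \
    | uniq -c
}
```

On a control plane node, the etcd container log could be piped into the function, for example from crictl logs with the etcd container ID (assuming crictl is available on the node).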

Environment

Tanzu Kubernetes Grid 2.x

Cause

The issue is caused by a large number of ConfigMaps ("tanzu-featuregates-ctrl" and "tanzu-core-management-plugins-ctrl") and secrets accumulating in the system. You can verify the extent of this accumulation by running the following commands on a control plane node of the management cluster:

ssh capv@<control-plane-node-IP-address>

alias etcdctl="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/*/fs/usr/local/bin/etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt"

etcdctl get /registry --prefix=true --keys-only | grep -v ^$ | awk -F'/' '{ if ($3 ~ /cattle.io/) {h[$3"/"$4]++} else { h[$3]++ }} END { for(k in h) print h[k], k }' | sort -nr

The output lists the number of etcd keys per resource type. It may show a very large number of ConfigMaps (e.g., 40,000) and secrets (e.g., 300).
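To see which ConfigMap names account for the count, the key listing can be narrowed to one namespace and grouped by name. This is a minimal sketch under stated assumptions: the function name count_cm_by_prefix is hypothetical, the keys are assumed to follow the /registry/configmaps/<namespace>/<name> layout, and the accumulated ConfigMaps are assumed to differ only in a generated suffix after the last hyphen.

```shell
# Read etcd keys on stdin and count the ConfigMaps of one namespace,
# grouped by name with the final "-<suffix>" segment removed.
# $1: namespace to inspect (defaults to tkg-system)
count_cm_by_prefix() {
  ns="${1:-tkg-system}"
  awk -F'/' -v ns="$ns" '
    $3 == "configmaps" && $4 == ns {
      name = $5
      sub(/-[^-]*$/, "", name)   # drop the trailing generated suffix
      count[name]++
    }
    END { for (n in count) print count[n], n }
  ' | sort -nr
}
```

With the etcdctl alias defined above, a possible invocation is: etcdctl get /registry/configmaps/tkg-system --prefix=true --keys-only | count_cm_by_prefix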

Slow disk performance can, in turn, lead to inconsistent etcd leader elections between the two control plane nodes.

Resolution

To mitigate this issue, follow these steps:

  1. List ConfigMaps: Check which ConfigMaps are being recreated using the command:
    kubectl get cm -A

  2. Check whether the tanzu-featuregates-ctrl and tanzu-core-management-plugins-ctrl apps exist:
    kapp list -A

    If the app names do not end in "-ctrl", i.e. the apps are listed as "tanzu-featuregates.app" and "tanzu-core-management-plugins.app", the parent apps for the "tanzu-featuregates-ctrl" and "tanzu-core-management-plugins-ctrl" ConfigMaps may no longer be present in the system. This can be expected when TKG has been upgraded several times from older TKG releases.

    In these cases, please contact Tanzu Support to assist with the cleanup of "tanzu-featuregates-ctrl" and "tanzu-core-management-plugins-ctrl" ConfigMaps.

  3. Delete Excess ConfigMaps: Record the names of the ConfigMaps you wish to manage. To retain only the latest 10 ConfigMaps, execute:
    kapp app-change gc -a tanzu-featuregates-ctrl -n tkg-system --max 10

    This command will delete older objects for the "tanzu-featuregates-ctrl" ConfigMap.

    kapp app-change gc -a tanzu-core-management-plugins-ctrl -n tkg-system --max 10

    This command will delete older objects for the "tanzu-core-management-plugins-ctrl" ConfigMap.
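The two garbage-collection commands in step 3 can be wrapped in a small helper that first prints what it would run. This is only a sketch: the function name gc_tkg_app_changes and the DRY_RUN and MAX variables are hypothetical conveniences, and the kapp invocation is the one given in step 3.

```shell
# Garbage-collect old kapp app-change records for both apps, keeping the
# newest MAX entries (default 10).
# DRY_RUN=1 (the default in this sketch) only prints the commands;
# set DRY_RUN=0 to run them.
gc_tkg_app_changes() {
  max="${MAX:-10}"
  for app in tanzu-featuregates-ctrl tanzu-core-management-plugins-ctrl; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "kapp app-change gc -a $app -n tkg-system --max $max"
    else
      kapp app-change gc -a "$app" -n tkg-system --max "$max"
    fi
  done
}
```

Comparing the output of kubectl get cm -n tkg-system --no-headers | wc -l before and after the run gives a quick check that the ConfigMap count has dropped.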