In the vCenter UI, the Cluster status shows errors like these:
Customized guest of Supervisor Control plane VM Configuration error (since 7/16/xxxx, 5:04:13 AM)
System error occurred on Master node with identifier xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx. Details: Log forwarding sync
update failed: Command '['/usr/bin/kubectl', '--kubeconfig', '/etc/kubernetes/admin.conf', 'get', 'configmap',
'fluentbit-config-system', '--namespace', 'vmware-system-logging', '--ignore-not-found=true', '-o', 'json']' returned non-zero exit status 1.
or
Failed to delete RoleBinding [email protected] in namespace svc-contour-domain-c####. API server returned error 'rolebindings.rbac.authorization.k8s.io "wcp:svc-contour-domain-c####:user:vsphere.local:xxxxxxx" is forbidden: User "sso:[email protected]"' cannot delete resource "rolebindings" in API group "rbac.authorization.k8s.io" in the namespace "svc-contour-domain-c####". This operation will be retried.
The root cause is a full root ("/") partition on all 3 Supervisor Control Plane nodes:
# df -h | head
Filesystem Size Used Avail Use% Mounted on
/dev/root 32G 32G 0 100% /
devtmpfs 7.9G 0 7.9G 0% /dev
tmpfs 7.9G 212K 7.9G 1% /dev/shm
tmpfs 3.2G 10M 3.2G 1% /run
tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup
tmpfs 7.9G 14M 7.8G 1% /tmp
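Before deleting anything, it helps to see which directories are actually filling the root partition. The following is a minimal sketch, not part of the original procedure: top_usage is a hypothetical helper name, and it assumes du, sort and head behave as in GNU coreutils on the Supervisor Control Plane node.

```shell
#!/bin/sh
# Hypothetical helper: list the largest entries directly under a directory.
# Usage: top_usage DIR [COUNT]
top_usage() {
    dir=${1:-/var/log}
    count=${2:-10}
    # -x: stay on one filesystem; -k: report KiB so a numeric sort works;
    # -d 1: only summarize the directory itself and its immediate subdirectories.
    du -xk -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example (run as root on the affected node):
#   top_usage /var/log 10
#   top_usage /var/log/vmware 10
```

The first line of the output is the total for the directory itself; the lines after it are its largest subdirectories, which indicate where cleanup will free the most space.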
Clean up disk space
We freed space by cleaning up historical logs, including files under /var/log/vmware, with these commands:
- Check the disk usage of the journal logs and purge entries older than 2 days:
journalctl --disk-usage
journalctl --vacuum-time=2d
- Audit logs:
Delete old files from the /var/log/vmware/audit directory.
- Compress the rotated /var/log/vmware/upgrade-ctl-cli.log files:
cd /var/log/vmware/
tar -czvf /var/log/vmware/upgrade-ctl-cli-bck.tar.gz upgrade-ctl-cli.log.?
Then delete the original files that are now in the archive:
rm upgrade-ctl-cli.log.?
Run these steps on all 3 Supervisor Control Plane nodes.
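The audit log step above does not give a command. One possible approach is sketched below. It is an illustration only: purge_old_files is a hypothetical helper, the 7-day retention in the example is an assumption to adjust to your own policy, and it relies on the -maxdepth and -delete options of GNU find. It defaults to a dry run so the list of files can be reviewed before anything is removed.

```shell
#!/bin/sh
# Hypothetical helper: remove regular files last modified more than DAYS days ago.
# Usage: purge_old_files DIR DAYS [--delete]
# Without --delete it only prints the candidates (dry run).
purge_old_files() {
    dir=$1
    days=$2
    mode=${3:-}
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    if [ "$mode" = "--delete" ]; then
        # Print each file as it is deleted; do not descend into subdirectories.
        find "$dir" -maxdepth 1 -type f -mtime +"$days" -print -delete
    else
        find "$dir" -maxdepth 1 -type f -mtime +"$days" -print
    fi
}

# Example (7 days is an assumed retention period):
#   purge_old_files /var/log/vmware/audit 7            # review the list first
#   purge_old_files /var/log/vmware/audit 7 --delete
#   df -h /                                            # confirm space was freed
```

Running df -h / again after the cleanup on each node confirms that the root partition is no longer at 100%.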