The /storage/seat partition on vCenter fills up at a faster rate than usual, and the vpxd service goes down.

Running zgrep "Processing Key: VirtualMachine" against /var/lib/avi/log/cc* shows a large number of VM updates per minute.
Running the following against the cc_agent logs shows a large number of VM updates (more than 200 in a single timestamp bucket):

ls /var/lib/avi/log/cc_agent_go_Default-Cloud* | xargs -I {} sh -c 'echo {}; zgrep "Processing Key: VirtualMachine" "{}" | awk "{print substr(\$0, index(\$0, \"YYYY-MM-DDTHH:MM:SS\"), 19)}" | sort | uniq -c | awk "\$1 > 200"'

/var/log/vmware/vpxd/vpxd-profiler-*.log shows a high number for ERProviderMixin overflows:

--> /MoRegistryStats/Class='11DatastoreMo'/ERProviderMixin/Overflows/total #####
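A minimal sketch of the counting pipeline above, demonstrated on a synthetic sample log so it is self-contained. In practice, point LOG at /var/lib/avi/log/cc_agent_go_Default-Cloud* and raise the threshold to 200; the file contents, timestamps, and threshold here are illustrative only.

```shell
#!/bin/sh
# Build a throwaway sample log (stand-in for the cc_agent logs).
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
2024-01-01T10:00:01.123Z Processing Key: VirtualMachine vm-1
2024-01-01T10:00:02.456Z Processing Key: VirtualMachine vm-2
2024-01-01T10:00:03.789Z Processing Key: VirtualMachine vm-3
2024-01-01T10:01:05.000Z Processing Key: VirtualMachine vm-4
EOF
# Keep only matching lines, truncate each timestamp to the minute
# (first 16 characters), count per minute, and print busy minutes.
busy=$(grep "Processing Key: VirtualMachine" "$LOG" \
  | cut -c1-16 \
  | sort | uniq -c \
  | awk -v limit=2 '$1 > limit {print $1, $2}')
echo "$busy"
rm -f "$LOG"
```

With the real threshold (limit=200), any minute printed indicates an update storm from the Avi cloud connector.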
/var/log/vmware/vpxd/vpxd-profiler-*.log also shows a high number of setEntityPermissions tasks:

--> /ActivationStats/Task/Actv='vim.AuthorizationManager.setEntityPermissions'/TotalTime/numSamples ######
/var/log/vmware/vpxd/vpxd.log shows these permission-setting operations being initiated with a wcp- opID:

YYYY-MM-DDTHH:MM:SS.608Z info vpxd[2487065] [Originator@6876 sub=vpxLro opID=wcp-699e98e7-2cdc05d1-801a-47de-8b4f-af387dd42b83-65] [VpxLRO] -- BEGIN lro-489843407 -- AuthorizationManager -- vim.AuthorizationManager.setEntityPermissions -- 52105ecb-cbfd-0bae-0676-3991b0c6630f(52259e96-7206-4596-4510-4002fff8d71b)
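To confirm that WCP is the source of the setEntityPermissions storm, the BEGIN lines can be counted and grouped by opID prefix. This is a sketch on a synthetic vpxd.log sample; in practice run the greps against /var/log/vmware/vpxd/vpxd*.log (the opIDs and counts below are illustrative).

```shell
#!/bin/sh
# Throwaway sample standing in for vpxd.log.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
info vpxd[1] [sub=vpxLro opID=wcp-aaaa-1] [VpxLRO] -- BEGIN lro-1 -- AuthorizationManager -- vim.AuthorizationManager.setEntityPermissions
info vpxd[1] [sub=vpxLro opID=wcp-bbbb-2] [VpxLRO] -- BEGIN lro-2 -- AuthorizationManager -- vim.AuthorizationManager.setEntityPermissions
info vpxd[1] [sub=vpxLro opID=ui-cccc-3] [VpxLRO] -- BEGIN lro-3 -- AuthorizationManager -- vim.AuthorizationManager.setEntityPermissions
EOF
# Total setEntityPermissions invocations in the log.
total=$(grep -c "setEntityPermissions" "$LOG")
# Group by the alphabetic opID prefix (wcp-, ui-, ...) and show the top caller.
top=$(grep "setEntityPermissions" "$LOG" \
  | grep -o "opID=[a-z]*" | sort | uniq -c | sort -rn | head -1)
echo "total=$total top=$top"
rm -f "$LOG"
```

A dominant wcp prefix points at the Supervisor (wcpsvc) as the caller, matching the wcpsvc.log evidence below.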
wcpsvc.log, located at /var/log/vmware/wcp, shows the permissions being set and the ConnectToLoadBalancer node check subsequently failing with "Unauthorized":

wcpsvc-YYYY-MM-DDTHH:MM:SS.570.log: YYYY-MM-DDTHH:MM:SS debug wcp [vclib/authz.go:54] [opID=699e9708] Successfully set permissions [{{} <nil> wcp-storage-user-############@vsphere.local false 1090 true}] on entity Datastore:datastore-######

wcpsvc-YYYY-MM-DDTHH:MM:SS.967Z debug wcp [nodechecker/node_check.go:70] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] wcp-sv-state-checker log output when running checks '[ConnectToLoadBalancer]' on VM VirtualMachine:vm-####. stdout: {"ConnectToLoadBalancer":{"id":"ConnectToLoadBalancer","status":"SetupFailure","time_completed":"YYYY-MM-DDTHH:MM:SS.563961415Z","conditions":[{"type":"SetupFailure","message":{"severity":"ERROR","details":{"id":"vcenter.wcp.node_state_check.setup_failure","default_message":"An internal error on the control plane VM (420137a0706d0fda0b17bc4e903996ac) prevented the check ConnectToLoadBalancer from completing successfully. Error: Unable to fetch valid load balancer configs. Err: Unauthorized.","args":["420137a0706d0fda0b17bc4e903996ac","ConnectToLoadBalancer","Unable to fetch valid load balancer configs. Err: Unauthorized"]}}}],"description":{"id":"wcp.healthcheck.connect_to_loadbalancer.description","default_message":"Checks to see if the Control Plane VM is able to connect to any configured load balancers. This check can only be run in a VDS environment, after the Kubernetes API Server is up.","args":null}}},

stderr:
time="wcpsvc-YYYY-MM-DDTHH:MM:SS" level=info msg="Running checks: [ConnectToLoadBalancer]"
time="wcpsvc-YYYY-MM-DDTHH:MM:SS" level=debug msg="Parsed node configuration: &{HostName:######### VCenterPNID:<fqdn>:443 ManagementNetwork:{DNSFromDHCP:false DNSServers:[###.##.##.#3] DNSSearchDomains:[<name>]} WorkloadNetwork:{IPAddress:##.##.##.## DNSServers:[###.##.##.##]} KubernetesConfig:{CertificateAuthority:0xc######### InitialAPIServer:https://##.##.##.##:6443} NSXManagerConfig:[] NetworkProvider:1 LoadBalancerProvider:HA_PROXY}"
time="wcpsvc-YYYY-MM-DDTHH:MM:SS" level=info msg="Check [ConnectToLoadBalancer] running"
time="wcpsvc-YYYY-MM-DDTHH:MM:SS" level=info msg="Attempting to connect to the Kubernetes Server, using configuration file path: '/etc/kubernetes/admin.conf'" check=ConnectToLoadBalancer
time="wcpsvc-YYYY-MM-DDTHH:MM:SS" level=info msg="Fetching all loadBalancerConfigs..." check=ConnectToLoadBalancer
time="wcpsvc-YYYY-MM-DDTHH:MM:SS" level=error msg="Failed to fetch haProxy list. Err: Unauthorized" check=ConnectToLoadBalancer
time="wcpsvc-YYYY-MM-DDTHH:MM:SS" level=error msg="Failed to fetch load balancer config info. Err: Unable to fetch valid load balancer configs. Err: Unauthorized" check=ConnectToLoadBalancer
time="wcpsvc-YYYY-MM-DDTHH:MM:SS" level=info msg="Check [ConnectToLoadBalancer] completed. Result: SetupFailure"

wcpsvc-YYYY-MM-DDTHH:MM:SS debug wcp [nodechecker/node_check.go:95] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] Check 'ConnectToLoadBalancer' was unsuccessful on node VirtualMachine:vm-####. Status: SetupFailure

wcpsvc-YYYY-MM-DDTHH:MM:SS error wcp [kubelifecycle/load_balancer.go:68] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] Unable to verify load balancer connection from nodes. node checks on control plane VM VirtualMachine:vm-#### failed for indeterminate reasons

wcpsvc-YYYY-MM-DDTHH:MM:SS info wcp [kubelifecycle/load_balancer.go:69] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] Reconcile load balancer exited

wcpsvc-YYYY-MM-DDTHH:MM:SS debug wcp [kubelifecycle/controller.go:506] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] Supervisor configuration retry.
wcpsvc.log, located at /var/log/vmware/wcp/, shows "No space left on device":

YYYY-MM-DDTHH:MM:SS error wcp [vclib/guestop.go:338] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] Kubenode guest command failed. RC: 1, Out: , Err: INFO:__main__:Loaded 1 key from /dev/shm/secret
Traceback (most recent call last):
........
encrypted.write(encrypt(plain_bytes, key))
OSError: [Errno 28] No space left on device

YYYY-MM-DDTHH:MM:SS error wcp [kubelifecycle/master_node.go:704] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] Failed to encrypt desired node configuration. Err Guest operation failed for the Master node VM with identifier vm-####., stdout: , stderr: INFO:__main__:Loaded 1 key from /dev/shm/secret
Traceback (most recent call last):
File "/usr/lib/vmware-wcp/hypercrypt.py", line 292, in <module>
.........
encrypted.write(encrypt(plain_bytes, key))
OSError: [Errno 28] No space left on device

YYYY-MM-DDTHH:MM:SS error wcp [kubelifecycle/controller.go:2062] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] Failed to update desired config of MasterNode VirtualMachine:vm-####. Err: Guest operation failed for the Master node VM with identifier vm-####.

YYYY-MM-DDTHH:MM:SS error wcp [kubelifecycle/controller.go:2231] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] Error configuring API server on cluster 2cdc05d1-801a-47de-8b4f-af387dd42b83 Guest operation failed for the Master node VM with identifier vm-####.

YYYY-MM-DDTHH:MM:SS warning wcp [kubelifecycle/controller.go:1014] [opID=699eb397-2cdc05d1-801a-47de-8b4f-af387dd42b83] Unable to configure agent in cluster domain-c8. Err Guest operation failed for the Master node VM with identifier vm-####.
vm-#### is the VM ID of a Supervisor Control Plane VM, and checking it shows its root partition is full:

<user>@NodeID [ ~ ]# df -h /dev/root
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/root    32G   32G      0  100%  /

The Supervisor Control Plane VM has no space left on the device.
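The df check above can be scripted so a nearly-full root partition is flagged automatically. This is a minimal sketch; the 95% warning threshold is an assumption, not part of the original output.

```shell
#!/bin/sh
# Parse the Use% column of `df -P /` (portable output format) and warn
# when the root filesystem is near capacity, as in the 100% case above.
usage=$(df -P / | awk 'NR==2 {sub("%","",$5); print $5}')
if [ "$usage" -ge 95 ]; then
  echo "WARNING: / is ${usage}% full - guest operations on this node may fail"
else
  echo "/ is ${usage}% full"
fi
```

On the Supervisor Control Plane VM shown above, this would print the WARNING line, matching the ENOSPC errors in wcpsvc.log.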
To resolve this issue, follow the steps in the KB article: vSphere Supervisor Disk Space Clean Up Scripts.
To work around the issue of the vCenter SEAT partition filling up, configure event retention in vCenter to 7 days using the steps below.
1. Open the vSphere Client and log in to vCenter Server.
2. Navigate to Administration → vCenter Server Settings.
3. Go to the Database Retention Policy section. Two settings are available there: Task retention and Event retention.
4. Set Event retention to the desired number of days (7 days in this case).
5. Click OK or Save to apply the changes.
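After the retention change takes effect and old events are purged, the SEAT partition usage can be checked from the vCenter appliance shell. A minimal sketch, assuming the standard VCSA mount point /storage/seat; the script falls back to / so it remains runnable outside a VCSA.

```shell
#!/bin/sh
# Report how full the SEAT partition is. /storage/seat holds the
# Stats/Events/Alarms/Tasks data on a VCSA; fall back to / elsewhere.
seat=/storage/seat
[ -d "$seat" ] || seat=/
# -P keeps df output on one line per filesystem; row 2 is the data row.
line=$(df -hP "$seat" | awk 'NR==2 {print $6 " is " $5 " used"}')
echo "SEAT usage: $line"
```

Usage should trend downward once the 7-day retention policy has purged older events and tasks.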
NOTE: Running du -sh /* and then re-running the command on the directory consuming the most disk space reveals the actual consumer.
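The drill-down the NOTE describes can be sketched as follows, demonstrated on a throwaway directory tree so it is self-contained. On a real node you would start from the filesystem root (e.g. du -sh /* | sort -rh) and repeat on the largest entry.

```shell
#!/bin/sh
# Build a throwaway tree with one large and one small consumer.
root=$(mktemp -d)
mkdir -p "$root/big" "$root/small"
head -c 200000 /dev/zero > "$root/big/file"   # ~200 KB consumer
head -c 200 /dev/zero > "$root/small/file"    # tiny consumer
# Size each top-level entry, largest first; the top entry is where
# to re-run du next to find the actual consumer.
top=$(du -s "$root"/* | sort -rn | head -1)
echo "largest: $top"
rm -rf "$root"
```

Iterating this way (du on /, then on the biggest subdirectory, and so on) quickly narrows a full partition down to the offending directory or file.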
See also: Increasing the disk space for the vCenter Server Appliance in vSphere 6.5, 6.7, 7.0 and 8.0