The clusterAgent service may slowly fill up OSDATA with logs.


Article ID: 312111


Updated On:


VMware vSphere ESXi


The cause of this issue is not obvious, because symptoms can appear after the inventory has been in a seemingly stable state for months. Customers may be unaware that they are affected until OSDATA fills up and workloads are impacted.


The file "/var/run/log/clusterAgent.stderr" on ESX becomes extremely large, potentially causing OSDATA to run out of disk space, which can lock up ESX services and VMs.


VMware vSphere ESXi 8.0.1
VMware vSphere ESXi 8.0.x
VMware vSphere ESXi 8.0.2


When it encounters certain rare error conditions, a component within the clusterAgent ESX service may start periodically writing to "/var/run/log/clusterAgent.stderr". Because writes to this file are not expected, the file is not rotated or otherwise monitored. If the condition persists for several months, the file can grow to several gigabytes, filling up the OSDATA partition.


VMware is aware of this issue and working to resolve this in a future release.

The affected file should be monitored and cleared if needed. This requires SSH access to the hosts. To check the size of the file, run:
stat /var/run/log/clusterAgent.stderr
If the file is absent or smaller than 1 MB, there is no need for concern. If it is larger, it should be deleted periodically.
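For routine monitoring, the size check can be wrapped in a small read-only script. The following is a minimal sketch, not an official tool: the helper name check_log and its optional limit argument are assumptions made for illustration, and the default of 1000000 bytes approximates the 1 MB guideline. It deletes nothing and does not restart any service.

```shell
#!/bin/sh
# Read-only check: report the size of a log file and flag it when it
# exceeds a limit. Nothing is deleted and no service is restarted.

# check_log FILE [LIMIT_BYTES]
# Hypothetical helper for illustration; the limit defaults to 1000000 bytes.
check_log() {
    cl_file=$1
    cl_limit=${2:-1000000}

    # An absent file means there is nothing to worry about.
    if [ ! -f "$cl_file" ]; then
        echo "OK: $cl_file is absent"
        return 0
    fi

    # Same size query as the stat call used elsewhere in this article.
    cl_size=$(stat -c%s "$cl_file")

    if [ "$cl_size" -gt "$cl_limit" ]; then
        echo "WARNING: $cl_file is $cl_size bytes (limit $cl_limit)"
        return 1
    fi

    echo "OK: $cl_file is $cl_size bytes"
    return 0
}

check_log /var/run/log/clusterAgent.stderr
```

The function returns a non-zero status only when the file exceeds the limit, so the script can also be used as a condition in other monitoring checks.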
The following command can be used to clear out the file. It can be run either selectively on hosts where the problem is spotted, or unconditionally on all hosts.
LF=/var/run/log/clusterAgent.stderr ; test -f $LF && [ $(stat -c%s $LF) -gt 1000000 ] && (rm -f $LF ; /etc/init.d/clusterAgent restart)
If the problem is not detected, there will be no output. If the problem is detected, there will be messages indicating that the clusterAgent service has been restarted.
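For readers who prefer a commented script over a one-liner, the same logic can be spelled out step by step. This is a sketch under stated assumptions: the helper name clear_if_large and its parameters are hypothetical and exist only to make the steps explicit. Unlike the one-liner, it returns a success status when there is nothing to do.

```shell
#!/bin/sh
# Expanded form of the one-line workaround: remove the log file and restart
# the service only when the file exists and exceeds a size limit.

# clear_if_large FILE LIMIT_BYTES RESTART_COMMAND...
# Hypothetical helper for illustration; not part of ESX.
clear_if_large() {
    cil_file=$1
    cil_limit=$2
    shift 2

    # No file: nothing to do, and no output.
    [ -f "$cil_file" ] || return 0

    cil_size=$(stat -c%s "$cil_file")

    # File at or below the limit: leave it alone.
    [ "$cil_size" -gt "$cil_limit" ] || return 0

    rm -f "$cil_file"

    # Run the restart command, as the one-liner does after deleting the file.
    "$@"
}

clear_if_large /var/run/log/clusterAgent.stderr 1000000 /etc/init.d/clusterAgent restart
```

As with the one-liner, the script prints nothing when the file is absent or small; any output comes from the restart command itself.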