Symptoms:
Due to a bug in the ConfigStore API, stale data related to block devices might not be deleted in time from the ESXi ConfigStore database, which causes an out-of-space condition. As a result, write operations to ConfigStore start to fail. In the backtrace, you see log entries such as:
2022-12-19T03:51:42.733Z cpu53:26745174)WARNING: VisorFSRam: 203: Cannot extend visorfs file /etc/vmware/configstore/current-store-1-journal because its ramdisk (configstore) is full.
The following symptoms are observed in vSphere 8.0 Update 3 versions.
For these versions, follow the steps for vSphere 8.0 Update 3 in the resolution section of this KB.
1. In the /var/log/vobd.log file, you see entries such as the following:
- When the configstore database has used 80% of its disk space:
vobd[1000079502]: [VisorfsCorrelator] 249872232435us: [vob.visorfs.ramdisk.usage.warning] Ramdisk 'configstore' usage is very high. Approx 20% space left.
vobd[1000079502]: [VisorfsCorrelator] 249870515527us: [esx.problem.visorfs.configstore.usage.warning] Ramdisk 'configstore' usage is very high. Approx 20% space left. Please refer to the KB 93362 for more details.
vobd[1000079502]: [VisorfsCorrelator] 249872241130us: [vob.visorfs.ramdisk.usage.warning] Ramdisk 'configstore' usage is very high. Approx 20% space left.
- When the configstore database is about to run out of disk space:
vobd[1000079502]: [VisorfsCorrelator] 249873490366us: [vob.visorfs.ramdisk.usage.error] Ramdisk 'configstore' is reaching its critical size limit. Approx 10% space left.
vobd[1000079502]: [VisorfsCorrelator] 249871773438us: [esx.problem.visorfs.configstore.usage.error] Ramdisk 'configstore' is reaching its critical size limit. Approx 10% space left. Please refer to the KB 93362 for more details.
vobd[1000079502]: [VisorfsCorrelator] 249873497673us: [vob.visorfs.ramdisk.usage.error] Ramdisk 'configstore' is reaching its critical size limit. Approx 10% space left.
2. In vCenter Server, you see events similar to the following:
- A warning is displayed when the configstore database has used 80% of its disk space:
Ramdisk 'configstore' usage is very high. Approx 20% space left. Please refer to the KB 93362 for more details.
- An error is displayed when the configstore database is about to run out of disk space:
Ramdisk 'configstore' is reaching its critical size limit. Approx 10% space left. Please refer to the KB 93362 for more details.
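The vobd.log entries listed above can be picked out with a short script. The following is a minimal sketch, not part of the KB tooling: the function names find_configstore_events and worst_level are hypothetical, the pattern is derived only from the sample lines quoted in this article, and the sample list stands in for the contents of /var/log/vobd.log.

```python
import re

# Matches the configstore ramdisk usage events quoted in this KB, for example:
#   [vob.visorfs.ramdisk.usage.warning] Ramdisk 'configstore' usage is very high. Approx 20% space left.
# Assumption: the pattern is inferred from those sample lines only.
EVENT_RE = re.compile(
    r"\[(?P<event>[\w.]+\.usage\.(?P<level>warning|error))\] "
    r"Ramdisk 'configstore' .*?Approx (?P<left>\d+)% space left"
)


def find_configstore_events(lines):
    """Return (level, percent_left, event_id) for each matching log line."""
    hits = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            hits.append(
                (match.group("level"), int(match.group("left")), match.group("event"))
            )
    return hits


def worst_level(hits):
    """Return "error" if any error event was seen, else "warning", else None."""
    levels = {level for level, _, _ in hits}
    if "error" in levels:
        return "error"
    return "warning" if "warning" in levels else None


# Sample lines copied from the symptoms above. On a host, the lines would be
# read from /var/log/vobd.log instead.
SAMPLE = [
    "vobd[1000079502]: [VisorfsCorrelator] 249872232435us: "
    "[vob.visorfs.ramdisk.usage.warning] Ramdisk 'configstore' usage is very high. "
    "Approx 20% space left.",
    "vobd[1000079502]: [VisorfsCorrelator] 249871773438us: "
    "[esx.problem.visorfs.configstore.usage.error] Ramdisk 'configstore' is reaching "
    "its critical size limit. Approx 10% space left. Please refer to the KB 93362 "
    "for more details.",
    # Not a vobd usage event, so the pattern ignores it.
    "2022-12-19T03:51:42.733Z cpu53:26745174)WARNING: VisorFSRam: 203: Cannot extend "
    "visorfs file /etc/vmware/configstore/current-store-1-journal because its ramdisk "
    "(configstore) is full.",
]

if __name__ == "__main__":
    events = find_configstore_events(SAMPLE)
    for level, left, event in events:
        print(f"{level}: about {left}% left ({event})")
    print("worst level:", worst_level(events))
```

A host that reports only warning events still has some headroom, while an error event suggests that recovery should not be postponed.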
VMware vSphere ESXi 8.0.1
VMware vSphere ESXi 8.0.0
VMware vSphere ESXi 7.0.3
To recover the host, follow the steps in one of the options below.
Option 1:
Use the configstore-recovery Python script attached to the KB article. Copy the script to the host and run: python configstore-recovery
The script performs the following steps:
1. Temporarily increases the configstore ramdisk size to 64MB (initial size is 32MB).
2. Cleans stale/empty data from the configstore DB.
3. Performs VACUUM on the configstore DB. The VACUUM command rebuilds the database file, repacking it into a minimal amount of disk space.
4. Reverts the configstore ramdisk size to 32MB.
Logs from the script are captured in /var/run/log/syslog.log
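To illustrate why the VACUUM step in the script frees space, the following sketch shows the general effect on a throwaway database. It assumes SQLite semantics for VACUUM and uses Python's standard sqlite3 module on a temporary file. It does not touch the real ConfigStore files and is not the attached script; the function name vacuum_demo and the table name entries are hypothetical.

```python
import os
import sqlite3
import tempfile


def vacuum_demo(rows=2000, payload_bytes=500):
    """Fill a throwaway SQLite file, delete the rows, then VACUUM it.

    Returns (size_before_vacuum, size_after_vacuum) in bytes.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)
        try:
            # Keep freed pages in the file so that the effect of VACUUM is visible.
            conn.execute("PRAGMA auto_vacuum = NONE")
            conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, data TEXT)")
            conn.executemany(
                "INSERT INTO entries (data) VALUES (?)",
                [("x" * payload_bytes,) for _ in range(rows)],
            )
            conn.commit()

            # Deleting rows frees pages inside the file but does not shrink it.
            conn.execute("DELETE FROM entries")
            conn.commit()
            size_before = os.path.getsize(path)

            # VACUUM rebuilds the file without the free pages.
            # It must run outside a transaction, hence the commit above.
            conn.execute("VACUUM")
        finally:
            conn.close()
        size_after = os.path.getsize(path)
    return size_before, size_after


if __name__ == "__main__":
    before, after = vacuum_demo()
    print(f"before VACUUM: {before} bytes, after VACUUM: {after} bytes")
```

The point of the sketch is that deleting stale entries alone does not return space to the ramdisk. The file shrinks only after it is rebuilt, which is why the cleanup and VACUUM steps are performed together.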
Option 2:
Manually recover the host by following these steps:
1. Temporarily increase the configstore ramdisk size to 64MB (initial size is 32MB).
a. Get the configstore ramdisk group ID using:
vsish -e set /sched/groupPathNameToID host system visorfs ramdisks configstore
b. Set the configstore ramdisk max memory to 64MB using:
vsish -e set /sched/groups/<GID>/memAllocationInMB max=64
c. Verify the configstore ramdisk memory allocation using:
vsish -e get /sched/groups/<GID>/memAllocationInMB
Example:
[root@hostname:~] vsish -e set /sched/groupPathNameToID host system visorfs ramdisks configstore
1627
[root@hostname:~]
[root@hostname:~] vsish -e get /sched/groups/1627/memAllocationInMB
memsched-allocation {
min:32
max:32
shares:-3
minLimit:-1
units: 4 -> mb
}
[root@hostname:~] vsish -e set /sched/groups/1627/memAllocationInMB max=64
[root@hostname:~]
[root@hostname:~] vsish -e get /sched/groups/1627/memAllocationInMB
memsched-allocation {
min:32
max:64
shares:-3
minLimit:-1
units: 4 -> mb
}
[root@hostname:~]
2. Forcefully purge any stale device entries currently on the host. This also gives recently unmapped devices (unmapped less than 7 days ago) a chance to be purged:
esxcli storage core device purge -f
3. Delete the 'esx/storage/devices_access' configuration using configstorecli:
configstorecli config current delete -c esx -g storage -k devices_access --all
4. Reboot the host (do not force reboot).
reboot
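As an optional aid for the verification in step 1c above, the output of the vsish get command can be checked with a short script rather than by eye. This is a minimal sketch under stated assumptions: it parses only the key:value layout shown in the example transcript, the function names parse_mem_allocation and max_is are hypothetical, and the sample text is copied from that transcript instead of being captured from a live host.

```python
def parse_mem_allocation(output):
    """Parse the 'key:value' lines of a memAllocationInMB listing into a dict.

    Integer values are converted to int; anything else is kept as a string.
    Lines without a 'key:value' pair (the header and the closing brace) are skipped.
    """
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        value = value.strip()
        if not sep or not value:
            continue
        try:
            fields[key.strip()] = int(value)
        except ValueError:
            fields[key.strip()] = value
    return fields


def max_is(output, expected_mb):
    """Return True if the parsed 'max' field equals expected_mb."""
    return parse_mem_allocation(output).get("max") == expected_mb


# Sample output copied from the example in step 1 (after max was set to 64).
SAMPLE_OUTPUT = """\
memsched-allocation {
min:32
max:64
shares:-3
minLimit:-1
units: 4 -> mb
}
"""

if __name__ == "__main__":
    print(parse_mem_allocation(SAMPLE_OUTPUT))
    print("max is 64MB:", max_is(SAMPLE_OUTPUT, 64))
```

If the check returns False, the ramdisk limit was not raised, and the cleanup steps that follow may fail for lack of space.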
Fix in vSphere 8.0 Update 3:
To resolve this issue on the 8.0 Update 3 release, run the configstore-recovery tool, which is available on the host by default:
[root@hostname:~] /usr/lib/vmware/configmanager/tools/configstore-recovery --recover