Several hosts within the cluster are experiencing problems because the RAM disk's /var partition has become full.

Products

VMware NSX

Issue/Introduction

ESXi hosts show an install failed status in the NSX UI, the hosts become unresponsive, and VMs lose network connectivity.

The following error is seen in the NSX UI:

Failed to initiate installer: Error unlocking file /var/run/liveimgdb.pid: [Errno 28] No space left on device cause = Error unlocking file /var/run/liveimgdb.pid: [Errno 28] No space left on device Please refer to the log file for more details

Note: Because /var is used by multiple operations within ESXi, a failure of this kind may have causes other than the RAM disk issue described in this article.

Logs similar to the following are seen on the ESXi host.

In /var/run/log/nsx-syslog:

2026-01-23T12:52:39.245Z Er(179) nestdb-server[2102534]: NSX 2102534 - [nsx@6876 comp="nsx-esx" subcomp="nsx-nestdb" s2comp="nsx-rpc" tid="2102534" level="ERROR" errorCode="RPC101"]Exception occurred in Observer::RemoteStatus for vmware.nsx.nestdb.NestDb/Modify: leveldb::DB::Write() failed: IO error: /var/lib/vmware/nsx/nestdb/db/MANIFEST-000002: No space left on device

In /var/run/log/vmkernel:

vmkwarning: cpu96:53402323)WARNING: VisorFSRam: 199: Cannot extend visorfs file /var/run/liveimgdb.pid because its ramdisk (var) is full.
vmkwarning: cpu96:53402323)WARNING: VisorFSRam: 199: Cannot extend visorfs file /var/run/liveimgdb.pid because its ramdisk (var) is full.
vmkwarning: cpu280:53403097)WARNING: VisorFSRam: 199: Cannot extend visorfs file /var/run/vmware/nsx-datapath-ctrs.pid because its ramdisk (var) is full.
vmkwarning: cpu280:53403097)WARNING: VisorFSRam: 199: Cannot extend visorfs file /var/run/vmware/nsx-datapath-ctrs.pid because its ramdisk (var) is full.
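
When many hosts are affected, it can help to summarize which ramdisk and which files the VisorFSRam warnings refer to. The following is a minimal Python sketch, assuming the warning text has the same format as the excerpt above; the function name and the idea of feeding it lines copied from the vmkernel log are illustrative, not part of any NSX or ESXi tooling.

```python
import re
from collections import Counter

# Matches the VisorFSRam warning format shown in the vmkernel excerpt above.
FULL_RAMDISK = re.compile(
    r"Cannot extend visorfs file (\S+) because its ramdisk \(([^)]+)\) is full"
)


def count_full_ramdisk_warnings(lines):
    """Count 'ramdisk is full' warnings per (ramdisk, file) pair."""
    counts = Counter()
    for line in lines:
        match = FULL_RAMDISK.search(line)
        if match:
            path, ramdisk = match.groups()
            counts[(ramdisk, path)] += 1
    return counts


# Sample lines copied from the excerpt above.
SAMPLE_LOG = [
    "vmkwarning: cpu96:53402323)WARNING: VisorFSRam: 199: Cannot extend visorfs file /var/run/liveimgdb.pid because its ramdisk (var) is full.",
    "vmkwarning: cpu96:53402323)WARNING: VisorFSRam: 199: Cannot extend visorfs file /var/run/liveimgdb.pid because its ramdisk (var) is full.",
    "vmkwarning: cpu280:53403097)WARNING: VisorFSRam: 199: Cannot extend visorfs file /var/run/vmware/nsx-datapath-ctrs.pid because its ramdisk (var) is full.",
    "vmkwarning: cpu280:53403097)WARNING: VisorFSRam: 199: Cannot extend visorfs file /var/run/vmware/nsx-datapath-ctrs.pid because its ramdisk (var) is full.",
]

if __name__ == "__main__":
    for (ramdisk, path), n in count_full_ramdisk_warnings(SAMPLE_LOG).items():
        print(f"{ramdisk}: {path} ({n} warnings)")
```

Every warning in the sample names the var ramdisk, which matches the symptom described in this article.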

The ramdisk list shows the var ramdisk at its maximum size, with "Free" at 0%:

ESXi_host> localcli system visorfs ramdisk list

Ramdisk Name    System  Include in Coredumps  Reserved   Maximum      Used       Peak Used  Free   Reserved Free  Maximum Inodes  Allocated Inodes  Used Inodes  Mount Point
--------------  ------  --------------------  ---------  -----------  ---------  ---------  -----  -------------  --------------  ----------------  -----------  -----------
root            true    true                  32768 KiB    32768 KiB  14156 KiB  14156 KiB   56 %           56 %           26624              9696         9683  /
etc             true    true                  28672 KiB    28672 KiB   2188 KiB   2280 KiB   92 %           92 %            4096              1120         1110  /etc
opt             true    true                      0 KiB    32768 KiB   1396 KiB   1396 KiB   95 %            0 %            8192              1312         1309  /opt
var             true    true                   5120 KiB    49152 KiB  49152 KiB  49152 KiB    0 %            0 %            8192              1088         1070  /var
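
To check several hosts quickly, the saved output of the command above can be parsed for ramdisks with little or no free space. The following is a minimal Python sketch, assuming the 13-column layout shown above with columns separated by at least two spaces; the function name and the 5% threshold are illustrative choices, not values from VMware documentation.

```python
import re

# Sample output copied from the table above.
SAMPLE = """\
Ramdisk Name    System  Include in Coredumps  Reserved   Maximum      Used       Peak Used  Free   Reserved Free  Maximum Inodes  Allocated Inodes  Used Inodes  Mount Point
--------------  ------  --------------------  ---------  -----------  ---------  ---------  -----  -------------  --------------  ----------------  -----------  -----------
root            true    true                  32768 KiB    32768 KiB  14156 KiB  14156 KiB   56 %           56 %           26624              9696         9683  /
etc             true    true                  28672 KiB    28672 KiB   2188 KiB   2280 KiB   92 %           92 %            4096              1120         1110  /etc
opt             true    true                      0 KiB    32768 KiB   1396 KiB   1396 KiB   95 %            0 %            8192              1312         1309  /opt
var             true    true                   5120 KiB    49152 KiB  49152 KiB  49152 KiB    0 %            0 %            8192              1088         1070  /var
"""


def low_free_ramdisks(text, threshold_pct=5):
    """Return (name, mount_point, free_pct) for ramdisks at or below the threshold."""
    results = []
    for line in text.splitlines():
        # Columns are separated by two or more spaces; values such as
        # "49152 KiB" and "0 %" contain only a single space.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) != 13 or not fields[7].endswith("%"):
            continue  # skips the header, the dashed separator, and blank lines
        free_pct = int(fields[7].rstrip("% "))
        if free_pct <= threshold_pct:
            results.append((fields[0], fields[12], free_pct))
    return results


if __name__ == "__main__":
    for name, mount, free in low_free_ramdisks(SAMPLE):
        print(f"ramdisk {name} mounted at {mount} has {free}% free")
```

With the sample above, only the var ramdisk is reported.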

Environment

NSX 4.2.1.0

Cause

This issue is triggered by a race condition between the two nestdb jumpstart scripts that execute during host boot. The first script is responsible for setting up the nestdb-specific ramdisk, and the second script initiates the nestdb service.

The failure occurs when the second script starts before the first script has finished.

The nestdb service is expected to write to a single location, the nestdb ramdisk. When the race condition is hit, nestdb file writes land in two ramdisks, the var ramdisk and the nestdb ramdisk, instead of only the expected nestdb ramdisk. The writes that land in the var ramdisk consume space in /var.
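
The effect of the ordering can be illustrated with a toy model. The following Python sketch is not the NSX jumpstart code; it only assumes that a write goes to whichever ramdisk is mounted at the longest matching path prefix, and that the nestdb ramdisk is mounted at /var/lib/vmware/nsx/nestdb (an assumption inferred from the path in the nsx-syslog error above). The class and method names are hypothetical.

```python
class ToyHost:
    """Toy model of ramdisk mounts; not an ESXi API."""

    def __init__(self):
        self.mounts = {"/": "root", "/var": "var"}
        self.used_kib = {}

    def mount(self, mount_point, ramdisk):
        self.mounts[mount_point] = ramdisk

    def resolve(self, path):
        # Pick the ramdisk whose mount point is the longest prefix of the path.
        candidates = [
            m for m in self.mounts
            if path == m or path.startswith(m.rstrip("/") + "/")
        ]
        return self.mounts[max(candidates, key=len)]

    def write(self, path, kib):
        ramdisk = self.resolve(path)
        self.used_kib[ramdisk] = self.used_kib.get(ramdisk, 0) + kib
        return ramdisk


NESTDB_MOUNT = "/var/lib/vmware/nsx/nestdb"  # assumed mount point
NESTDB_FILE = NESTDB_MOUNT + "/db/MANIFEST-000002"

# Expected order: the ramdisk is set up first, then the service writes.
ordered = ToyHost()
ordered.mount(NESTDB_MOUNT, "nestdb")
ordered.write(NESTDB_FILE, 64)

# Race: the service writes before the ramdisk setup has finished.
raced = ToyHost()
raced.write(NESTDB_FILE, 64)           # lands in the var ramdisk
raced.mount(NESTDB_MOUNT, "nestdb")
raced.write(NESTDB_FILE, 64)           # lands in the nestdb ramdisk

if __name__ == "__main__":
    print("ordered:", ordered.used_kib)
    print("raced:  ", raced.used_kib)
```

In the ordered case only the nestdb ramdisk is used; in the raced case both the var and nestdb ramdisks receive nestdb data, which is the two-location behavior described above.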

Resolution

The nestdb race condition is resolved in version 4.2.1.1. The fix prevents the ESXi host from writing to both ramdisks simultaneously.

To permanently resolve the issue, upgrade to NSX 4.2.1.1 or higher.
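
When auditing an inventory, a simple version comparison can flag installations that are below the fixed release. The following is a minimal Python sketch, assuming version strings are dot-separated integers such as 4.2.1.0; the function names are illustrative. A result of False only means the version is below 4.2.1.1, not that the host is necessarily affected.

```python
FIXED_IN = (4, 2, 1, 1)  # release named in the Resolution section


def parse_version(text):
    """Turn a dotted version string such as '4.2.1.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def at_or_above_fixed_release(version):
    """Return True if the version is 4.2.1.1 or higher."""
    return parse_version(version) >= FIXED_IN


if __name__ == "__main__":
    for version in ("4.2.1.0", "4.2.1.1", "4.2.2.0"):
        print(version, at_or_above_fixed_release(version))
```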

Workaround:

Rebooting the host is an available workaround, but it is disruptive.

If you believe you have encountered this issue and rebooting is not an option, open a support case for assistance.

Additional Information

"Unknown" status for NSX prepared host due to nestdb DOWN