Edge Transport node is in 'Failed' state in NSX UI
Communication issues with VMs residing inside NSX
VMware NSX-T 4.1.2
Multiple instances of NestDB are started. This causes unpredictable behavior from the perspective of the NestDB clients, as some clients operate on one instance while other clients operate on another.
The NestDB server startup script, like many other LCP daemons, uses pidof to determine if the process has been started. If it does not detect that the process has started, the startup script launches another instance of the watchdog, which in turn attempts to launch another instance of NestDB.
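This works under normal circumstances, but by default pidof does *not* return processes that are in the uninterruptible sleep state (D) or the zombie state (Z) on some Linux distributions, including Ubuntu 20.04 (the Ubuntu version on this Edge VM). If NestDB is in one of these states when the check runs, the startup script concludes that it is not running and starts a second instance.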
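The effect of skipping D and Z processes can be illustrated with a minimal Python sketch. It is not the NSX startup script or the pidof implementation; it only mimics a name-based lookup over /proc/<pid>/stat records. The helper names (parse_stat, matching_pids, read_proc) and the PIDs used in the usage note are hypothetical.

```python
import os
from typing import Iterable, List, NamedTuple


class ProcStat(NamedTuple):
    pid: int
    comm: str   # executable name as reported by the kernel
    state: str  # single-letter state, e.g. R, S, D, Z


def parse_stat(stat_text: str) -> ProcStat:
    """Parse the first three fields of a /proc/<pid>/stat record."""
    # comm is wrapped in parentheses and may contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    pid = int(stat_text[:lparen].strip())
    comm = stat_text[lparen + 1:rparen]
    state = stat_text[rparen + 1:].split()[0]
    return ProcStat(pid, comm, state)


def matching_pids(stats: Iterable[ProcStat], name: str,
                  skip_states: str = "") -> List[int]:
    """Return PIDs whose comm equals name, ignoring the given states."""
    return [p.pid for p in stats
            if p.comm == name and p.state not in skip_states]


def read_proc(proc_root: str = "/proc") -> List[ProcStat]:
    """Collect stat records for all live processes (Linux only)."""
    records = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                records.append(parse_stat(fh.read()))
        except (OSError, ValueError):
            # The process exited between listdir() and open(); skip it.
            continue
    return records
```

With a single record such as parse_stat("1001 (nestdb-server) D 1 1001 1001 0 -1"), matching_pids(..., "nestdb-server", skip_states="DZ") returns an empty list, while the same call without skip_states returns [1001]. A caller relying on the first form would wrongly conclude that no instance is running.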
An example of logging in which NestDB is in the uninterruptible sleep state is shown below:
/var/log/vmware/top-cpu.log:
Tue Sep 05 16:22:17 UTC 2025
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ TGID COMMAND
2##2 nestdb 20 0 83212 24180 14576 D 16.5 0.1 0:00.17 2092 /opt/vmware/nsx-nestdb/bin/nestdb-server --schema /opt/vmware/nsx-nestdb/schema/nestdb.schema --dat+
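When reviewing a large top-cpu.log, rows like the one above can be picked out programmatically. The following Python sketch assumes the column order shown in the header of this excerpt. The names TOP_COLUMNS, parse_top_row, and is_blocked are hypothetical and not part of any NSX tooling.

```python
from typing import Dict, Optional

# Column order assumed from the header line in the excerpt above.
TOP_COLUMNS = ["PID", "USER", "PR", "NI", "VIRT", "RES", "SHR", "S",
               "%CPU", "%MEM", "TIME+", "TGID", "COMMAND"]


def parse_top_row(line: str) -> Optional[Dict[str, str]]:
    """Map one process row to its columns; return None for other lines."""
    # Split at most 12 times so the command line (which contains spaces)
    # stays intact in the final COMMAND field.
    parts = line.split(None, len(TOP_COLUMNS) - 1)
    if len(parts) < len(TOP_COLUMNS) or parts[0] == "PID":
        return None  # timestamp line, header line, or truncated row
    return dict(zip(TOP_COLUMNS, parts))


def is_blocked(row: Dict[str, str]) -> bool:
    """True if the process is in uninterruptible sleep (D) or zombie (Z)."""
    return row["S"] in ("D", "Z")
```

Applied line by line to the log, parse_top_row skips timestamp and header lines, and is_blocked flags rows such as the nestdb-server entry above, whose S column is D.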
pidof does not check for processes in the D or Z state by default because doing so can cause pidof and the calling scripts to hang in such cases.
For further context, see the Ubuntu manpage for pidof(8) or "Why is pidof not working".
This issue is fixed in NSX 4.2.0.
Workaround:
There is no workaround that prevents the issue, but the risk can be reduced by keeping the underlying infrastructure and disks healthy.
To recover, reboot the affected Edge node.
Sample log entries found in syslog on the affected Edge node:
2025-04-24T14:39:39.017Z edgenode NSX 1 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="nestdb" level="ERROR" errorCode="EDG0000057"] DB is not connected while performing write operation
2025-04-24T14:39:39.004Z edgenode nsxa-systemd-helper 7467 - - 2025-04-24T14:39:39Z nsxa 1 nestdb [ERROR] DB is not connected while performing write operation errorCode="EDG0000057"
2025-04-24T14:39:39.164Z edgenode nsxa-systemd-helper 7467 - - 2025-04-24T14:39:39Z nsxa 1 nestdb [ERROR] DB is not connected while performing write operation errorCode="EDG0000057"
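To check how often this error appears in a syslog file, the entries can be grouped by error code. The Python sketch below assumes each line starts with an ISO 8601 timestamp and carries an errorCode="..." field, as in the entries above. The function name errors_by_code is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Patterns assumed from the sample entries above.
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z)\s")
ERROR_CODE_RE = re.compile(r'errorCode="([A-Z]+\d+)"')


def errors_by_code(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group the leading timestamps of log lines by their errorCode value."""
    found: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        code = ERROR_CODE_RE.search(line)
        stamp = TIMESTAMP_RE.match(line)
        if code and stamp:
            found[code.group(1)].append(stamp.group(1))
    return dict(found)
```

For the three entries above, the result contains a single key, EDG0000057, with three timestamps. Repeated occurrences in a short time window are consistent with NestDB clients losing their database connection.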