On an NSX-enabled ESXi host, nsx-syslog can be flooded with repeated log entries originating from the NSX System Health Agent (SHA).
nsx-syslog is flooded with millions of repeated log entries like:
MM: connection 52 closed at remote side @<timestamp>
These entries come from the nsx-sha agent and are often accompanied by errors such as:
failed to close 52: [Errno 104] Connection reset by peer
ClientStopException
Advanced diagnostic logs may show internal socket structures, indicating that an open file descriptor (FD) is repeatedly reported as an input even after close attempts:
MM:inputs [<socket.socket fd=47, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 11111)>, 52, 51, 59, 53]
Pre NSX-T 9.0
This issue is caused by an internal socket within the NSX System Health Agent (SHA) becoming stuck in an infinite error loop after a client-side connection is abruptly closed.
When a client closes the connection, the SHA (netopad service) tries to shut down its end. The shutdown fails with:
[Errno 104] Connection reset by peer
This close failure prevents the removal of the affected file descriptor (FD) from the input set used by select(). Consequently:
select() repeatedly reports the FD as ready for read
The read fails
Close fails again
This forms an infinite cycle:
read ready → read failure → close failure → repeat
This loop prevents the cleanup routine from progressing and floods the logs with repeated error messages.
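The cycle can be illustrated with a minimal Python sketch. This is not the SHA source code: StuckConn, drop_buggy, drop_safe, and run_loop are hypothetical names, and select() is simulated by treating every connection still in the input list as ready. The sketch assumes the cleanup closes the connection before removing it from the input set, which is one way the behavior described above could arise.

```python
import errno


class StuckConn:
    """Hypothetical stand-in for a connection that the peer has reset."""

    def __init__(self, fd):
        self.fd = fd

    def recv(self):
        raise ConnectionResetError(errno.ECONNRESET, "Connection reset by peer")

    def close(self):
        raise ConnectionResetError(errno.ECONNRESET, "Connection reset by peer")


def drop_buggy(inputs, conn):
    # close() raises first, so remove() is never reached and the
    # connection stays in the input set.
    conn.close()
    inputs.remove(conn)


def drop_safe(inputs, conn):
    # Remove from the input set first, so a failed close cannot leave
    # the connection behind.
    inputs.remove(conn)
    try:
        conn.close()
    except OSError:
        pass  # the connection is already gone on the peer side


def run_loop(inputs, drop, max_iterations=5):
    """Simulate select() iterations and return the number of log lines."""
    log_lines = 0
    for _ in range(max_iterations):
        ready = list(inputs)  # a reset connection is always reported readable
        if not ready:
            break
        for conn in ready:
            try:
                conn.recv()
            except OSError:
                log_lines += 1  # "connection N closed at remote side"
                try:
                    drop(inputs, conn)
                except OSError:
                    log_lines += 1  # "failed to close N: [Errno 104] ..."
    return log_lines
```

With drop_buggy, every simulated iteration emits both messages and the connection never leaves the input list; the iteration cap is the only thing that stops the sketch. With drop_safe, the connection is removed on the first pass and the loop goes quiet.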
The issue can be resolved by restarting the NSX System Health Agent (netopad service) on the affected ESXi host. This action resets the agent's internal connection states and clears the stuck file descriptor, effectively stopping the log flood.
Steps:
Restart the netopad service: Execute the following command:
/etc/init.d/netopad restart
This command briefly stops and then restarts the NSX System Health Agent. NSX-related host monitoring handled by this agent may be interrupted momentarily, but the interruption is typically transient.
Verify Resolution: After a few moments, check the nsx-syslog to confirm that the repeated "connection closed at remote side" and "Connection reset by peer" messages have ceased.
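One way to make this check repeatable is to count the two messages and compare the totals a few minutes apart. The sketch below assumes the log is a plain-text file; the function names are hypothetical, and the log path is supplied by the caller (it is commonly /var/log/nsx-syslog.log on ESXi, but confirm the location on the affected host).

```python
# Substrings taken from the log messages quoted in this article.
PATTERNS = ("closed at remote side", "Connection reset by peer")


def count_flood_lines(lines):
    """Return how many lines contain each flood-related substring."""
    counts = {pattern: 0 for pattern in PATTERNS}
    for line in lines:
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1
    return counts


def count_in_file(path):
    """Count flood-related lines in the log file at the given path."""
    # errors="replace" avoids failing on any non-UTF-8 bytes in the log.
    with open(path, errors="replace") as handle:
        return count_flood_lines(handle)
```

Call count_in_file with the log path shortly after the restart and again a few minutes later. The totals include entries written before the restart, so the sign of success is that they stop growing, not that they are zero.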
Note: This command is non-disruptive to the datapath.