Following a recent manual certificate replacement (performed because the CARR script was not functioning at the time), all ESXi hosts in a VxRail cluster began showing a “Host Disconnected” status in the NSX UI.
The Host Disconnected error details for each Transport Node reported:
“Heartbeating between host and manager is down.”
The issue affected all hosts in the cluster simultaneously, rather than a single Transport Node.
Attempts to resolve the errors from the NSX UI failed immediately with no change in host status.
Network connectivity between ESXi hosts and NSX Managers was verified and confirmed healthy:
ICMP ping successful in both directions (hosts ↔ managers)
Netcat connectivity tests from the Host Transport Node CLI were successful on required ports:
nc -zv <NSX_Manager_IP/FQDN> 1234
nc -zv <NSX_Manager_IP/FQDN> 1235
These checks ruled out network connectivity as the cause and pointed instead to a trust or certificate problem between NSX and the hosts.
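Where nc is not available, or when many managers need to be checked at once, the same TCP reachability test can be approximated in Python. The following is a minimal sketch, not part of the original procedure: the function names `check_tcp_port` and `check_manager` and the 3-second timeout are illustrative assumptions, and only ports 1234 and 1235 from the nc commands above are tested.

```python
import socket

# TCP ports tested with nc in this article.
NSX_PORTS = (1234, 1235)


def check_tcp_port(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_manager(host, ports=NSX_PORTS, timeout=3.0):
    """Map each port to True/False reachability for a single NSX Manager."""
    return {port: check_tcp_port(host, port, timeout) for port in ports}
```

Calling `check_manager("<NSX_Manager_IP/FQDN>")` returns a dictionary keyed by port. As with nc -z, a True result only shows that the port accepts TCP connections; it says nothing about certificate trust.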
VMware NSX 4.1.0.2
The issue was caused by a thumbprint mismatch between the registered Compute Manager (vCenter) and NSX, along with stale certificate entries remaining from the prior manual certificate replacement.
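A thumbprint mismatch of this kind can be illustrated with a short Python sketch that computes the SHA-256 thumbprint of the certificate a server currently presents and compares it with a stored value. This is an assumption-laden illustration, not the CARR script's method: the function names are hypothetical, the use of SHA-256 and port 443 is assumed, and the thumbprint NSX holds for the Compute Manager must be obtained separately and passed in as a string.

```python
import hashlib
import ssl


def der_thumbprint(der_bytes):
    """Return the SHA-256 thumbprint of DER bytes as colon-separated hex."""
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def normalize(thumbprint):
    """Strip separators and case so differently formatted values compare equal."""
    return thumbprint.replace(":", "").replace(" ", "").upper()


def thumbprints_match(presented, registered):
    """True if both thumbprints represent the same digest."""
    return normalize(presented) == normalize(registered)


def fetch_thumbprint(host, port=443):
    """Fetch the certificate presented by host:port and return its thumbprint.

    No chain validation is performed; the certificate is only retrieved.
    """
    pem = ssl.get_server_certificate((host, port))
    return der_thumbprint(ssl.PEM_cert_to_DER_cert(pem))
```

For example, `thumbprints_match(fetch_thumbprint("<vCenter_FQDN>"), registered_value)` returning False would be consistent with the mismatch described above, where `registered_value` is a placeholder for the thumbprint recorded in NSX.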
Workaround:
Download the latest CARR script from the documentation article “Using Certificate Analyzer, Results and Recovery (CARR) Script to fix certificate related issues in NSX”.
Secure copy (SCP) the CARR script to the root directory of any NSX Manager, then extract and execute the script following the steps outlined in the same documentation.
Review the CARR script results and remediate all identified certificate and trust-related issues, including any Compute Manager thumbprint mismatches or stale certificates.
Once all fixes have been successfully applied, navigate to the NSX UI, select each host showing a Host Disconnected status, click View error details, and resolve the reported errors.
Confirm that all hosts transition to SUCCESS and UP status in the NSX UI.
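In a large cluster, the final confirmation step can be expressed as a simple check over a list of host records. The sketch below is hypothetical: the record layout (`name`, `state`, `status` keys) is an assumption made for illustration and does not represent a specific NSX API response, so the records would need to be built from whatever export or query is available.

```python
def hosts_not_healthy(hosts):
    """Return the names of hosts that are not both SUCCESS and UP.

    Each host is a dict with hypothetical keys: name, state, status.
    Missing keys are treated as unhealthy.
    """
    unhealthy = []
    for host in hosts:
        state = str(host.get("state", "")).upper()
        status = str(host.get("status", "")).upper()
        if state != "SUCCESS" or status != "UP":
            unhealthy.append(host.get("name", "<unknown>"))
    return unhealthy
```

An empty list means every host in the input reports SUCCESS and UP; any names returned are the hosts that still need attention.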
Manual certificate replacement can leave stale certificate data and thumbprint mismatches if NSX is not updated accordingly.
The CARR script is the recommended tool for detecting and correcting certificate, trust, and thumbprint inconsistencies between NSX and vCenter.
When Host Disconnected states with heartbeat errors are widespread and network connectivity has been confirmed healthy, validate certificate trust and Compute Manager thumbprints early in the troubleshooting process.