ESXi hosts show a "Failed/Host Disconnected" status with the following error:
"Host configuration: Failed to send the HostConfig message.
[TN=TransportNode/<uuid>]. Reason: Failed to send HostConfig RPC to MPA TN:<uuid>. Error: Unable to reach client <tn-uuid>, application SwitchingVertical."
Log entries in /var/log/syslog*:
2025-06-08T14:47:47.602Z In(182) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="2100175" level="INFO"] ConnectionKeeper[6 ssl://NSX-MGR2:1234] attempting connection from timer callback
2025-06-08T14:47:47.602Z In(182) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="2100175" level="INFO"] StreamSocket[157416 Init f:-1 i:-1 ? -> ssl://NSX-MGR2:1234] Created
2025-06-08T14:47:47.603Z In(182) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="2100175" level="INFO"] RpcConnection[157416 Init to ssl://NSX-MGR2:1234 0] Queue threshold size 0
2025-06-08T14:47:47.603Z In(182) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="2100175" level="INFO"] StreamSocket[157416 Open f:54 i:0 ? -> ssl://NSX-MGR2:1234] async_connect
2025-06-08T14:47:47.617Z Wa(180) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="2100175" level="WARNING"] Certificate validation: couldn't find SHA256 digest <SHA256 digest of the APH-TN certificate of another Manager node> in local trust store
2025-06-08T14:47:47.617Z In(182) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="2100175" level="INFO"] StreamSocket[157416 Open f:54 i:0 ? -> ssl://NSX-MGR2:1234] on_connect 336134278-certificate verify failed
2025-06-08T14:47:47.617Z Wa(180) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="2100175" level="WARNING"] StreamConnection[157416 Connecting to ssl://NSX-MGR2:1234 sid:157416] Couldn't connect to 'ssl://NSX-MGR2:1234' (error: 336134278-certificate verify failed)
2025-06-08T14:47:47.617Z Wa(180) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="2100175" level="WARNING"] StreamConnection[157416 Error to ssl://NSX-MGR2:1234 sid:-1] Error 336134278-certificate verify failed
2025-06-08T14:47:47.617Z Wa(180) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="2100175" level="WARNING"] RpcConnection[157416 Connecting to ssl://NSX-MGR2:1234 0] Couldn't connect to ssl://NSX-MGR2:1234 (error: 336134278-certificate verify failed)
2025-06-08T14:47:47.618Z Wa(180) nsx-proxy[2100147]: NSX 2100147 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="2100175" level="WARNING"] RpcTransport[0] Unable to connect to ssl://NSX-MGR2:1234: 336134278-certificate verify failed
2025-06-10T19:33:52.794Z Wa(180) nsx-proxy[8827652]: NSX 8827652 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="8827674" level="WARNING"] Certificate validation: couldn't find SHA256 digest <SHA256 digest of the APH-TN certificate of another Manager node>' in local trust store
2025-06-10T19:33:52.794Z Er(179) nsx-proxy[8827652]: NSX 8827652 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="8827674" level="ERROR" errorCode="NET1111"] Certificate validation failed: 18-self signed certificate
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Certificate:
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Data:
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Version: 3 (0x2)
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Serial Number:
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: <Serial number of the certificate>
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Signature Algorithm: sha256WithRSAEncryption
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Issuer: C=US; ST=California; L=Palo Alto; O=VMware, Inc.; [email protected]; CN=VMware-NSX-ApplProxyHub; UID=4b######-####-4142-####-########cec1
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Validity
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Not Before: Mar 28 22:03:32 2024 GMT
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Not After : Mar 4 22:03:32 2124 GMT
2025-06-10T19:33:52.794Z Er(179)[+] nsx-proxy[8827652]: Subject: C=US; ST=California; L=Palo Alto; O=VMware, Inc.; [email protected]; CN=VMware-NSX-ApplProxyHub; UID=4b######-####-4142-####-########cec1
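To see which Manager endpoint a host is rejecting, the "Couldn't connect ... certificate verify failed" warnings above can be counted per endpoint. The following is a minimal Python sketch, not a supported tool. The function name and the regular expression are illustrative assumptions derived only from the log lines shown in this article, so other NSX versions may word these messages differently.

```python
import re
from collections import Counter

# Matches nsx-proxy warnings of the form shown in the excerpt above, e.g.
#   Couldn't connect to 'ssl://NSX-MGR2:1234' (error: 336134278-certificate verify failed)
# The quotes around the endpoint are optional because both variants appear in the log.
FAILED_CONNECT = re.compile(
    r"Couldn't connect to '?(ssl://[^\s')]+)'? "
    r"\(error: \d+-certificate verify failed\)"
)


def managers_failing_cert_check(lines):
    """Count certificate-verification failures per Manager endpoint.

    `lines` is any iterable of syslog lines (hypothetical helper, not an NSX API).
    """
    counts = Counter()
    for line in lines:
        match = FAILED_CONNECT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, passing an open file handle for a copy of the syslog file to managers_failing_cert_check() returns a Counter keyed by endpoint. The endpoint with a non-zero count is the Manager node whose APH-TN certificate the host does not trust.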
VMware NSX
As a workaround, create a new APH-TN certificate and apply it to the affected Manager node.
Note: The IP address of the affected Manager node appears in the logs and in the output of get managers, where the affected Manager node is shown as standby.
1. Generate a new self-signed certificate from the NSX UI.
2. Note the UUID of the affected Manager node and the ID of the new certificate, then run the following API call to apply the certificate:
POST /api/v1/trust-management/certificates/<certificate_id>?action=apply_certificate&service_type=APH_TN&node_id=<manager_node_id>
3. Re-sync the host, or click the failed node and select Resolve. This should clear the error.
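The API call in step 2 can also be scripted. The sketch below uses only the Python standard library and takes the path and query parameters from step 2. The Manager address, HTTP basic authentication, the empty request body, and both function names are assumptions made for illustration, so confirm them against the NSX API guide for your version before use.

```python
import base64
import urllib.parse
import urllib.request


def build_apply_certificate_url(manager, certificate_id, node_id):
    """Build the apply_certificate URL from step 2 for the given IDs."""
    query = urllib.parse.urlencode({
        "action": "apply_certificate",
        "service_type": "APH_TN",
        "node_id": node_id,
    })
    cert = urllib.parse.quote(certificate_id, safe="")
    return f"https://{manager}/api/v1/trust-management/certificates/{cert}?{query}"


def apply_certificate(manager, certificate_id, node_id, username, password, context=None):
    """POST the request and return the HTTP status code.

    Assumptions: basic authentication and an empty body. `context` is an
    optional ssl.SSLContext, for example one that trusts the Manager's CA.
    """
    url = build_apply_certificate_url(manager, certificate_id, node_id)
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, context=context) as response:
        return response.status
```

Keeping URL construction separate from the network call makes it possible to print and review the exact request before sending it to a production Manager.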
Afterwards, running get managers on the transport node should show all three Managers in the Connected state:
ESXi01> get managers
Thu Jun 12 2025 UTC 16:23:13.256
- <NSX-MGR1> Connected (NSX-RPC) *
- <NSX-MGR2> Connected (NSX-RPC)
- <NSX-MGR3> Connected (NSX-RPC)
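When many transport nodes need checking, captured get managers output can be evaluated with a short script. This is a minimal Python sketch that assumes the output has the line format shown above. The function names are hypothetical, and collecting the CLI output from each host is left to whatever tooling is already in place.

```python
import re

# Matches lines such as "- <NSX-MGR1> Connected (NSX-RPC) *" from the sample above.
MANAGER_LINE = re.compile(r"^\s*-\s+(\S+)\s+(.+?)\s+\(NSX-RPC\)(\s+\*)?\s*$")


def parse_get_managers(output):
    """Return (manager, state, starred) tuples parsed from `get managers` output."""
    managers = []
    for line in output.splitlines():
        match = MANAGER_LINE.match(line)
        if match:
            managers.append((match.group(1), match.group(2), match.group(3) is not None))
    return managers


def all_connected(output, expected=3):
    """True if exactly `expected` Managers are listed and all report Connected."""
    managers = parse_get_managers(output)
    return len(managers) == expected and all(
        state == "Connected" for _, state, _ in managers
    )
```

A host for which all_connected() returns False still has at least one Manager missing or not in the Connected state, and its logs are worth rechecking for the certificate warnings.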