Edge/ESXi hosts are in a failed state with error "Failed to send HostConfig RPC to MPA"

Article ID: 372128

Products

VMware NSX

Issue/Introduction

ESXi hosts show a "Failed/Host Disconnected" status with the following errors:

  • "Host configuration: Failed to send the HostConfig message. [TN=TransportNode/<uuid>]. Reason: Failed to send HostConfig RPC to MPA TN:<uuid>. Error: Unable to reach client <tn-uuid>, application SwitchingVertical."

  • "Failed to get response from NSX-SFHC component."

Environment

VMware NSX, VMware NSX-T Data Center

Cause

The APH certificate is in a revoked state, so hosts and managers cannot connect to each other. The failure appears in the logs as "Error 336134278-certificate verify failed".

Resolution

To recover the ESXi hosts or Edge nodes, perform the following steps on every faulty node:

# Start nestdb-cli:

/opt/vmware/nsx-nestdb/bin/nestdb-cli

# Check whether an entry for revoked certificates exists:

get vmware.nsx.nestdb.CrlCertificatesCacheMsg

# If an entry exists, delete it:

delete vmware.nsx.nestdb.CrlCertificatesCacheMsg {"id":0}

# Leave nestdb-cli, then restart nsx-proxy from the host shell:

/etc/init.d/nsx-proxy restart

Additional Information

ESXi log location to verify the behavior:

/var/log/nsx-syslog:

--------------

YYYY-MM-DDTHH:MM:SS.780Z nsx-proxy[173598788]: NSX 173598788 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="173598836" level="ERROR" errorCode="RPC503"] RpcTransport[0]::RemoteService[31663ecf-2c68-4a2e-95e0-22b12d833cf0] Failed to resolve service: 336134278-certificate verify failed

YYYY-MM-DDTHH:MM:SS.780Z nsx-proxy[173598788]: NSX 173598788 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="173598836" level="WARNING"] Certificate validation: couldn't find SHA256 digest '75639adc921d8c3248e036e5712299173bb0e69574220fa775a0be25473c2a23' in local trust store

YYYY-MM-DDTHH:MM:SS.780Z nsx-proxy[173598788]: NSX 173598788 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="173598836" level="ERROR" errorCode="NET1111"] Certificate validation failed: 18-self signed certificate Certificate:   <certificate data>

YYYY-MM-DDTHH:MM:SS.780Z nsx-proxy[173598788]: NSX 173598788 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="173598836" level="WARNING"] StreamConnection[9989 Connecting to ssl://x.x.x.x:1234 sid:9989] Couldn't connect to 'ssl://x.x.x.x:1234' (error: 336134278-certificate verify failed)

YYYY-MM-DDTHH:MM:SS.780Z nsx-proxy[173598788]: NSX 173598788 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-net" tid="173598836" level="WARNING"] StreamConnection[9989 Error to ssl://x.x.x.x:1234 sid:-1] Error 336134278-certificate verify failed

...............


YYYY-MM-DDTHH:MM:SS.527Z nsx-proxy[170067301]: NSX 170067301 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="170067334" level="INFO"] RpcConnection[1373964 Connected to ssl://x.x.x.x:1234 0] Closing (remote certificates revoked)

YYYY-MM-DDTHH:MM:SS.528Z nsx-proxy[170067301]: NSX 170067301 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="170067334" level="INFO"] RpcConnection[1373964 Closed to ssl://x.x.x.x:1234 0] Notifying channels on connection down (remote certificates revoked)

YYYY-MM-DDTHH:MM:SS.565Z nsx-proxy[170067301]: NSX 170067301 - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="170067334" level="INFO"] RpcConnection[1373965 Connected to ssl://x.x.x.x:1234 0] Closing (remote certificates revoked)
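
When many nodes need checking, the signatures in the excerpts above can be counted with a short script. The following is a minimal Python sketch, not a VMware tool: the function names and the signature labels are hypothetical, and the match strings are copied from the log lines shown in this article. It assumes each log entry sits on a single line and carries the `[nsx@6876 ...]` block in the form shown above.

```python
import re
from collections import Counter

# Substrings copied from the log excerpts in this article.
# The dictionary keys are illustrative labels, not NSX terminology.
SIGNATURES = {
    "verify_failed": "336134278-certificate verify failed",
    "revoked": "remote certificates revoked",
    "digest_missing": "couldn't find SHA256 digest",
}

# Matches key="value" pairs inside the [nsx@6876 ...] block.
FIELD_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_fields(line):
    """Return the key="value" pairs from the [nsx@6876 ...] block of one line."""
    start = line.find("[nsx@6876")
    if start == -1:
        return {}
    end = line.find("]", start)
    block = line[start:end] if end != -1 else line[start:]
    return dict(FIELD_RE.findall(block))


def scan(lines):
    """Count nsx-proxy lines that carry each certificate-related signature."""
    counts = Counter()
    for line in lines:
        if parse_fields(line).get("subcomp") != "nsx-proxy":
            continue
        for label, needle in SIGNATURES.items():
            if needle in line:
                counts[label] += 1
    return counts


def scan_file(path):
    """Scan a log file, tolerating undecodable bytes."""
    with open(path, errors="replace") as handle:
        return scan(handle)
```

For example, `scan_file("/var/log/nsx-syslog")` uses the path as written in this article. Non-zero counts for `verify_failed` or `revoked` suggest that the node matches the symptoms described here, but they do not confirm the cause on their own. Check the CrlCertificatesCacheMsg entry as described in the Resolution section before deleting anything.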