After replacing the APH-AR certificate, the connection between the Manager and the Transport Node (TN) was disconnected.

Article ID: 417151

Products

VMware NSX

Issue/Introduction

  • The APH-AR certificate has been replaced.
  • The Transport Node (TN) connection status shows as disconnected, for example:

GET /api/v1/transport-zones/transport-node-status
...
"node_status": {
  "host_node_deployment_status": "HOST_DISCONNECTED",
  "inventory_sync_paused": false,
  "last_heartbeat_timestamp": #############,
  "last_sync_time": #############,
  "lcp_connectivity_status": "UNKNOWN",
  "lcp_connectivity_status_details": [],
  "mpa_connectivity_status": "DOWN",
  "mpa_connectivity_status_details": "Client has not responded to {2} consecutive heartbeats. Port {1234} between Host to NSX Manager must be open, Please check underlay physical firewalls and host hypervisor firewalls for troubleshooting.",
....
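
The status check above can be scripted as a quick triage step. The following is a minimal sketch, not an official tool: it only inspects a `node_status` object that has already been retrieved, the function name `matches_symptom` is hypothetical, and the sample payload is trimmed to fields shown in the output above.

```python
import json


def matches_symptom(node_status: dict) -> bool:
    """Return True when a node_status object shows the symptom described here.

    Hypothetical helper: it checks only the two fields quoted in this article.
    """
    return (
        node_status.get("host_node_deployment_status") == "HOST_DISCONNECTED"
        and node_status.get("mpa_connectivity_status") == "DOWN"
    )


# Sample trimmed to fields that appear in the article's output.
sample = json.loads("""
{
  "node_status": {
    "host_node_deployment_status": "HOST_DISCONNECTED",
    "inventory_sync_paused": false,
    "lcp_connectivity_status": "UNKNOWN",
    "mpa_connectivity_status": "DOWN"
  }
}
""")

print(matches_symptom(sample["node_status"]))
```

A match only indicates that the node is disconnected; the log check in the next step is still needed to confirm that CRL validation is the reason.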

  • Logs on the TN show that the proxy connection was initially established successfully, dropped about 24 hours later, and could not be re-established. The logs indicate a certificate validation failure related to CRL checking.
  • Relevant entries in /var/log/syslog*:

YYYY-MM-DDT00:33:55.343Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" tid="2638378" level="INFO"] [PROXY-MAIN] Start nsx proxy
YYYY-MM-DDT00:33:55.343Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" tid="2638378" level="INFO"] [PROXY-MAIN] Build Number: 24150866
YYYY-MM-DDT00:33:55.343Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" tid="2638378" level="INFO"] [PROXY-MAIN] Running on node: nsx-edge
YYYY-MM-DDT00:33:55.933Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-net" tid="2638400" level="INFO"] StreamSocket[8 Open f:41 i:1919881693 ? -> ssl://###.###.###.###:1234] on_connect 0-Success
YYYY-MM-DDT00:33:55.933Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-net" tid="2638400" level="INFO"] StreamConnection[8 Connected to ssl://###.###.###.###:1234 sid:8] Connected from ssl-tcp://###.###.###.###:39712 to server with certificate with sha256 fingerprint '################################################################'

YYYY-MM-DDT23:49:03.448Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-net" tid="2638400" level="INFO"] StreamSocket[8 Closing f:41 i:1919881693 ###.###.###.###:39712 -> ssl://###.###.###.###:1234] DoClose
YYYY-MM-DDT23:49:03.449Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="2638400" level="INFO"] ConnectionKeeper[4 ssl://###.###.###.###:1234] resetting connection, will reconnect
YYYY-MM-DDT23:49:03.449Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="2638400" level="INFO"] ConnectionKeeper[4 ssl://###.###.###.###:1234] closing and releasing connection cid:8
YYYY-MM-DDT23:49:03.449Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="2638400" level="INFO"] RpcConnection[8 Closed to ssl://###.###.###.###:1234 0] Closing (closed by user)
YYYY-MM-DDT23:49:03.449Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="mpa-proxy-lib" tid="2638378" level="INFO"] AphConnectionManager: ConnectionDown for endpoint ssl://###.###.###.###:1234 for reason remote certificates CRL validation failed
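
To locate the failure quickly in a large syslog, the `ConnectionDown` line can be matched with a regular expression. This is a sketch under stated assumptions: the pattern is derived only from the single log line quoted above, and the names `CRL_FAILURE` and `find_crl_failures` are hypothetical.

```python
import re

# Pattern built from the ConnectionDown line quoted in this article.
CRL_FAILURE = re.compile(
    r"AphConnectionManager: ConnectionDown for endpoint (?P<endpoint>\S+) "
    r"for reason (?P<reason>.*CRL validation failed)"
)


def find_crl_failures(lines):
    """Yield (timestamp, endpoint, reason) for each matching syslog line."""
    for line in lines:
        match = CRL_FAILURE.search(line)
        if match:
            # The timestamp is the first space-delimited field of the line.
            timestamp = line.split(" ", 1)[0]
            yield timestamp, match.group("endpoint"), match.group("reason")


# Two lines copied from the excerpt above (placeholders left as-is).
sample_lines = [
    'YYYY-MM-DDT00:33:55.343Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" '
    'subcomp="nsx-proxy" tid="2638378" level="INFO"] [PROXY-MAIN] Start nsx proxy',
    'YYYY-MM-DDT23:49:03.449Z hostname NSX 2638378 - [nsx@6876 comp="nsx-edge" '
    'subcomp="nsx-proxy" s2comp="mpa-proxy-lib" tid="2638378" level="INFO"] '
    'AphConnectionManager: ConnectionDown for endpoint ssl://###.###.###.###:1234 '
    'for reason remote certificates CRL validation failed',
]

failures = list(find_crl_failures(sample_lines))
for failure in failures:
    print(failure)
```

To scan a real log, pass an open file handle in place of `sample_lines`, since the function accepts any iterable of lines.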

Environment

VMware NSX versions prior to 4.2.1
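
When checking many deployments, the version boundary above can be expressed as a simple comparison. This is a minimal sketch that assumes version strings are purely numeric and dot-separated (for example `4.2.0.1`); the helper name `is_affected` is hypothetical.

```python
def is_affected(version: str) -> bool:
    """Return True when a dotted NSX version string is earlier than 4.2.1.

    Assumes purely numeric, dot-separated components; any other format
    raises ValueError from int().
    """
    parts = tuple(int(part) for part in version.split("."))
    # Tuple comparison is element-wise, so (4, 2, 0, 1) < (4, 2, 1)
    # while (4, 2, 1, 0) is not.
    return parts < (4, 2, 1)


for candidate in ("4.1.2", "4.2.0.1", "4.2.1", "4.2.1.0"):
    print(candidate, is_affected(candidate))
```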

Cause

When the NSX Manager uses a CA-signed certificate, a defect in the CRL manager workflow on the Management Plane (MP) terminates the host-to-MP connection about 24 hours after it is established.
After the connection drops, the host cannot reconnect because CRL validation continues to fail.

Resolution

1. Use a self-signed APH certificate. This bypasses the faulty CRL validation logic that affects CA-signed certificates.

2. Upgrade to NSX 4.2.1 or later.