Alarms similar to the following are frequently raised in the NSX UI and clear shortly afterward:
2024-10-20T13:33:51.159Z edge.local NSX 5829 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="CRITICAL" eventFeatureName="edge_health" eventType="failure_domain_down" eventSev="critical" eventState="On" entId="2211cef2-xxxx-xxxx-xxxx-7d98603929e2"] All members of failure domain 2211cef2-xxxx-xxxx-xxxx-7d98603929e2 are down.
2024-10-20T13:34:52.644Z edge.local NSX 5829 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="CRITICAL" eventFeatureName="edge_health" eventType="failure_domain_down" eventSev="critical" eventState="Off" entId="2211cef2-xxxx-xxxx-xxxx-7d98603929e2"] All members of failure domain 2211cef2-xxxx-xxxx-xxxx-7d98603929e2 are reachable.
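To gauge how often the alarm flaps and how long each occurrence lasts, the On/Off pairs in the syslog can be matched up. The following is a minimal sketch, assuming the log lines keep the field order shown above (`eventType`, then `eventState`, then `entId`). The function name `alarm_durations` is hypothetical and not part of any NSX tooling.

```python
import re
from datetime import datetime

# Captures timestamp, eventType, eventState and entId from an NSX alarm syslog line.
# Assumes the fields appear in this order, as in the excerpts above.
ALARM_RE = re.compile(
    r'^(?P<ts>\S+)\s.*?eventType="(?P<type>[^"]+)"'
    r'.*?eventState="(?P<state>On|Off)".*?entId="(?P<ent>[^"]+)"'
)

def parse_ts(ts):
    # Replace the trailing "Z" so datetime.fromisoformat works on older Python 3 releases.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def alarm_durations(lines, event_type="failure_domain_down"):
    """Return (entity_id, raised_at, cleared_at, seconds) for each On/Off pair."""
    open_alarms = {}   # entity id -> time the alarm was raised
    results = []
    for line in lines:
        m = ALARM_RE.search(line.strip())
        if not m or m.group("type") != event_type:
            continue
        ts = parse_ts(m.group("ts"))
        ent = m.group("ent")
        if m.group("state") == "On":
            # Keep the earliest "On" if several arrive before an "Off".
            open_alarms.setdefault(ent, ts)
        elif ent in open_alarms:
            start = open_alarms.pop(ent)
            results.append((ent, start, ts, (ts - start).total_seconds()))
    return results
```

Passing the lines of an exported Edge syslog file to `alarm_durations` yields one tuple per raise/clear cycle. Many short cycles point to an intermittent connectivity problem rather than a sustained outage.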
VMware NSX 4.x
The alarm is raised because the Edge node repeatedly loses its network connection to the NSX Manager and then reconnects.
NSX manager controller logs:
------------------------
2024-10-22T16:31:58.515Z INFO CCP-5ae7a2dc-xxxx-xxxx-xxxx-d88c3d6747e0:worker-2 NettyConnection 1297606 - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="ccp"] Connection closed received NettyConnection(NettyChannel(local=xx.xx.xx.xx:1235, remote=xx.xx.xx.xx:58231), active=false)
2024-10-22T16:31:58.516Z INFO nsx-rpc:CCP-5ae7a2dc-xxxx-xxxx-xxxx-d88c3d6747e0:user-executor-3 VersionMastershipServiceImpl 1297606 - [nsx@6876 comp="nsx-controller" level="INFO" subcomp="handshake-server"] closeStream id b0de2395-xxxx-xxxx-xxxx-526aa3e9defe status Status(code=COMMUNICATION_ERROR, msg=null)
------------------------
Edge node logs:
------------------------
Connection to the management plane (MP) on port 1234:
2024-10-22T14:07:35.919Z gldc13-edge1.danfoss.local NSX 5211 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="5967" level="INFO"] RpcConnection[4455 Closed to ssl://xx.xx.xx.xx:1234 0] Notifying channels on connection down (network error)
Connection to the central control plane (CCP) on port 1235:
2024-10-22T15:01:17.993Z gldc13-edge1.danfoss.local NSX 5211 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="5967" level="INFO"] RpcConnection[4572 Closed to ssl://xx.xx.xx.xx:1235 0] Notifying channels on connection down (network error)
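To see whether the disconnects affect the MP channel, the CCP channel, or both, the nsx-proxy messages can be tallied per remote port. This is a minimal sketch, assuming the message format shown in the excerpts above and IPv4 or hostname remotes. The function name `count_disconnects` and the `PORT_ROLE` mapping are illustrative only.

```python
import re
from collections import Counter

# Matches nsx-proxy "connection down" messages and captures the remote port and reason.
# Assumes an IPv4 address or hostname before the port, as in the excerpts above.
DOWN_RE = re.compile(
    r'RpcConnection\[\d+ Closed to ssl://[^\s:]+:(?P<port>\d+) \d+\] '
    r'Notifying channels on connection down \((?P<reason>[^)]+)\)'
)

# Port roles as labelled in the log excerpts.
PORT_ROLE = {1234: "MP", 1235: "CCP"}

def count_disconnects(lines):
    """Count connection-down events keyed by (role, port, reason)."""
    counts = Counter()
    for line in lines:
        m = DOWN_RE.search(line)
        if m:
            port = int(m.group("port"))
            role = PORT_ROLE.get(port, "unknown")
            counts[(role, port, m.group("reason"))] += 1
    return counts
```

If both channels show `network error` disconnects around the same times, that is consistent with a problem in the network path between the Edge node and the NSX Manager rather than with a single NSX service.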
This is not an NSX issue. The underlying infrastructure network between the Edge node and the NSX Manager needs to be stabilized.
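While the network is being investigated, a repeated TCP connect test against the two ports seen in the logs (1234 and 1235) can help show whether the path is intermittently failing. The following is a rough sketch only, assuming it is run from a host on the same management network as the Edge node. The function names and default values are hypothetical, and a successful TCP connect does not by itself prove that the NSX RPC channel is healthy.

```python
import socket
import time

def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False

def probe(host, ports=(1234, 1235), attempts=10, interval=1.0, timeout=3.0):
    """Try each port `attempts` times and return the number of failures per port."""
    failures = {p: 0 for p in ports}
    for i in range(attempts):
        for p in ports:
            if not tcp_reachable(host, p, timeout=timeout):
                failures[p] += 1
        if i < attempts - 1:
            time.sleep(interval)
    return failures
```

For example, `probe("<nsx-manager-ip>", attempts=60, interval=5.0)` samples both ports for about five minutes. Non-zero failure counts that line up with the alarm timestamps suggest the fault lies in the physical or virtual network path.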