Edge Health - Failure Domain Down - Edge in Degraded status and Controller in disconnected state

Article ID: 398172

Updated On:

Products

VMware NSX

Issue/Introduction

- An Edge Health "Failure Domain Down" alarm is generated in NSX Manager, and the Edge is reported in Degraded status.

- The Edge logs show a network issue towards one of the Managers at the time the alarm was first generated:

2025-05-03T12:52:18  EDGE-Node NSX 5234 - [nsx@6876 comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="5921" level="INFO"] RpcConnection[346 Connected to ssl://#.#.#.#:1235 0] Closing (network error)

- Ping from the Edge to all Managers succeeds, and netcat confirms that TCP ports 1234 and 1235 are reachable (nc -zv <IP of Manager> 1234, nc -zv <IP of Manager> 1235).

> get managers --> shows Connected

> get controllers --> shows Disconnected with Network Error as the reason

> /etc/vmware/nsx/controller-info.xml contains the controller information
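
The log and port checks above can be scripted from the Edge root shell. The following is a minimal sketch, not an official tool: the function names are hypothetical, the log file to scan is passed in by the caller (its location is not stated in this article), and the Manager IPs are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical and not part of NSX.

# Count nsx-rpc "Closing (network error)" entries in a log file.
# Usage: count_rpc_network_errors <logfile>
count_rpc_network_errors() {
    local logfile="$1"
    # grep -c prints the number of matching lines; "|| true" keeps a
    # zero count from being treated as a failure by the caller.
    grep -c 's2comp="nsx-rpc".*Closing (network error)' "$logfile" || true
}

# Probe TCP 1234 and 1235 on each Manager with netcat (3 second timeout).
# Returns non-zero if any probe fails.
# Usage: check_manager_ports <manager-ip> [<manager-ip> ...]
check_manager_ports() {
    local failed=0 ip port
    for ip in "$@"; do
        for port in 1234 1235; do
            if nc -z -w 3 "$ip" "$port" >/dev/null 2>&1; then
                echo "OK   $ip:$port"
            else
                echo "FAIL $ip:$port"
                failed=1
            fi
        done
    done
    return "$failed"
}
```

For example, `check_manager_ports <IP of Manager 1> <IP of Manager 2> <IP of Manager 3>` prints one line per Manager and port. In the scenario described here every probe reports OK even though get controllers shows Disconnected, which is why the port check alone does not rule out this issue.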

Environment

VMware NSX

Cause

Connectivity issues from the Edge towards the Manager cause the Edge Health Failure Domain Down alarm in NSX Manager.

Resolution

To resolve this issue, restart the nsx-proxy service on the affected Edge Transport Node. (The nsx-proxy service connects to the CCP on the NSX Manager appliance to get new configurations.)

/etc/init.d/nsx-proxy restart 

After the nsx-proxy restart, run get controllers again. The controller should now show as Connected.
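
As a complementary check from the root shell, the session to TCP 1235 (the port seen in the log entry above) can be looked for in the socket table. This is a hedged sketch: it assumes the `ss` utility is available on the Edge, the function name is hypothetical, and get controllers remains the check this article relies on.

```shell
#!/usr/bin/env bash
# Sketch only: assumes `ss -tn` output is piped in on stdin.

# Count established TCP sessions whose peer port matches the argument.
# Usage: ss -tn | count_established_to_port 1235
count_established_to_port() {
    local port="$1"
    awk -v suffix=":${port}" '
        # Column 1 is the state, column 5 is "Peer Address:Port".
        $1 == "ESTAB" && substr($5, length($5) - length(suffix) + 1) == suffix { n++ }
        END { print n + 0 }
    '
}
```

A count of 0 after the restart would suggest that the session has not been re-established yet.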

Additional Information