- Following an upgrade, an Edge Node enters a "Ghost Edge" state where local health services appear functional, but the node is completely isolated from the cluster.
- Running 'get edge-cluster status' on the Edge CLI shows the Edge Node Status as Down and the High Availability State stuck in Discover.
- The NSX UI shows the Edge state as Up and lists the logical routers with their total count, but 'get logical-routers' on the Edge CLI returns no entries.
- BFD/Healthcheck Sessions to peers show a status of Unreachable.
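
The symptom pattern above can be expressed as a small check. This is a minimal Python sketch, assuming the values have already been read manually from the NSX UI and the Edge CLI; the EdgeObservation class, its field names, and the function name are hypothetical and are not part of any NSX API or tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EdgeObservation:
    """Values read by hand from the NSX UI and Edge CLI (hypothetical field names)."""
    ui_edge_state: str                 # Edge state shown in the NSX UI
    cli_node_status: str               # Edge Node Status from 'get edge-cluster status'
    cli_ha_state: str                  # High Availability State from the same command
    cli_logical_router_count: int      # entries returned by 'get logical-routers'
    peer_session_statuses: List[str]   # BFD/healthcheck status for each peer


def _is(value: str, expected: str) -> bool:
    # Case-insensitive comparison that ignores surrounding whitespace.
    return value.strip().lower() == expected


def matches_ghost_edge(obs: EdgeObservation) -> bool:
    """Return True only when every symptom listed above is present."""
    peers_unreachable = bool(obs.peer_session_statuses) and all(
        _is(status, "unreachable") for status in obs.peer_session_statuses
    )
    return (
        _is(obs.ui_edge_state, "up")
        and _is(obs.cli_node_status, "down")
        and _is(obs.cli_ha_state, "discover")
        and obs.cli_logical_router_count == 0
        and peers_unreachable
    )
```

The check is deliberately strict: if any one symptom is absent, the node may be affected by a different issue and should be investigated separately.
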
VMware NSX 4.1.1
The Edge Node internal database fails to re-initialize following the upgrade, preventing the population of the logical routing table and the re-establishment of Geneve Tunnels (TEPs) and Management Plane communication.
Workaround:
Log in to the NSX Manager user interface.
Navigate to the Edge Transport Nodes section.
Deploy a replacement Edge Node.
Validate the resolution by running 'get logical-routers' and confirming that the expected T0/T1 instances are populated.
Run 'get edge-cluster status' to confirm that the HA state has changed from Discover to Active or Standby, and that the peer connectivity status is Up.
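
The two validation steps can be recorded as a simple pass/fail checklist. The following Python sketch assumes the operator supplies the router names expected on the node, the names actually listed by 'get logical-routers', the HA state, and the per-peer connectivity status, all read manually from the CLI. The function name and parameters are hypothetical, and the router names in the usage example are placeholders.

```python
from typing import Iterable, List


def validation_problems(
    expected_routers: Iterable[str],
    observed_routers: Iterable[str],
    ha_state: str,
    peer_statuses: Iterable[str],
) -> List[str]:
    """Return a list of failed checks; an empty list means validation passed."""
    problems: List[str] = []

    # Every expected T0/T1 instance must appear in the CLI output.
    missing = sorted(set(expected_routers) - set(observed_routers))
    if missing:
        problems.append("missing logical routers: " + ", ".join(missing))

    # The HA state must have left Discover for Active or Standby.
    if ha_state.strip().lower() not in {"active", "standby"}:
        problems.append("HA state is " + ha_state.strip() + ", expected Active or Standby")

    # Every peer must report Up; having no peers listed is also treated as a failure.
    statuses = [s.strip().lower() for s in peer_statuses]
    if not statuses or any(s != "up" for s in statuses):
        problems.append("not all peers report connectivity status Up")

    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for problem in validation_problems({"T0-GW"}, set(), "Discover", ["Unreachable"]):
        print(problem)
```

Returning the list of failed checks, rather than a single boolean, makes it clear which validation step still needs attention.
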