Note: The preceding log excerpts are only examples. Dates, times, and other environment-specific values will differ in your environment.
VMware NSX 9.x
VMware NSX 4.x
VMware NSX-T Data Center 3.x
This issue is cosmetic and due to a reporting defect in the NSX Manager UI aggregation logic. It affects the monitoring dashboard and alarms only. The actual network forwarding (NSX Data Plane) functions correctly via the remaining active NSX Edge node.
When the TEP and BGP sessions on one Edge node go down completely while management plane connectivity to that Edge node remains active, the system incorrectly weighs the failure of that Edge node's routing components against the entire Gateway status. Instead of aggregating the healthy status of the second node into an overall "DEGRADED" status, it interprets the loss of redundancy as a total loss of connectivity.
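The expected aggregation can be illustrated with a minimal sketch. This is not the NSX Manager implementation. The names `Status` and `aggregate_gateway_status` are hypothetical, and each Edge node's health is reduced to a single boolean for illustration.

```python
from enum import Enum
from typing import Sequence


class Status(Enum):
    UP = "UP"
    DEGRADED = "DEGRADED"
    DOWN = "DOWN"


def aggregate_gateway_status(edge_node_healthy: Sequence[bool]) -> Status:
    """Combine per-Edge-node health into one Gateway status.

    Hypothetical illustration only: True means the node's TEP and BGP
    sessions are up, False means they are down.
    """
    if not edge_node_healthy or not any(edge_node_healthy):
        # No node can forward traffic: a genuine loss of connectivity.
        return Status.DOWN
    if all(edge_node_healthy):
        # Every node is healthy: full redundancy.
        return Status.UP
    # At least one node still forwards traffic: redundancy is lost,
    # connectivity is not.
    return Status.DEGRADED
```

With one failed node and one healthy node, this logic yields DEGRADED, which is the status the UI should report in the scenario described above.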
This issue is resolved in VCF 9.0.2, available at Broadcom downloads.
If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.
To confirm the actual state of the environment, verify the status from the CLI of the NSX Edge node that is still active.
SSH to the active NSX Edge node as admin.
Verify BGP status:
get route bgp neighbor
Ensure the state is Established.
Verify TEP status:
get tunnel-ports
Ensure tunnels are Up.
If the CLI confirms the second NSX Edge node is healthy, the UI status of "DOWN" can be disregarded as a false positive.
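If the command output is saved to a text file, the manual check can be repeated with a small helper. The sketch below makes several assumptions: the function name `count_state` is hypothetical, the output layout is not parsed (it varies by version), and the operator already knows how many BGP neighbors or tunnels to expect. It only counts whole-word occurrences of a state keyword.

```python
import re


def count_state(cli_output: str, state: str) -> int:
    """Count whole-word, case-insensitive occurrences of a state keyword.

    Assumption: each healthy neighbor or tunnel contributes exactly one
    occurrence of the keyword (for example "Established" or "Up") in the
    saved output. Check this against your own output before relying on it.
    """
    pattern = re.compile(r"\b" + re.escape(state) + r"\b", re.IGNORECASE)
    return len(pattern.findall(cli_output))
```

For example, if two BGP neighbors are expected, `count_state(saved_text, "Established")` should return at least 2. A lower count means the output needs a manual review.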