Connectivity lost when Edge failover occurs with only static route in use.

Article ID: 410370

Products

VMware NSX

Issue/Introduction

  • Northbound connectivity is lost for traffic through a Tier-0 that relies on a static route after a failover from the active to the standby Edge node.
  • NSX Traceflow from southbound VMs to external IPs shows "No Route" at the Tier-0 even though the static route appears to be configured.
  • The static route has a scope set on its next hop.
  • When the static route's presence is validated on the Edge nodes via CLI, the route is present on only one node (the one that is now standby). The active node is missing the required route. To verify the presence of the route on each Edge node:
    1. SSH to the Edge node housing the Tier-0.
    2. Log in as admin.
    3. Run "get logical-routers".
    4. Identify the SR of the Tier-0 and note its VRF number.
    5. Run "vrf <VRF ID obtained in step 4>".
    6. Run "get routes".
    7. Within the routes listed, the static route required for northbound connectivity is present on one node but missing on the other.
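
The comparison in step 7 can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the route prefixes have already been noted by hand from each Edge node's SR VRF, and it does not parse real CLI output. The node names, prefixes, and the function name nodes_missing_route are hypothetical.

```python
def nodes_missing_route(routes_by_node, prefix):
    """Return the sorted names of nodes whose route set lacks `prefix`."""
    return sorted(
        node for node, prefixes in routes_by_node.items()
        if prefix not in prefixes
    )

# Hypothetical data: prefixes noted manually from each Edge node's SR VRF.
routes_by_node = {
    "edge-1": {"10.0.0.0/24"},
    "edge-2": {"10.0.0.0/24", "0.0.0.0/0"},
}

# The static route under investigation (here a default route, as an example).
missing = nodes_missing_route(routes_by_node, "0.0.0.0/0")
print(missing)
```

A non-empty result names the nodes that would have no route if they became active.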

Environment

VMware NSX
VMware NSX-T Data Center

Cause

This issue may be caused by the static route having been configured with a scope set on its next hop. When a next hop is created it can have a scope defined, and this scope determines where the static route is applied.

A scope can be an interface, a gateway, an IPSec session, or a segment. If a value such as an interface is set here, the scope needs to include an interface on both edge nodes. If only one interface is selected, such as an interface on edge 1, then edge 2 will not have the static route realised. In the image below, only an interface from edge 2 is in the scope, so edge 1 will not have the static route pushed to it. When edge 2 then fails over to edge 1, no valid realisation interface is present on edge 1, so the static route is not applied.
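
The effect of the scope can be modelled roughly as follows. This is a simplified sketch of the behaviour described above, not NSX's actual realisation logic. The interface names and the function edges_realising_route are hypothetical, and an empty scope is assumed to mean "no restriction".

```python
def edges_realising_route(scope, interfaces_by_edge):
    """Return the edges on which a scoped static route would be realised.

    Assumption: an empty scope places no restriction, so every edge qualifies.
    Otherwise an edge qualifies only if it owns at least one scoped interface.
    """
    if not scope:
        return sorted(interfaces_by_edge)
    return sorted(
        edge for edge, ifaces in interfaces_by_edge.items()
        if set(scope) & set(ifaces)
    )

# Hypothetical Tier-0 uplink interfaces, one per edge node.
interfaces_by_edge = {
    "edge-1": ["uplink-edge1"],
    "edge-2": ["uplink-edge2"],
}

only_edge2 = edges_realising_route(["uplink-edge2"], interfaces_by_edge)
both = edges_realising_route(["uplink-edge1", "uplink-edge2"], interfaces_by_edge)
unscoped = edges_realising_route([], interfaces_by_edge)
print(only_edge2, both, unscoped)
```

With only the edge 2 interface in scope, edge 1 never receives the route, which matches the failure seen after failover.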



Resolution

Ensure that either no scope is set or that the scope includes an interface on every edge node hosting the Tier-0. As an immediate temporary workaround, fail back to the original active node, which was working before the failover because it has the route realised.
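
The second option, widening the scope, can be illustrated on a plain dictionary. This is a sketch under stated assumptions: the field names (network, next_hops, ip_address, scope) are only loosely modelled on a static route definition and should be checked against the NSX API reference before use. The interface names, addresses, and the function widen_scope are hypothetical. Nothing here is sent to NSX.

```python
import copy

def widen_scope(route, interface_names):
    """Return a copy of `route` whose scoped next hops cover all given interfaces."""
    fixed = copy.deepcopy(route)
    for hop in fixed.get("next_hops", []):
        if hop.get("scope"):  # next hops without a scope are left untouched
            hop["scope"] = sorted(set(hop["scope"]) | set(interface_names))
    return fixed

# Hypothetical static route scoped to the edge 2 interface only.
route = {
    "network": "0.0.0.0/0",
    "next_hops": [{"ip_address": "192.0.2.1", "scope": ["uplink-edge2"]}],
}

fixed = widen_scope(route, ["uplink-edge1", "uplink-edge2"])
print(fixed["next_hops"][0]["scope"])
```

The original dictionary is not modified, so the two definitions can be compared before any change is made in NSX Manager.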
