NSX-T T0 fails to learn T1 routes


Article ID: 320618


Updated On:

Products

VMware NSX

Issue/Introduction

  • A T0 and a T1 gateway are configured.
  • A single Edge cluster with two Edge nodes is associated with both the T0 and the T1.
  • When a segment is created, one Edge has the segment route and the other Edge is missing it.

T0-EDGE-1> show ip route vrf all | include <route>

 

T0-EDGE-2> show ip route vrf all | include <route>
S>* <route>/<mask> [3/0] via ###.##.###.#, downlink-<id>, <time>
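
The comparison above can also be done on saved CLI output. The following is a minimal Python sketch, assuming the output of `show ip route vrf all` from each Edge has been captured as text. The function names and the sample addresses are hypothetical, and the regular expression only assumes that route lines contain an IPv4 prefix, as in the static-route line shown above.

```python
import re

# Matches an IPv4 prefix such as 192.0.2.0/24 anywhere in a line.
PREFIX_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}/\d{1,2})\b")


def extract_prefixes(cli_output):
    """Collect the first IPv4 prefix found on each line of captured route output."""
    prefixes = set()
    for line in cli_output.splitlines():
        match = PREFIX_RE.search(line)
        if match:
            prefixes.add(match.group(1))
    return prefixes


def missing_routes(reference_output, suspect_output):
    """Return prefixes present on the reference Edge but absent on the suspect Edge."""
    return extract_prefixes(reference_output) - extract_prefixes(suspect_output)


# Hypothetical captures: the healthy Edge has the route, the problematic Edge returns nothing.
healthy_edge = "S>* 192.0.2.0/24 [3/0] via 100.64.0.1, downlink-874, 00:10:00\n"
problem_edge = ""

print(sorted(missing_routes(healthy_edge, problem_edge)))
```

Any prefix reported by `missing_routes` is a candidate for the symptom described here; it still needs to be confirmed against the interface state on the problematic Edge.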

  • On the problematic Edge, the interface output shows that the link to the T1 router (the downlink) is in state DOWN:

254: downlink-874@if11: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue master VRF-##### state DOWN group default qlen 1000
    link/ether ##:##:##:##:##:## brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0
    vlan protocol 802.1Q id 189 <REORDER_HDR>
    vrf_slave table 313 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet ###.##.###.#/31 scope global downlink-874
       valid_lft forever preferred_lft forever
    inet6 ####:####:####:####::1/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 ####::##:####:####:####/64 scope link nodad
       valid_lft forever preferred_lft forever
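
To pick out interfaces in this state from a longer capture, the header lines can be filtered. This is a minimal Python sketch, assuming the text follows the `ip addr`-style layout shown above (index, name, flags, then `state <STATE>` on the header line). The function name and the sample values are hypothetical.

```python
import re

# Header line layout: "<index>: <name>[@peer]: <FLAGS> ... state <STATE> ..."
HEADER_RE = re.compile(
    r"^(\d+):\s+([^:@\s]+)(?:@\S+)?:\s+<([^>]*)>.*?\bstate\s+(\S+)"
)


def down_interfaces(ip_output):
    """Return (ifindex, name) pairs for interfaces whose header reports state DOWN."""
    result = []
    for line in ip_output.splitlines():
        match = HEADER_RE.match(line)
        if match and match.group(4) == "DOWN":
            result.append((int(match.group(1)), match.group(2)))
    return result


# Hypothetical capture modelled on the excerpt above.
sample = (
    "254: downlink-874@if11: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue "
    "master VRF-example state DOWN group default qlen 1000\n"
    "    link/ether 00:00:5e:00:53:01 brd ff:ff:ff:ff:ff:ff link-netnsid 0\n"
    "255: uplink-1@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue "
    "master VRF-example state UP group default qlen 1000\n"
)

print(down_interfaces(sample))
```

Indented continuation lines are ignored because only lines that start with an interface index are treated as headers.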

  • In the syslog of the problematic Edge, the same ifindex value (254) appears in entries for different interfaces:

[TIMESTAMP] T0-EDGE-1 NSX 17 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="nestdb" level="INFO"] Handle MONITOR message type 48: NSD_NAMESPACE, update: 30123 bytes, delete: 29995 bytes
...
2023-05-18T01:01:43.161Z T0-EDGE-1 NSX 17 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="netns" level="WARN"] Unset ifindex 254 from uplink-675 254 1200 UP @ plr_sr
[TIMESTAMP] T0-EDGE-1 NSX 17 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="kni-port" level="INFO"] Set sub-interface downlink-874 ifindex to 254
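
The same pattern can be searched for across a whole syslog file. This is a minimal Python sketch, assuming the message texts match the two forms shown above ("Unset ifindex N from NAME" and "Set sub-interface NAME ifindex to N"). The function names are hypothetical and other message variants are not handled.

```python
import re
from collections import defaultdict

# Message forms taken from the syslog excerpt above.
SET_RE = re.compile(r"Set sub-interface (\S+) ifindex to (\d+)")
UNSET_RE = re.compile(r"Unset ifindex (\d+) from (\S+)")


def interfaces_by_ifindex(log_lines):
    """Map each ifindex to the set of interface names it was set on or unset from."""
    seen = defaultdict(set)
    for line in log_lines:
        match = SET_RE.search(line)
        if match:
            seen[int(match.group(2))].add(match.group(1))
            continue
        match = UNSET_RE.search(line)
        if match:
            seen[int(match.group(1))].add(match.group(2))
    return seen


def reused_ifindexes(log_lines):
    """Return only the ifindex values associated with more than one interface name."""
    return {
        ifindex: sorted(names)
        for ifindex, names in interfaces_by_ifindex(log_lines).items()
        if len(names) > 1
    }


log_lines = [
    'level="WARN"] Unset ifindex 254 from uplink-675 254 1200 UP @ plr_sr',
    'level="INFO"] Set sub-interface downlink-874 ifindex to 254',
]

print(reused_ifindexes(log_lines))
```

An ifindex that shows up under both an uplink and a downlink name, as 254 does here, is the signature this article describes.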

  • edge/nsd-namespace has a stale entry that uses the same ifindex:

{
    "name": "uplink-<id>",
    "type": "VLAN",
    "if_id": <id>,
    "subnet": [
        {
            "ipv6": [
                "####::###:####:####:####"
            ],
            "prefix_length": <prefix>
        }
    ],
    "mtu": <mtu>,
    "admin_status": true,
    "vlan_id": <id>
},
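
If the interface entries have been exported and parsed, a stale entry of this kind can be found by looking for an `if_id` that is held by an interface other than the one currently using it. This is a minimal Python sketch under two assumptions that the article does not confirm: that the entries are available as a list of dictionaries with `name` and `if_id` keys, and that `if_id` corresponds to the ifindex seen in the syslog. The function name and the sample data are hypothetical.

```python
def stale_entries(entries, ifindex, active_name):
    """Return names of entries that hold `ifindex` but are not the active interface."""
    return [
        entry.get("name")
        for entry in entries
        if entry.get("if_id") == ifindex and entry.get("name") != active_name
    ]


# Hypothetical parsed entries; names and numbers are placeholders, not real values.
entries = [
    {"name": "uplink-A", "type": "VLAN", "if_id": 10},
    {"name": "downlink-B", "if_id": 10},
    {"name": "uplink-C", "type": "VLAN", "if_id": 11},
]

print(stale_entries(entries, 10, "downlink-B"))
```

In this placeholder data, `uplink-A` still holds the `if_id` that `downlink-B` is using, which mirrors the stale uplink entry shown above.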



Environment

VMware NSX-T Data Center 3.2.x

Cause

The Edge holds a stale entry for an ifindex that is also assigned to the T1 downlink. As a result, the T1 router link state stays down on that Edge and the route is not advertised.

Resolution

This issue is resolved in NSX-T 3.2.3.x.

To recover an affected Edge, place the Edge node in NSX Maintenance Mode, then exit Maintenance Mode. Do this on one Edge node at a time so that failover between the Edge nodes can complete properly.
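
The one-at-a-time ordering can be sketched as follows. This is a minimal Python illustration of the sequencing only, not an NSX integration: `enter_mm`, `exit_mm` and `wait_until_healthy` are hypothetical callbacks that would wrap whatever method is used to toggle Maintenance Mode and to confirm that the Edge node is healthy again.

```python
def rolling_maintenance_cycle(edge_nodes, enter_mm, exit_mm, wait_until_healthy):
    """Cycle Maintenance Mode on each Edge node in turn.

    The next node is only touched after the previous one reports healthy,
    so at most one Edge node is out of service at any time.
    """
    completed = []
    for node in edge_nodes:
        enter_mm(node)
        exit_mm(node)
        if not wait_until_healthy(node):
            # Stop here rather than risk taking a second Edge node down.
            raise RuntimeError(f"{node} did not recover; remaining nodes untouched")
        completed.append(node)
    return completed


# Hypothetical callbacks that only record what would be done.
events = []
done = rolling_maintenance_cycle(
    ["edge-1", "edge-2"],
    enter_mm=lambda node: events.append(("enter", node)),
    exit_mm=lambda node: events.append(("exit", node)),
    wait_until_healthy=lambda node: True,
)

print(done)
print(events)
```

Raising on the first node that fails to recover is a deliberate choice in this sketch: it keeps the second Edge node in service while the first is investigated.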