Traffic interruption observed after NSX Tier-0-Gateway HA VIP failover from active to standby edge node

Article ID: 414391

Products

VMware NSX

Issue/Introduction

  • Traffic is interrupted after the NSX Tier-0 gateway HA VIP fails over from the active to the standby edge node.
  • Traffic flow resumes automatically in approximately 20 minutes or less.
  • Logs on the active edge node (now standby) show the HA VIP being removed when failover is initiated:
    tail -f /var/log/syslog | grep <HA VIP>
    <TIMESTAMP> <EDGE NODE NAME> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="lrport" level="INFO"] Del <HA VIP/CIDR> type 4 to lport <LOGICAL PORT UUID>
    <TIMESTAMP> <EDGE NODE NAME> NSX 11238 - [nsx@6876 comp="nsx-edge" subcomp="nsd" tid="11238" level="INFO"] Detected IP <HA VIP/CIDR> deletion for interface uplink-262
    <TIMESTAMP> <EDGE NODE NAME> NSX 11238 - [nsx@6876 comp="nsx-edge" subcomp="nsd" tid="11238" level="INFO"] Removed EdgeSvcBinding Id <UUID> for IP <HA VIP/CIDR> on uplink-262, IP ref count is 0
    <TIMESTAMP> <EDGE NODE NAME> NSX 11238 - [nsx@6876 comp="nsx-edge" subcomp="nsd" tid="11238" level="INFO"] Remove interface uplink-262 IP <HA VIP/CIDR>
    
  • Logs on the standby edge node (now active) show the HA VIP being added and Gratuitous ARP (GARP) messages being sent:
    tail -f /var/log/syslog | grep <HA VIP>
    <TIMESTAMP> <EDGE NODE NAME> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="ha-cluster" level="INFO"] HA port <HA PORT UUID> IP <HA VIP/CIDR> type 4
    <TIMESTAMP> <EDGE NODE NAME> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="lrport" level="INFO"] Add <HA VIP/CIDR> type 4 to lport <HA PORT UUID>
    <TIMESTAMP> <EDGE NODE NAME> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="lrouter" level="INFO"] Add cpu-port route for <HA VIP/CIDR> on <HA PORT UUID>
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="dpc-pb" tname="dp-ipc19" level="INFO"] Add IP <HA VIP/CIDR> to lrouter port <HA PORT UUID>
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-ipc19" level="INFO"] announce type neigh entry (<HA PORT UUID>, <HA VIP>) with <HA PORT UUID> is created, dad_state T, prefix_len 0
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="dpc-pb" tname="dp-ipc19" level="INFO"] Update lrouter <LOGICAL ROUTER UUID>'s FIB entry for <HA VIP/CIDR>
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] soliciting #1 (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #1, announcing (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 10988 - [nsx@6876 comp="nsx-edge" subcomp="nsd" tid="10988" level="INFO"] Added EdgeSvcBinding Id <EDGESVCBINDING UUID> to IP <HA VIP/CIDR> for interface uplink-275, IP ref count is 1
    <TIMESTAMP> <EDGE NODE NAME> NSX 10988 - [nsx@6876 comp="nsx-edge" subcomp="nsd" tid="10988" level="INFO"] Add interface uplink-275 IP <HA VIP/CIDR>
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #2, announcing (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #3, announcing (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #4, announcing (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #5, announcing (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #6, announcing (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #7, announcing (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #8, announcing (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #9, announcing (<HA PORT UUID>, <HA VIP>)
    <TIMESTAMP> <EDGE NODE NAME> NSX 11320 SWITCHING [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="neigh" tname="dp-learning3" level="INFO"] retry #10, announcing (<HA PORT UUID>, <HA VIP>)
  • Packet captures on the edge host's vmnic for the active edge node (now standby) show GARP messages being sent at failover:
    pktcap-uw --uplink <VMNICX> --capture UplinkSndKernel -o - | tcpdump-uw -enr - | grep <HA VIP>

    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has <HA VIP> (ff:ff:ff:ff:ff:ff) tell 0.0.0.0, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
  • Packet captures on the edge host's vmnic for the standby edge node (now active) show GARP messages being sent at failover:
    pktcap-uw --uplink <VMNICX> --capture UplinkSndKernel -o - | tcpdump-uw -enr - | grep <HA VIP>

    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Request who-has <HA VIP> (ff:ff:ff:ff:ff:ff) tell 0.0.0.0, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
    <TIMESTAMP> <HA VIP INTERFACE MAC> > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Reply <HA VIP> is-at <HA VIP INTERFACE MAC>, length 46
  • Packet captures on the upstream device show that the GARP messages are received, but the L3 device's ARP cache is not updated at that time.
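
When comparing captures from the edge hosts and the upstream device, it can help to count the ARP frames for the HA VIP rather than read them line by line. The following is a minimal sketch, assuming the capture was saved as text in the tcpdump -e format shown above. The function name classify_garp and the file name in the usage note are hypothetical and are not part of NSX or ESXi.

```shell
#!/bin/sh
# Sketch only: classify_garp is a hypothetical helper, not an NSX or ESXi tool.
# Reads tcpdump -e text output on stdin and counts, for one VIP:
#   probes       - ARP requests "who-has <VIP> ... tell 0.0.0.0"
#   garp_replies - broadcast ARP replies "Reply <VIP> is-at <MAC>"
classify_garp() {
    awk -v vip="$1" '
        index($0, "Request who-has " vip " ") && index($0, "tell 0.0.0.0") {
            probes++
            next
        }
        index($0, "> ff:ff:ff:ff:ff:ff") && index($0, "Reply " vip " is-at ") {
            replies++
        }
        END { printf "probes=%d garp_replies=%d\n", probes, replies }
    '
}
```

For example, classify_garp 192.0.2.10 < capture.txt prints a single summary line for a capture saved to capture.txt (192.0.2.10 is a placeholder address). A non-zero garp_replies count on the upstream side, combined with an unchanged ARP entry, matches the symptom described above.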

Environment

VMware NSX-T Data Center

VMware NSX

Cause

The upstream L3 device is configured to ignore incoming GARP replies. This is a security feature that numerous vendors enable by default.

When the edge HA VIP fails over, the GARP from the new active edge node is ignored by the upstream L3 device, so the upstream device's ARP cache is not updated and still maps the HA VIP to the previous edge node. Traffic flow resumes automatically once the upstream device's stored ARP cache entry for the HA VIP times out, which takes approximately 20 minutes or less.
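
The "20 minutes or less" figure can be illustrated with simple arithmetic. The sketch below assumes a simplified model in which the upstream device keeps the stale HA VIP entry until a fixed ARP timeout elapses. The function name remaining_outage is hypothetical, and the 1200-second timeout used in the usage note is an assumption inferred from the 20-minute figure; check the actual timeout on the upstream device.

```shell
#!/bin/sh
# Sketch only: remaining_outage is a hypothetical helper for illustration.
# Simplified model: the upstream device keeps the stale HA VIP entry until
# its ARP timeout elapses, and the GARP from the new active edge is ignored.
remaining_outage() {
    arp_timeout="$1"   # upstream ARP cache timeout in seconds (assumed value)
    entry_age="$2"     # age of the HA VIP entry at failover, in seconds
    remaining=$(( arp_timeout - entry_age ))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    echo "$remaining"
}
```

With an assumed 1200-second timeout, remaining_outage 1200 0 prints 1200 (the entry was refreshed just before failover, so the outage lasts the full timeout), while remaining_outage 1200 900 prints 300. This is why the observed interruption varies between failovers.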

Resolution

This is a known issue impacting VMware NSX.

 

Workaround

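It may be possible to disable this security feature on the upstream L3 device interfaces that face the edge nodes, so that the device accepts GARP replies and updates its ARP cache at failover.

NB: Refer to the relevant L3 device vendor's documentation for further information.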

Additional Information