BGP IPv6 sessions do not get established after NSX-T Edge Node fail over

Article ID: 317486


Products

VMware NSX

Issue/Introduction

  • This issue can affect any version of NSX-T.
  • BGP IPv6 sessions are failing to get established on the NSX-T Edge Nodes.
  • You have recently performed a fail over from an active NSX-T Edge Node to the standby NSX-T Edge Node.
  • You may have noticed network disruptions between the Tier-0 Gateway and the physical network fabric.
  • In the NSX-T Edge Node log file /var/log/syslog.log you may observe the MAC address associated with an IP address changing:
NsxEdge NSX 7444 - [nsx@6876 comp="nsx-edge" subcomp="nsd" tid="7444" level="INFO"] Successfully added arp for ####:####:####:####::# via 00:50:56:##:##:##, dev: br-######, type: 1
NsxEdge NSX 7444 - [nsx@6876 comp="nsx-edge" subcomp="nsd" tid="7444" level="INFO"] Successfully added arp for ####:####:####:####::# via 00:50:56:##:##:##, dev: br-######, type: 1
  • In the NSX Manager support bundle dump file /controller/adaptor-ufo/adaptor_ufo_dump you may see that Trust On First Use (TOFU) is enabled:
ip_discovery_switching_upm_profile {
   managed_resource {
   }
   dhcp_snooping_enabled: true
   arp_snooping_enabled: true
   vm_tools_enabled: true
   arp_bindings_limit: 1
   dhcp_v6_snooping_enabled: true
   nd_snooping_enabled: true
   nd_bindings_limit: 3
   vm_tools_v6_enabled: true
   trust_on_first_use_enabled: true <=================
   arp_nd_binding_timeout: 20
   l2_mp_profile_id: "########-####-####-####-############"
   profile_type: PROFILE_TYPE_IP_DISCOVERY_SWITCHING_UPM_PROFILE
 }
 
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
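
The syslog check described above can be sketched in Python. This is a minimal example, not a supported tool: it assumes the message format matches the excerpt shown earlier, and the function name find_moved_addresses and the sample addresses are hypothetical. It reports every IP address that was added against more than one MAC address.

```python
import re
from collections import defaultdict

# Matches the nsd message from the excerpt:
#   "Successfully added arp for <ip> via <mac>, dev: ..."
ARP_ADDED = re.compile(
    r"Successfully added arp for (?P<ip>\S+) via "
    r"(?P<mac>[0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})"
)


def find_moved_addresses(lines):
    """Return {ip: [mac, ...]} for each IP seen with more than one MAC."""
    macs_by_ip = defaultdict(list)
    for line in lines:
        match = ARP_ADDED.search(line)
        if match is None:
            continue
        ip = match.group("ip")
        mac = match.group("mac").lower()
        if mac not in macs_by_ip[ip]:
            macs_by_ip[ip].append(mac)  # keep first-seen order
    return {ip: macs for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Illustrative lines only; the addresses are made up for this sketch.
PREFIX = ('NsxEdge NSX 7444 - [nsx@6876 comp="nsx-edge" subcomp="nsd" '
          'tid="7444" level="INFO"] ')
sample_lines = [
    PREFIX + "Successfully added arp for 2001:db8::1 via 00:50:56:aa:bb:01, dev: br-example, type: 1",
    PREFIX + "Successfully added arp for 2001:db8::1 via 00:50:56:aa:bb:02, dev: br-example, type: 1",
    PREFIX + "Successfully added arp for 2001:db8::2 via 00:50:56:aa:bb:03, dev: br-example, type: 1",
]

moved = find_moved_addresses(sample_lines)
for ip, macs in moved.items():
    print(f"{ip} moved across MACs: {' -> '.join(macs)}")
```

To run it against a real log, pass an open file handle for /var/log/syslog.log to find_moved_addresses instead of the sample list.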


Environment

VMware NSX

Cause

Either of two issues can prevent the BGP IPv6 sessions from being established after an NSX-T Edge Node fail over:
  1. The IP address moved from the active to the standby NSX-T Edge Node during the fail over. Because TOFU is enabled, the binding that was learned first is not updated when the address moves, so communication is blocked.
  2. The Neighbor Discovery (ND) binding limit was set lower than the number of ND bindings required.
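
As a rough aid for telling the two causes apart, the profile block from adaptor_ufo_dump can be checked with a short Python sketch. The helper names parse_profile_fields and likely_causes are hypothetical, and required_nd_bindings is a value you must supply for your environment. The sketch assumes a single profile block in the "key: value" layout shown earlier.

```python
import re

# One "key: value" pair per line, as in the dump excerpt.
FIELD = re.compile(r"^\s*(?P<key>\w+):\s*(?P<value>\S+)")


def parse_profile_fields(text):
    """Collect 'key: value' pairs from one dumped profile block."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            fields[match.group("key")] = match.group("value").strip('"')
    return fields


def likely_causes(fields, required_nd_bindings):
    """List which of the two causes the profile settings point to."""
    causes = []
    if fields.get("trust_on_first_use_enabled") == "true":
        causes.append("TOFU is enabled")
    limit = fields.get("nd_bindings_limit")
    if limit is not None and int(limit) < required_nd_bindings:
        causes.append(
            f"nd_bindings_limit {limit} is below the required {required_nd_bindings}"
        )
    return causes


# Trimmed copy of the excerpt from this article.
dump_text = """
ip_discovery_switching_upm_profile {
   managed_resource {
   }
   nd_snooping_enabled: true
   nd_bindings_limit: 3
   trust_on_first_use_enabled: true <=================
   arp_nd_binding_timeout: 20
 }
"""

fields = parse_profile_fields(dump_text)
# 4 is an example requirement, not a recommended value.
for cause in likely_causes(fields, required_nd_bindings=4):
    print(cause)
```

The result is only a hint about where to look; confirm the finding against the IP discovery profile applied to the Edge segment before changing it.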

Resolution


Workaround:
Two workarounds are available. Apply the one that matches the cause of the issue:
  1. If the issue was caused by TOFU being enabled, disable TOFU in the IP discovery profile.
  2. If the ND binding limit is being exceeded, increase the ND binding limit in the IP discovery profile.
See the NSX documentation for how to modify the IP discovery profile.