NSX BFD tunnels are down between hosts on opposite sides of a stretched cluster

Article ID: 426874

Products

VMware NSX

Issue/Introduction

  • There is a stretched cluster with some hosts on physical site A and others on physical site B.
  • Different subnets and IP address pools are used to assign IP addresses to the VTEP interfaces at each site. For example, the following might be configured:
    Site Name    VTEP IP Pool Name    Subnet Range
    Site-A       Site-A-Vtep-Pool     10.101.56.X
    Site-B       Site-B-Vtep-Pool     10.101.60.X
  • Hosts are reported in a degraded state because BFD tunnels are down.
  • The troubleshooting steps from KB 379112 have been carried out, with the following results:
    • From site A, all tunnels reported as down are tunnels to hosts in site B, and vice versa.
    • The vmkping tests between the VTEP interfaces report 100% packet loss.
    • The BFD diagnostic code reported is '0 - No Diagnostic'.
  • The VTEP interfaces on the hosts in either site A or site B have been assigned incorrect IP addresses.
  • For example, hosts in site B are incorrectly assigned IP addresses from the subnet 10.101.56.X, which belongs to the IP pool 'Site-A-Vtep-Pool'.
  • This issue might have been preceded by a migration from DHCP to IP pools for VTEP interface IP address assignment.
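
The subnet mismatch described above can be spotted with a quick offline check. The following is a minimal sketch, assuming each site's pool is a /24 (inferred from the "10.101.56.X" notation) and using a hypothetical host inventory; the host names and addresses are illustrative only and are not taken from any NSX output.

```python
import ipaddress

# Assumption: each site's VTEP pool covers a /24, inferred from "10.101.56.X".
SITE_SUBNETS = {
    "Site-A": ipaddress.ip_network("10.101.56.0/24"),
    "Site-B": ipaddress.ip_network("10.101.60.0/24"),
}

# Hypothetical inventory: host name -> (site the host lives in, VTEP IP).
HOST_VTEPS = {
    "esx-a1": ("Site-A", "10.101.56.11"),
    "esx-b1": ("Site-B", "10.101.56.12"),  # address from the Site-A range
    "esx-b2": ("Site-B", "10.101.60.13"),
}


def find_misassigned_vteps(host_vteps):
    """Return (host, site, ip) tuples whose VTEP IP is outside the site's subnet."""
    misassigned = []
    for host, (site, ip) in sorted(host_vteps.items()):
        if ipaddress.ip_address(ip) not in SITE_SUBNETS[site]:
            misassigned.append((host, site, ip))
    return misassigned


if __name__ == "__main__":
    for host, site, ip in find_misassigned_vteps(HOST_VTEPS):
        print(f"{host} ({site}) has VTEP {ip}, outside {SITE_SUBNETS[site]}")
```

Any host flagged by a check like this is a candidate for the wrong-pool assignment covered in the Cause section.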

Environment

VMware NSX

Cause

  • The Transport Node Profile (TNP) or Sub-Transport Node Profile (Sub-TNP) assigned to the cluster in site A or site B is configured to reference the wrong IP address pool.
  • For example, the TNP assigned to the Site-A hosts correctly uses the IP pool 'Site-A-Vtep-Pool', but the Sub-TNP assigned to the Site-B hosts is incorrectly configured to use 'Site-A-Vtep-Pool' as well.
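
The misconfiguration in this example can be expressed as a comparison between the pool each profile references and the pool it is expected to reference. The sketch below uses a simplified, hypothetical dictionary layout (the keys `host_switches`, `ip_pool`, and `sub_profiles`, and the profile names, are assumptions for illustration and do not mirror the NSX API schema).

```python
# Hypothetical, simplified view of a TNP with one Sub-TNP.
TNP = {
    "name": "Site-A-TNP",
    "host_switches": [
        {
            "ip_pool": "Site-A-Vtep-Pool",
            "sub_profiles": [
                # Misconfigured: the Site-B Sub-TNP points at the Site-A pool.
                {"name": "Site-B-Sub-TNP", "ip_pool": "Site-A-Vtep-Pool"},
            ],
        }
    ],
}

# Pool each profile is expected to reference.
EXPECTED_POOLS = {
    "Site-A-TNP": "Site-A-Vtep-Pool",
    "Site-B-Sub-TNP": "Site-B-Vtep-Pool",
}


def pool_references(tnp):
    """Yield (profile_name, pool_name) for the TNP and each of its Sub-TNPs."""
    for switch in tnp["host_switches"]:
        yield tnp["name"], switch["ip_pool"]
        for sub in switch.get("sub_profiles", []):
            yield sub["name"], sub["ip_pool"]


def find_wrong_pools(tnp, expected):
    """Return (profile, configured_pool, expected_pool) for each mismatch."""
    return [
        (name, pool, expected[name])
        for name, pool in pool_references(tnp)
        if name in expected and pool != expected[name]
    ]


if __name__ == "__main__":
    for name, configured, wanted in find_wrong_pools(TNP, EXPECTED_POOLS):
        print(f"{name}: references {configured}, expected {wanted}")
```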

Resolution

  • Reconfigure the TNP or Sub-TNP assigned to the cluster to reference the correct VTEP IP pool. For example, if the issue is at 'Site-B', reconfigure its TNP or Sub-TNP to reference the IP pool 'Site-B-Vtep-Pool'.
  • The TNP can be edited using the following steps:
    1. In the NSX Manager UI, go to System -> Fabric -> Hosts -> Transport Node Profile.
    2. Filter for the name of the TNP assigned to the cluster.
    3. Click on the 'Host Switch' associated with that TNP.
    4. Click on the three vertical dots to the left of the Virtual Distributed Switch entry listed in the newly opened window.
    5. Find the 'IPv4 Pool' field and click the drop-down menu in the options field.
    6. Select the correct VTEP IP pool, then click 'Apply' and then 'Save'.
  • The steps for a Sub-TNP are the same as far as step 4. After that, the following steps need to be carried out:
    1. Expand the 'SUB-TRANSPORT NODE PROFILE' listed.
    2. Click on the number listed under the Sub-TNP, which will open a new window.
    3. Click on the three vertical dots next to the Sub-TNP which needs to be reconfigured and select 'Edit'.
    4. Find the 'IPv4 Pool' field and click the drop-down menu in the options field.
    5. Select the correct VTEP IP pool, then click 'Apply' and then 'Save'.
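
After the pool reference has been corrected, the vmkping test mentioned in the Issue section can be repeated to see whether the 100% packet loss has cleared. The following is a minimal sketch for reading the loss figure from captured ping output. It assumes the summary line follows the common "N packets transmitted, M packets received, P% packet loss" form; the sample strings are illustrative and were not captured from a real host.

```python
import re

# Assumed summary-line format; adjust the pattern if the actual output differs.
LOSS_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) packets received, "
    r"(\d+(?:\.\d+)?)% packet loss"
)

# Illustrative samples only, not real captures.
SAMPLE_BEFORE = "3 packets transmitted, 0 packets received, 100% packet loss"
SAMPLE_AFTER = "3 packets transmitted, 3 packets received, 0% packet loss"


def packet_loss_percent(ping_output):
    """Extract the packet-loss percentage from a ping summary line."""
    match = LOSS_RE.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    return float(match.group(3))


if __name__ == "__main__":
    for label, text in (("before", SAMPLE_BEFORE), ("after", SAMPLE_AFTER)):
        print(f"{label}: {packet_loss_percent(text)}% loss")
```

A loss figure that stays at 100% after the change suggests the host is still using an address from the wrong pool, or that a different problem is involved.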

Additional Information