eBGP between DLR and Edge may experience a routing loop after upgrading to NSX-v 6.3.5
Article ID: 320889
Updated On:
Products
VMware NSX Data Center for vSphere
Issue/Introduction
This article applies if the NSX BGP configuration satisfies all of the following conditions:
A private Autonomous System (AS) range (64512 to 65535) is used to establish BGP peering within the network domain.
The Distributed Logical Router (DLR), Edge Services Gateways (ESGs), and upstream routers are configured to peer via external BGP (eBGP) using private AS numbers.
Equal-Cost Multi-Path (ECMP) routing is utilized between the DLR and ESGs to provide path redundancy and improved throughput.
The environment is running NSX for vSphere version 6.3.5.
When all of the above conditions are met, routing loops can occur in the NSX domain, with the following symptoms:
When you run the show ip bgp command on the DLR or an ESG, routes learned from eBGP neighbors do not display the full AS path.
On one or more ESGs, northbound routes point to the DLR as the next hop.
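The second symptom can be pictured as a next-hop chain that returns to a device it has already visited. The following is a minimal sketch of that idea, not an NSX tool: the device names (dlr, esg-1, upstream) and the next-hop maps are hypothetical and would in practice be read from each device's routing table for a single northbound prefix.

```python
def find_forwarding_loop(next_hops, start):
    """Follow next hops from `start`; return the looping path, or None."""
    path = [start]
    seen = {start}
    node = start
    while node in next_hops:
        node = next_hops[node]
        path.append(node)
        if node in seen:
            # Return only the part of the path that forms the cycle.
            return path[path.index(node):]
        seen.add(node)
    return None


# Hypothetical next hops for one northbound prefix.
healthy = {"dlr": "esg-1", "esg-1": "upstream"}
# The ESG points back to the DLR, as described in the symptom above.
looping = {"dlr": "esg-1", "esg-1": "dlr"}

print(find_forwarding_loop(healthy, "dlr"))
print(find_forwarding_loop(looping, "dlr"))
```

In the healthy map the chain ends at the upstream router, so no loop is reported. In the looping map the traffic bounces between the DLR and the ESG.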
Environment
VMware NSX Data Center for vSphere
Cause
Two behaviors combine to cause this issue:
In NSX for vSphere 6.3.5 and earlier, the ESG/DLR strips private AS numbers from the AS path before advertising routes to eBGP peers, so AS-path information is lost.
In NSX for vSphere 6.3.5, send-side loop detection is disabled for BGP. Combined with the stripped AS path, this can cause a routing loop.
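The interaction between the two behaviors can be illustrated with a small model. This is a simplified sketch under stated assumptions, not the NSX implementation: the AS numbers are hypothetical, private AS numbers are assumed to be removed from the received path before the local AS is prepended, and loop detection is reduced to a membership test on the AS path.

```python
PRIVATE_AS = range(64512, 65536)  # private range as quoted in this article

# Hypothetical private AS numbers for the three roles.
UPSTREAM_AS, ESG_AS, DLR_AS = 65001, 65002, 65003


def advertise(received_path, local_as, strip_private):
    """Return the AS path a router sends to an eBGP peer (simplified)."""
    path = list(received_path)
    if strip_private:
        # Assumption: private ASNs are stripped before the local AS is prepended.
        path = [asn for asn in path if asn not in PRIVATE_AS]
    return [local_as] + path


def send_side_allows(received_path, peer_as):
    """Send-side loop detection: do not send a route back to an AS already in its path."""
    return peer_as not in received_path


def receive_side_accepts(advertised_path, local_as):
    """Receive-side loop detection: reject a route whose path contains the local AS."""
    return local_as not in advertised_path


def esg_accepts_route_back(strip_private, send_side_check):
    """Does an ESG accept an upstream route that the DLR advertises back to it?"""
    at_esg = advertise([], UPSTREAM_AS, strip_private)
    at_dlr = advertise(at_esg, ESG_AS, strip_private)
    if send_side_check and not send_side_allows(at_dlr, ESG_AS):
        return False  # the DLR never sends the route back
    back_to_esg = advertise(at_dlr, DLR_AS, strip_private)
    return receive_side_accepts(back_to_esg, ESG_AS)


print(esg_accepts_route_back(strip_private=True, send_side_check=False))
print(esg_accepts_route_back(strip_private=True, send_side_check=True))
print(esg_accepts_route_back(strip_private=False, send_side_check=False))
```

In this model the route is accepted back only when stripping is enabled and the send-side check is disabled. Either a full AS path or send-side loop detection is enough to stop it.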
Resolution
This issue is resolved in:
VMware NSX for vSphere 6.3.6.
VMware NSX for vSphere 6.4.0.
Workaround: Do one of the following:
On the DLR, advertise only NSX internal routes to upstream neighbors. To do this, implement outbound filters on the DLR towards the ESGs that permit only the internal networks.
OR
Configure iBGP between the DLR and ESGs.
Note: For the second workaround, add a default gateway on the DLR, or a route to the network between the ESG and the upstream router, with the ESG as the next hop. Alternatively, implement default information originate on all of the ESGs. If you implement default information originate on the ESGs, apply the appropriate filters to block the propagation of the default route on the upstream router.
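The logic of the first workaround, an outbound filter that permits only internal networks, can be sketched as follows. This models the filter decision only and is not NSX configuration syntax: the INTERNAL_NETWORKS value and the sample prefixes are hypothetical placeholders for the logical networks behind the DLR.

```python
import ipaddress

# Hypothetical internal NSX networks that the DLR is allowed to advertise.
INTERNAL_NETWORKS = [ipaddress.ip_network("172.16.0.0/16")]


def permit_outbound(prefix):
    """Permit a prefix only if it falls inside one of the internal networks."""
    net = ipaddress.ip_network(prefix)
    return any(
        net.version == allowed.version and net.subnet_of(allowed)
        for allowed in INTERNAL_NETWORKS
    )


def filter_advertisements(prefixes):
    """Return the prefixes the DLR would advertise towards the ESGs."""
    return [p for p in prefixes if permit_outbound(p)]


candidates = ["172.16.10.0/24", "0.0.0.0/0", "10.1.1.0/24"]
print(filter_advertisements(candidates))
```

Because the default route and other externally learned prefixes are denied, the DLR cannot re-advertise an upstream route back to an ESG, which removes the path that forms the loop.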
Additional Information
Impact/Risks: Routing loops can occur within the BGP domain, which can cause North/South connectivity issues.