Multi-site NSX deployments typically require BGP for external connectivity due to the complexity and scale of enterprise networks using VMware NSX. BGP is the standard routing protocol for NSX implementations, providing the inter-domain routing intelligence and flexibility needed for enterprise network architectures.
Even in deployments where multiple ISPs are not immediately required, implementing BGP from the start prepares the infrastructure for future growth and changing connectivity needs. Configuring both eBGP and iBGP establishes the foundation for multi-ISP architecture, ensuring the network can scale without requiring complete routing redesign.
A small minority of NSX deployments use OSPF with static routing for external connectivity. This occurs in limited scenarios where all sites connect through a single ISP with unified routing, and typically represents infrastructures that have not yet encountered the need for multi-ISP connectivity.
This article provides architectural guidance for routing protocol selection in multi-site NSX environments, with primary focus on BGP implementations that serve the majority of deployments.
BGP Architecture (Standard Deployment):
BGP is standard for NSX deployments because it provides:
NSX customers typically operate sophisticated network environments requiring:
OSPF with Static Routing (Small Minority of Deployments):
A small minority of NSX deployments use OSPF for inter-site routing with static routes for external connectivity. This occurs only when:
This architecture exists in infrastructures that have not yet encountered the need for multi-ISP connectivity, but most NSX implementations benefit from BGP from initial deployment.
Preparation for Growth:
Implementing BGP with both eBGP and iBGP from initial deployment prepares the network for future requirements. Even if multiple ISPs are not immediately needed, the architecture supports adding upstream providers without redesign. As organizations expand to new sites or regions, ISP options vary based on availability, cost, and performance. Starting with BGP avoids costly rearchitecture later.
Content Delivery and Cloud Integration:
Modern enterprise applications rely heavily on content delivery networks, SaaS platforms, and public cloud services. These providers announce their prefixes through BGP with specific peering requirements and path preferences. Optimal access requires BGP intelligence to understand how different paths reach these destinations.
NSX Customer Profile:
Organizations deploying NSX typically have:
The deployment of NSX itself indicates network complexity that benefits from BGP capabilities.
The architectural requirements for NSX deployments typically necessitate BGP due to the scale, complexity, and growth trajectory of enterprise networks.
Future-Proofing Architecture: Even when multiple ISPs are not immediately required, BGP provides the foundation for scaling without rearchitecture. Configuring eBGP and iBGP establishes the routing framework that accommodates growth. As organizations expand geographically or add sites, different ISP options become available or necessary in each region. BGP architecture handles this evolution seamlessly.
Inter-Domain Routing Requirements: BGP is specifically designed as an Exterior Gateway Protocol for routing between autonomous systems. It provides destination-specific routing intelligence that adapts to diverse upstream connectivity. Each ISP has distinct peering arrangements, transit relationships, and AS paths to reach internet destinations. BGP understands and leverages these differences.
Content Delivery Network Integration: Content delivery networks, including the major providers, announce their prefixes with specific BGP policies. Different upstream providers have negotiated different peering agreements with these content networks, resulting in varying path quality and performance. BGP provides the intelligence to select optimal paths based on destination requirements.
Multi-Site Architectural Considerations: In multi-site deployments with separated external routers at each datacenter, NSX Tier-0 Gateway HA mode selection depends on routing architecture. Each NSX Tier-0 Gateway has a single AS number. When the same prefixes are announced under that AS number from multiple sites to different upstream routers, asymmetric routing can result: stateful traffic exits through one path and returns through another.
For deployments requiring Active-Standby mode across datacenters, BGP features including AS path prepending enable control over which site is preferred for specific routes. The Failover Domain feature allows designation of primary and secondary datacenters, ensuring that if the active Edge in the primary datacenter fails, another Edge in the same datacenter takes over before failing over to the secondary site.
OSPF Limitations: OSPF is an Interior Gateway Protocol designed for routing within a single administrative domain. While effective for campus networks and data center fabrics, OSPF cannot:
OSPF and BGP often work together in network architectures. OSPF typically handles internal routing within datacenters between switches and routers, while BGP handles external connectivity when traffic needs to travel to other organizations, MPLS networks, or the internet. This complementary use allows each protocol to operate in its optimal domain.
However, for external connectivity and multi-site architectures with diverse upstream providers, BGP is required. OSPF's convergence speed advantage within a datacenter does not offset its fundamental inability to handle inter-domain routing requirements.
Static routes with fixed next-hop information lack any awareness of destination reachability characteristics or upstream path diversity.
Growth Constraints: OSPF with static routing cannot accommodate future multi-ISP requirements without complete routing rearchitecture. Migration from OSPF/static to BGP requires significant planning, implementation effort, and potential service disruption.
Upstream Dependency: The organization relies completely on a single provider's routing decisions and has no autonomy over external path selection. It cannot influence or optimize paths to specific destinations based on performance or cost requirements.
Geographic Expansion: As organizations add sites in different regions, ISP availability and cost-effectiveness vary by location. Natural expansion drives diverse upstream connectivity requirements that OSPF cannot support.
Select the routing architecture based on deployment requirements, with BGP being the standard for NSX implementations.
Implement BGP at physical edge routers for upstream connectivity. Use iBGP between sites to share routing information. Integrate NSX Tier-0 Gateways with BGP for intelligent external path selection.
This represents the standard architecture for NSX deployments and should be the default choice for new implementations.
Active-Active vs Active-Standby Mode:
When deploying Tier-0 Gateways across multiple sites with separate external routers at each datacenter, the HA mode selection impacts routing architecture.
In Active-Active mode, all Edge nodes in the cluster actively forward traffic. However, each NSX Tier-0 Gateway has only one AS number. If the same prefixes are announced under that AS number from multiple sites to different ToR/Leaf switches, those switches learn equivalent routes from different points in the network simultaneously. This can create asymmetric routing where stateful traffic exits through one route and returns through another, causing connection failures.
One solution is hosting all Edges for a Tier-0 Gateway on a single site connected to the same ToR switches, but this reduces datacenter redundancy.
In Active-Standby mode across sites, only one Edge has the active Tier-0 Service Router at a time. This prevents asymmetric routing issues in multi-site deployments with separated external routers. BGP features enable control over which site is active for specific Tier-0 Gateways.
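The asymmetric-routing failure described for Active-Active mode can be illustrated with a toy model. The sketch below assumes each Edge keeps its own independent session table, which is a simplification; the class name, method names, and addresses are hypothetical and do not correspond to any NSX API.

```python
class StatefulEdge:
    """Toy model of an Edge that tracks connection state locally."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()  # (inside_host, outside_host) pairs seen outbound

    def outbound(self, src, dst):
        # Record state when a workload opens a connection through this Edge.
        self.sessions.add((src, dst))

    def inbound(self, src, dst):
        # Return traffic is accepted only if this Edge saw the outbound packet.
        return (dst, src) in self.sessions


edge_dc1 = StatefulEdge("edge-dc1")
edge_dc2 = StatefulEdge("edge-dc2")

# The workload's connection leaves through datacenter 1.
edge_dc1.outbound("10.0.0.5", "198.51.100.7")

symmetric_ok = edge_dc1.inbound("198.51.100.7", "10.0.0.5")   # reply returns via DC1
asymmetric_ok = edge_dc2.inbound("198.51.100.7", "10.0.0.5")  # reply returns via DC2

print(symmetric_ok, asymmetric_ok)  # True False
```

Keeping a single active Tier-0 Service Router, as Active-Standby mode does, removes the second session table from the picture.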
Failover Domain Configuration:
For Edge VM deployments in Active-Standby mode, the Failover Domain feature allows designation of primary and secondary datacenters. Multiple Edge VMs can exist in the same Edge cluster at each datacenter. When the active Edge VM in the primary datacenter fails, another Edge in the same datacenter takes over before failing over to the secondary site.
Failover Domain configuration requires API calls to:
This ensures proper failover behavior where failure within a site promotes another Edge in the same site before failing over to the remote site.
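The preference order described above (same site first, remote site second) can be sketched as a small selection function. This is a toy model of the stated behavior only, not NSX's actual election logic; the `Edge` type, its fields, and the Edge names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    name: str      # hypothetical Edge node name
    site: str      # hypothetical failover domain label
    healthy: bool


def next_active(edges, failed_edge):
    """Pick a replacement, preferring healthy Edges in the failed Edge's site."""
    healthy = [e for e in edges if e.healthy and e.name != failed_edge.name]
    same_site = [e for e in healthy if e.site == failed_edge.site]
    pool = same_site or healthy  # fall back to the remote site only if needed
    return pool[0] if pool else None


failed = Edge("edge-a1", "dc1", healthy=False)
cluster = [failed, Edge("edge-a2", "dc1", True), Edge("edge-b1", "dc2", True)]
print(next_active(cluster, failed).name)  # edge-a2

# With no healthy Edge left in dc1, the remote site takes over.
degraded = [failed, Edge("edge-a2", "dc1", False), Edge("edge-b1", "dc2", True)]
print(next_active(degraded, failed).name)  # edge-b1
```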
AS Path Prepending for Traffic Engineering:
AS path prepending controls routing preferences in multi-site deployments. When the same prefix is announced from both datacenters to Border Leaf switches, AS path prepending makes one path less preferred by artificially lengthening the AS path.
The configuration workflow includes:
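All else being equal, BGP prefers the route with the shortest AS path; attributes evaluated earlier in best-path selection, such as local preference, take precedence. Prepending additional AS numbers to a route therefore makes that path less desirable, directing traffic to the preferred datacenter.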
This enables active-standby behavior across sites while maintaining BGP routing control. Different Tier-0 Gateways can have different primary sites, distributing uplink bandwidth across both datacenters effectively.
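To make the effect of prepending concrete, here is a minimal sketch that compares two announcements of the same prefix by AS path length. It assumes every earlier tie-breaker (for example, local preference) is equal, and it uses a private AS number and a documentation prefix purely as placeholders; the `Route` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    site: str        # hypothetical label for the announcing datacenter
    as_path: tuple   # AS numbers as seen by the receiving Border Leaf


def prepend(route, local_as, count):
    """Return a copy of the route with local_as added count extra times."""
    return Route(route.prefix, route.site, (local_as,) * count + route.as_path)


def best_by_as_path(routes):
    """Shortest AS path wins; assumes earlier best-path steps are tied."""
    return min(routes, key=lambda r: len(r.as_path))


dc1 = Route("203.0.113.0/24", "dc1", (65001,))
dc2 = prepend(Route("203.0.113.0/24", "dc2", (65001,)), local_as=65001, count=2)

print(dc2.as_path)                       # (65001, 65001, 65001)
print(best_by_as_path([dc1, dc2]).site)  # dc1
```

Swapping which site prepends for a second Tier-0 Gateway gives that gateway the opposite primary site, which is how uplink load can be spread across both datacenters.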
Physical Edge BGP Configuration:
Deploy Top-of-Rack switches or dedicated edge routers capable of handling BGP routing. For full internet routing tables, ensure routers have adequate memory and processing capacity. For partial tables or default route acceptance, resource requirements are lower.
Configure eBGP peering sessions with upstream providers. Establish BGP sessions using appropriate neighbor relationships, AS numbers, and authentication. Configure the routers to receive routing information based on organizational requirements: either full internet routing tables for maximum visibility or filtered prefixes for specific needs.
In multi-site deployments, Border Leaf switches at each datacenter serve as the central point for BGP routing tables. These switches receive routes from NSX Tier-0 Gateways, datacenter fabric switches, and external routers. Using paired Border Leaf switches with the same AS number at each site simplifies design and troubleshooting.
Route Acceptance Strategy:
Organizations choose between full routing tables or filtered prefix acceptance:
Full Internet Routing Tables:
Partial Tables or Default Plus Specifics:
Inter-Site iBGP Configuration:
Establish iBGP sessions between routing infrastructure at each site. Configure the peers with the same local AS number, which is what makes a session internal, along with appropriate session parameters. Ensure next-hop reachability between sites through the inter-site connectivity links.
For deployments with multiple BGP-speaking routers, implement route reflectors to reduce iBGP mesh complexity. Route reflectors simplify configuration and improve scalability in larger topologies.
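The scaling benefit of route reflectors comes down to session counts. The sketch below compares a full iBGP mesh with a design in which every client peers only with the reflectors and the reflectors peer with each other; the function names are illustrative, and real designs may add further sessions for redundancy or clustering.

```python
def full_mesh_sessions(routers):
    """Every iBGP speaker peers with every other speaker."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers, reflectors=1):
    """Clients peer with each reflector; reflectors are meshed among themselves."""
    clients = routers - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)


for n in (4, 10, 20):
    print(n, full_mesh_sessions(n), route_reflector_sessions(n, reflectors=2))
# 4 6 5
# 10 45 17
# 20 190 37
```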
NSX Tier-0 BGP Integration:
Choose integration strategy based on infrastructure requirements:
Strategy A - BGP on NSX Tier-0 Gateway:
Configure BGP on NSX Tier-0 Gateway to peer with physical edge routers. NSX Edges participate directly in BGP and make routing decisions at the virtual infrastructure layer.
Enable BGP on Tier-0 Gateway through NSX Manager. Configure local AS number, BGP neighbors pointing to ToR router addresses, and route redistribution between BGP and connected routes. Set appropriate route filters and policies based on requirements.
For multi-site deployments, configure:
Organizations must determine whether full BGP routing tables on NSX Edges are needed based on infrastructure requirements, control granularity needs, and available resources.
Considerations for full BGP tables on NSX Edges:
Strategy B - Default Routes to Physical Edge:
Configure NSX Tier-0 Gateway to receive default routes from physical edge routers. Physical infrastructure handles BGP intelligence and path selection while NSX Edges forward internet-bound traffic without maintaining full routing tables.
Configure route redistribution on physical routers to provide reachability information to NSX environment. NSX Edges use default routing to reach physical infrastructure.
Considerations for this approach:
Route Filtering and Security:
When running BGP with multiple upstream providers (multihoming), proper route filtering is critical to prevent the autonomous system from becoming a transit AS. Without appropriate filters, internet traffic could pass through the AS between different ISPs, consuming bandwidth and router resources. This transit risk is a fundamental concern in multi-ISP BGP deployments.
Implement strict prefix filtering on all eBGP sessions. Configure inbound filters to prevent acceptance of invalid or unauthorized prefixes. Configure outbound filters to advertise only authorized organizational prefixes.
Preventing Transit AS:
Configure AS-path access lists and route maps to advertise only locally originated routes to upstream providers. This prevents routes learned from one ISP from being advertised to another ISP, which would make the AS a transit path for internet traffic.
Use AS-path filtering that permits only routes with empty AS paths (locally originated routes). Apply route maps to all eBGP neighbors that filter outbound advertisements to include only routes originating within the local AS. This ensures that routes learned from one upstream provider are never advertised to another provider.
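The outbound rule above can be sketched as a filter over a simplified BGP table. On routers, an empty AS path is commonly matched with the regular expression `^$`; the Python below models the same test on tuples. The table contents, AS numbers, and function names are placeholders chosen for illustration.

```python
def is_locally_originated(as_path):
    """An empty AS path means the route originated inside the local AS."""
    return len(as_path) == 0


def outbound_to_upstream(bgp_table):
    """Return only the prefixes that may be advertised to an eBGP upstream."""
    return [prefix for prefix, as_path in bgp_table if is_locally_originated(as_path)]


bgp_table = [
    ("203.0.113.0/24", ()),             # originated locally
    ("198.51.100.0/24", (64500,)),      # learned from a first upstream
    ("192.0.2.0/24", (64501, 64510)),   # learned from a second upstream
]

print(outbound_to_upstream(bgp_table))  # ['203.0.113.0/24']
```

Because routes learned from either upstream carry a non-empty AS path, none of them is re-advertised, so the local AS never offers itself as a transit path.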
Route Acceptance Strategies:
Organizations can implement different strategies for accepting routes from upstream providers:
Full Internet Routing Table: Accept all routes from each ISP for maximum routing intelligence and path selection capabilities. This approach provides complete visibility but requires adequate router resources.
Directly-Connected Routes: Accept only routes for networks directly connected to each ISP, combined with default routes for general internet connectivity. This reduces routing table size while maintaining some visibility into ISP-specific paths.
Default Routes Only: Accept only default routes from ISPs and advertise organizational prefixes. This minimizes routing table requirements but provides limited path selection intelligence. This strategy is particularly useful when routers risk being overwhelmed by large amounts of routing information from BGP peers. Filtering to accept only default routes controls the size of the local routing table without losing IP connectivity to remote networks. The default route learned from the BGP neighbor can be conditionally advertised based on the existence of other routes in the routing table.
Prefix lists configured to permit only the default route (0.0.0.0/0) enable this filtering. Filtering can be applied to both incoming advertisements (routes learned from neighbors) and outgoing advertisements (routes sent to neighbors), providing bidirectional control over routing information exchange.
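A default-only prefix list behaves like the filter sketched below, which uses Python's standard `ipaddress` module. The received prefixes are made-up examples, and the sketch covers IPv4 only.

```python
import ipaddress

DEFAULT_ROUTE = ipaddress.ip_network("0.0.0.0/0")


def permit_default_only(prefixes):
    """Keep only the IPv4 default route, mirroring a permit 0.0.0.0/0 prefix list."""
    return [p for p in prefixes if ipaddress.ip_network(p) == DEFAULT_ROUTE]


received = ["0.0.0.0/0", "198.51.100.0/24", "10.0.0.0/8"]
print(permit_default_only(received))  # ['0.0.0.0/0']
```

The same function shape applies in the outbound direction by swapping the permitted set for the organization's own prefixes.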
The choice depends on organizational requirements for traffic engineering, available resources, and operational complexity tolerance.
Prevent route leaks between upstream providers through appropriate filtering. Configure maximum prefix limits to prevent routing table overflow from misconfigured peers. Implement prefix validation using the mechanisms the routing platform supports, such as RPKI route origin validation.
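A maximum prefix limit is essentially a threshold check with an early-warning band. The sketch below shows that logic under assumed values; the 80 percent warning ratio and the status strings are illustrative choices, not defaults of any particular platform.

```python
def prefix_limit_status(received_count, max_prefixes, warn_ratio=0.8):
    """Classify a peer's prefix count against its configured limit."""
    if received_count > max_prefixes:
        return "exceeded"   # a router would typically tear down or log here
    if received_count >= max_prefixes * warn_ratio:
        return "warning"
    return "ok"


print(prefix_limit_status(50, 100))    # ok
print(prefix_limit_status(90, 100))    # warning
print(prefix_limit_status(101, 100))   # exceeded
```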
Verify BGP session status and confirm all peering sessions establish successfully. Check that prefix counts align with expectations based on configured acceptance policies. Monitor for route flapping or stability issues.
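The session and prefix-count checks above lend themselves to a simple audit routine. The sketch assumes the session state and received-prefix count for each peer have already been collected by some other means; the dictionary layout, peer addresses, and expected ranges are hypothetical.

```python
def audit_peers(observed, expected):
    """Return findings for each expected peer; an empty list means all checks passed."""
    findings = []
    for peer, (low, high) in expected.items():
        state, count = observed.get(peer, ("Missing", 0))
        if state != "Established":
            findings.append(f"{peer}: session is {state}")
        elif not low <= count <= high:
            findings.append(f"{peer}: {count} prefixes outside {low}-{high}")
    return findings


# Expected prefix-count range per peer under a default-route-only policy.
expected = {"192.0.2.1": (1, 1), "192.0.2.5": (1, 1), "192.0.2.9": (1, 1)}
observed = {
    "192.0.2.1": ("Established", 1),
    "192.0.2.5": ("Idle", 0),
    "192.0.2.9": ("Established", 250),
}

for finding in audit_peers(observed, expected):
    print(finding)
```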
Validate inter-site iBGP connectivity and confirm routing information is shared properly between all sites. Verify that all sites have reachability information for critical destinations.
Examine path selection for important destinations including content delivery networks and cloud services. Verify that best path selection follows configured policies and business requirements. Confirm AS path prepending is working as intended by examining BGP attributes for announced routes at Border Leaf switches.
If BGP is configured on NSX Tier-0 Gateway, verify BGP neighbor adjacencies through NSX Manager, review learned and advertised routes, and confirm route redistribution from connected segments functions correctly. Validate that route maps are applied correctly to BGP neighbors and AS path prepending values appear in route advertisements as expected.
For Active-Standby deployments with Failover Domain, verify that Edges in the primary datacenter are active and Edges in the secondary datacenter are standby. Test failover scenarios by simulating Edge failures in the primary site and confirming that another Edge in the same site takes over before failing over to the remote site.
Verify AS path prepending configuration by checking advertised routes at Border Leaf switches. Confirm that routes from the non-preferred datacenter have longer AS paths due to prepending, making them less desirable for inbound traffic.
Test connectivity from workloads at all sites to diverse internet destinations. Verify that critical services remain accessible and perform as expected.
Team Skills: BGP requires specialized knowledge. Network operations teams need expertise in BGP fundamentals, path selection algorithms, attribute manipulation for traffic engineering including AS path prepending, Failover Domain configuration via API, troubleshooting, and security best practices.
Documentation: Maintain comprehensive documentation including:
Change Management: Establish change control procedures for BGP modifications. Test routing policy changes including AS path prepending adjustments before implementation. Schedule changes during maintenance windows when appropriate. Document rollback procedures for BGP policy changes.
Monitoring: Implement comprehensive monitoring for:
Upstream Provider Relationships: Maintain active relationships with upstream providers. Establish escalation contacts for routing issues. Request notification of maintenance affecting BGP peering. Understand provider routing policies and how they interact with AS path prepending strategies.
Use OSPF for dynamic routing within and between NSX sites. Configure static routes for external connectivity through a single upstream provider.
This architecture is appropriate only for the small minority of NSX deployments that operate with a single upstream provider delivering consistent routing across all locations.
Organizations should evaluate whether implementing BGP from the start provides better long-term value even in single upstream provider scenarios. Migration from OSPF/static to BGP later involves significant rearchitecture.
OSPF Configuration:
Configure OSPF for inter-site routing between NSX sites. Establish OSPF adjacencies between NSX Edges and physical infrastructure. Configure appropriate interface costs to influence internal path selection. Implement OSPF authentication for security.
Static Route Configuration:
Configure static default routes pointing to upstream provider gateways. For redundant connections to the same provider, configure multiple static routes with different administrative distances for automatic failover. Consider implementing active path monitoring if dynamic failover is required.
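The failover behavior of static routes with different administrative distances can be sketched as follows. The model assumes some path-monitoring mechanism supplies the set of reachable next hops; the `StaticRoute` type, addresses, and distance values are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticRoute:
    next_hop: str
    admin_distance: int  # lower values are preferred


def active_route(routes, reachable_next_hops):
    """Install the reachable route with the lowest administrative distance."""
    candidates = [r for r in routes if r.next_hop in reachable_next_hops]
    return min(candidates, key=lambda r: r.admin_distance) if candidates else None


routes = [StaticRoute("192.0.2.1", 1), StaticRoute("192.0.2.5", 10)]

print(active_route(routes, {"192.0.2.1", "192.0.2.5"}).next_hop)  # 192.0.2.1
print(active_route(routes, {"192.0.2.5"}).next_hop)               # 192.0.2.5
```

Without monitoring, a next hop that is down but still appears reachable keeps the primary route installed, which is why active path monitoring matters when dynamic failover is required.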
NSX Tier-0 Configuration:
Configure route redistribution on Tier-0 Gateway to advertise NSX segments into OSPF for internal reachability. Set filters to control advertisement scope. Configure default route pointing to physical infrastructure for external connectivity.
No Inter-Domain Intelligence: Static routes provide only next-hop information without destination-specific awareness. Cannot adapt to upstream provider routing changes or varying reachability characteristics for different destinations.
No Traffic Engineering Capabilities: Cannot implement AS path prepending or other BGP-based traffic engineering features for multi-site deployments. No ability to control which datacenter is preferred for specific traffic flows.
No Failover Domain Support: Advanced features like Failover Domain for Edge placement control are typically used in conjunction with BGP architectures for proper traffic engineering across sites.
Future Migration Complexity: When additional upstream providers become necessary or multi-site traffic engineering is required, the routing architecture must be redesigned. Introducing BGP at that point requires a planned, phased migration and carries a risk of service disruption.
Limited Path Control: Cannot influence path selection for external traffic beyond basic primary/backup static route configuration. No granular control over paths to specific destinations or ability to implement active-standby behavior across sites with routing intelligence.
Upstream Dependency: The deployment relies completely on a single provider's routing decisions, with no autonomy over external path selection. Paths cannot be optimized based on performance or cost requirements.
Growth Constraints: As the deployment grows and sites are added in diverse regions, the probability of requiring different upstream providers increases significantly. The architecture cannot accommodate this growth without redesign.
For the small minority of deployments that initially implement OSPF with static routing, migration to BGP requires planned rearchitecture:
Obtain a public AS number from a regional internet registry or use a private AS number. Plan the AS number strategy for multiple Tier-0 Gateways across sites. Establish BGP peering agreements with upstream providers. Size and deploy physical edge routers with adequate resources. Develop BGP policies for path selection, filtering, and traffic engineering, including AS path prepending strategies. Ensure the operations team has BGP expertise, including API-based configuration for features like Failover Domain.
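When planning AS numbers, it helps to confirm whether a candidate falls in a private range. The sketch below uses the private-use ranges reserved in RFC 6996 (64512 to 65534 for 16-bit and 4200000000 to 4294967294 for 32-bit AS numbers); the function name and sample values are illustrative.

```python
# Private-use AS number ranges reserved by RFC 6996 (inclusive bounds).
PRIVATE_ASN_RANGES = ((64512, 65534), (4200000000, 4294967294))


def is_private_asn(asn):
    """Return True if the AS number lies in a private-use range."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


for candidate in (65001, 64496, 4200000001, 65535):
    print(candidate, is_private_asn(candidate))
# 65001 True
# 64496 False
# 4200000001 True
# 65535 False
```

Private AS numbers are not meant to be visible on the public internet, so a design that announces prefixes directly to multiple ISPs generally calls for a registry-assigned AS number.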
Configure eBGP sessions with upstream providers on physical routers. Receive and validate routing information. Test path selection without putting BGP into active forwarding path. Maintain existing OSPF/static routing during testing phase. Configure Border Leaf switches to serve as central BGP routing points for each datacenter.
Establish iBGP sessions between sites. Verify routing table synchronization across all locations. Test failover scenarios in controlled environments. Ensure next-hop reachability across sites.
Deploy BGP on NSX Tier-0 Gateways or configure default routes to intelligent physical edge based on chosen strategy. For multi-site deployments, configure IP prefix lists and route maps for AS path prepending. Configure Failover Domain using API if implementing active-standby mode across sites. Migrate traffic incrementally from static routes to BGP paths. Validate connectivity to all critical destinations throughout migration.
Complete transition to BGP for external routing. Remove static default routes once BGP is fully operational. Verify AS path prepending is controlling traffic flow as intended across sites. Test Edge failover within sites and across sites to validate Failover Domain behavior. Optimize BGP policies based on actual traffic patterns and requirements. Maintain OSPF for internal routing if desired or transition inter-site routing to BGP.
For NSX multi-site deployments, BGP represents the standard architecture. BGP provides the inter-domain routing intelligence required for enterprise networks and prepares infrastructure for future growth. Even when multiple upstream providers are not immediately needed, implementing both eBGP and iBGP establishes the foundation for scaling without rearchitecture.
In multi-site deployments with separated external routers per datacenter, BGP provides essential traffic engineering capabilities through features like AS path prepending and supports advanced deployment patterns using Failover Domain for optimal Edge placement control. These capabilities enable active-standby configurations across sites while distributing workload and maintaining routing intelligence.
OSPF with static routing serves only a small minority of deployments where all sites connect through a single upstream provider with consistent routing. Even in these scenarios, implementing BGP from initial deployment often provides better long-term value by avoiding costly rearchitecture as requirements evolve.
When designing new NSX deployments, BGP architecture should be the default choice. The investment in BGP expertise and infrastructure provides operational flexibility, reliability, traffic engineering capabilities, and performance optimization that align with the sophisticated networking requirements typical of NSX implementations.
If questions remain about routing architecture selection for specific deployment scenarios, please reach out to Enterprise Software Professional Services for architectural guidance.