In an NSX Federation environment, L2 bridging fails across stretched segments, affecting scenarios such as desktop migrations. Bridging may work correctly while traffic stays local to a single site, but it fails in a multi-site configuration.
Symptoms:
Virtual Machines (VMs) are unable to obtain DHCP addresses across the bridge.
Traffic fails to reach the gateway through the active bridge member.
In a setup with multiple Edge nodes (e.g., Edge1 and Edge2), you may observe that the Standby (Blocking) node drops discovery traffic instead of forwarding it to the Active (Forwarding) node where the RTEP resides.
VMware NSX
The issue is caused by an interoperability limitation between NSX Federation and TEP (Tunnel Endpoint) Groups.
It is triggered when the TEP Grouping feature is enabled, that is, when the global parameter enable_tep_grouping_on_edge is set to true.
To resolve the bridging failure, you must disable the TEP Grouping feature.
Follow these steps to update the global configuration via the NSX Policy API:
Use the NSX Policy API to retrieve the current connectivity-global-config settings.
Request:
GET /policy/api/v1/infra/connectivity-global-config
Example Output:
If TEP Grouping is active, the enable_tep_grouping_on_edge parameter will be set to true:
{
"global_replication_mode_enabled": false,
"is_inherited": false,
"site_infos": [],
"tep_group_config": {
"enable_tep_grouping_on_edge": true <------------- Check this value
},
"resource_type": "GlobalConfig",
"id": "global-config",
"display_name": "default",
"path": "/infra/global-config"
}
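The check described above can also be scripted. The following is a minimal sketch in Python, assuming the GET response body has already been saved as JSON text; the helper name tep_grouping_enabled is hypothetical and is not part of any NSX tooling.

```python
import json


def tep_grouping_enabled(config: dict) -> bool:
    """Return True if the GET response reports TEP Grouping as enabled.

    Hypothetical helper: a missing tep_group_config section or flag is
    treated as "not enabled".
    """
    tep_group_config = config.get("tep_group_config", {})
    return bool(tep_group_config.get("enable_tep_grouping_on_edge", False))


# Response body from the example output above, without the annotation.
sample_response = json.loads("""
{
  "global_replication_mode_enabled": false,
  "is_inherited": false,
  "site_infos": [],
  "tep_group_config": {
    "enable_tep_grouping_on_edge": true
  },
  "resource_type": "GlobalConfig",
  "id": "global-config",
  "display_name": "default",
  "path": "/infra/global-config"
}
""")

print(tep_grouping_enabled(sample_response))  # prints True
```

If the function returns True, the environment matches the cause described in this article.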
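If the value is true, change it to false with a PUT call. Use the configuration returned by the GET call as the request body, with only enable_tep_grouping_on_edge changed to false.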
Request:
PUT /policy/api/v1/infra/connectivity-global-config
Request Body:
{
"global_replication_mode_enabled": false,
"is_inherited": false,
"site_infos": [],
"tep_group_config": {
"enable_tep_grouping_on_edge": false <------------- Set this value to false
},
"resource_type": "GlobalConfig",
"id": "global-config",
"display_name": "default",
"path": "/infra/global-config"
}
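The GET-then-PUT sequence can be sketched in Python with the standard library only. This is an illustration under stated assumptions, not a supported tool: the manager host name, the credentials, and the helper names build_put_body and build_put_request are hypothetical, and HTTP basic authentication is assumed. The body is built by copying the GET response, so any extra fields the manager returns are carried over unchanged.

```python
import base64
import copy
import json
import urllib.request

# Endpoint taken from the steps above.
CONFIG_PATH = "/policy/api/v1/infra/connectivity-global-config"


def build_put_body(current: dict) -> dict:
    """Copy the GET response and switch TEP Grouping off (hypothetical helper)."""
    body = copy.deepcopy(current)  # leave the caller's dict untouched
    body.setdefault("tep_group_config", {})["enable_tep_grouping_on_edge"] = False
    return body


def build_put_request(manager: str, user: str, password: str,
                      body: dict) -> urllib.request.Request:
    """Prepare the PUT request without sending it (assumes basic auth)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"https://{manager}{CONFIG_PATH}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Example with placeholder values; nothing is sent here.
current_config = {
    "tep_group_config": {"enable_tep_grouping_on_edge": True},
    "resource_type": "GlobalConfig",
    "id": "global-config",
}
request = build_put_request("nsx.example.com", "admin", "changeme",
                            build_put_body(current_config))
# To apply the change, pass the prepared request to urllib.request.urlopen(request).
```

Building the request separately from sending it makes it possible to inspect the URL and body before any change reaches the manager.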
Note: Disabling this parameter allows the Standby Edge node to correctly handle and forward traffic to the Active Forwarding node, restoring L2 bridging functionality across sites.