NSX-T VDR cannot resolve SR backplane MAC address when both the workload VM and the Edge VM are on the same host.

Article ID: 376796

Products

VMware NSX Networking, VMware NSX-T Data Center, VMware NSX

Issue/Introduction

In NSX-T environments, the Virtual Distributed Router (VDR) may fail to resolve the Service Router (SR) backplane MAC address if both the workload VM and the Edge VM are on the same host. This issue arises from having two different values for the same configuration key (com.vmware.port.extraConfig.vdl2.nestedTNConfig) in the logical port configuration.

The problem occurs in a collapsed-cluster environment where the Edge VM (hosting the active SR instance) and the workload VM (attempting a northbound connection) run on the same host. The extra_config that the Management Plane (MP) sends in LogSwitchPortConfigMsg contains conflicting VTEP labels, which leads to undefined behavior in the configuration agent (cfgAgent) and may leave incorrect labels at the Distributed Virtual Switch (DVS) layer. For correct operation, the VTEP label at the DVS layer must match the one in cfgAgent.

Example Object Type: vmware.nsx.nestdb.LogSwitchPortConfigMsg

{'id': {'left': ###########, 'right': ###########},
 'log_switch_id': {'left': ###########, 'right': ###########},
 'attachment': {'vif_attachment': {'vif_id': '######################', 'type': 'INDEPENDENT'}},
 'ip_discovery': {'arp_snooping_enabled': True,
                  'dhcp_snooping_enabled': True,
                  'vm_tools_enabled': True,
                  'arp_bindings_limit': 1,
                  'nd_snooping_enabled': False,
                  'dhcpv6_snooping_enabled': False,
                  'nd_bindings_limit': 3,
                  'vm_tools_v6_enabled': False,
                  'expiry_arp_nd_timeout': 10,
                  'trust_on_first_use_enabled': True,
                  'duplicate_ip_detection_enabled': False,
                  'is_default': True},
 'switch_security': {'config_spoof_guard': {'enable': False, 'port_enable': False},
                     'dhcp_client_filter': False,
                     'dhcp_server_filter': True,
                     'bpdu_filter': True,
                     'config_rate_limit': {},
                     'dhcpv6_client_filter': False,
                     'dhcpv6_server_filter': True,
                     'ra_guard': True},
 'qos': {'dscp': {'trust_mode': 'TRUSTED', 'dscp_value': 0},
         'cos': 0,
         'shaper_config': [{'type': 'INGRESS_RATE', 'average': 0, 'peak': 0, 'burst': 0}]},
 'admin_state_up': True,
 'mac_management': {'mac_change_allowed': True,
                    'mac_learning': {'unicast_flooding_allowed': False,
                                     'aging_time': 600,
                                     'enabled': True,
                                     'mac_limit': {'limit': 4096, 'policy': 'ALLOW'}}},
 'extra_config': [{'key': 'com.vmware.port.extraConfig.vdl2.nestedTNConfig',
                   'value': 'version=1;vlan=###,label=XXXXX;vlan=###,label=XXXXX'},
                  {'key': 'com.vmware.port.extraConfig.vdl2.nestedTNConfig',
                   'value': 'version=1;vlan=###,label=YYYYY;vlan=###,label=YYYYY'}]}
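
The duplicate-key condition in the dump above can be checked mechanically. The following is a minimal sketch, assuming the message has already been loaded as a Python dictionary shaped like the example; find_conflicting_keys is a hypothetical helper written for illustration, not part of any NSX tool or API.

```python
from collections import defaultdict

# Key named in this article; the values below are the redacted ones from the example.
KEY = "com.vmware.port.extraConfig.vdl2.nestedTNConfig"


def find_conflicting_keys(extra_config):
    """Return {key: [distinct values]} for keys that carry more than one value.

    extra_config is a list of {'key': ..., 'value': ...} dicts, as in the
    LogSwitchPortConfigMsg example. Hypothetical helper, for illustration only.
    """
    values_by_key = defaultdict(list)
    for entry in extra_config:
        seen = values_by_key[entry["key"]]
        if entry["value"] not in seen:
            seen.append(entry["value"])
    return {key: vals for key, vals in values_by_key.items() if len(vals) > 1}


sample = [
    {"key": KEY, "value": "version=1;vlan=###,label=XXXXX;vlan=###,label=XXXXX"},
    {"key": KEY, "value": "version=1;vlan=###,label=YYYYY;vlan=###,label=YYYYY"},
]

conflicts = find_conflicting_keys(sample)
for key, vals in conflicts.items():
    print(f"{key} has {len(vals)} different values")
```

A non-empty result for this key matches the condition described in this article; an empty result means every key in the list has a single value.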

Observations:

  • The logical port contains the key com.vmware.port.extraConfig.vdl2.nestedTNConfig with differing values in extra_config and system_extra_config.
  • On the host, the key com.vmware.port.extraConfig.vdl2.nestedTNConfig may hold invalid values. Check with a command such as net-dvs -l.
  • ARP resolution by VDR fails.
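
To compare two values of this key (for example, the one in extra_config against the one in system_extra_config), it can help to split each value into its VLAN and label pairs. The sketch below assumes the value format is exactly what the redacted example suggests: a version field followed by semicolon-separated vlan=...,label=... pairs. That format is inferred, not taken from a specification, and the VLAN IDs and labels used here are made-up placeholders.

```python
def parse_nested_tn_config(value):
    """Split 'version=1;vlan=A,label=B;...' into (version, [(vlan, label), ...]).

    Assumes the layout seen in the redacted example; hypothetical helper.
    """
    version = None
    pairs = []
    for part in value.split(";"):
        # Each part is a comma-separated list of name=value fields.
        fields = dict(item.split("=", 1) for item in part.split(",") if "=" in item)
        if "version" in fields:
            version = fields["version"]
        elif "vlan" in fields and "label" in fields:
            pairs.append((fields["vlan"], fields["label"]))
    return version, pairs


def labels_match(value_a, value_b):
    """True when both values carry the same (vlan, label) pairs in the same order."""
    return parse_nested_tn_config(value_a)[1] == parse_nested_tn_config(value_b)[1]


# Placeholder values; real VLAN IDs and VTEP labels will differ.
value_a = "version=1;vlan=100,label=11111;vlan=200,label=11111"
value_b = "version=1;vlan=100,label=22222;vlan=200,label=22222"

print(parse_nested_tn_config(value_a))
print(labels_match(value_a, value_b))
```

A mismatch between the two parsed label sets corresponds to the first observation above.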

 

Relevant Logs

  • NSX API Log: /var/log/proton/nsxapi.log
  • Host Logs: net-dvs_l.txt
  • CCP Dumps: data_dump and adaptor_ufo_dump
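
When reviewing a saved copy of the host output (such as net-dvs_l.txt), a simple text search for the key narrows down the lines to inspect. This is a minimal sketch that makes no assumption about the output layout beyond the key name appearing on a line; the sample text is an invented placeholder, not real net-dvs output, and lines_mentioning_key is a hypothetical helper.

```python
KEY = "com.vmware.port.extraConfig.vdl2.nestedTNConfig"


def lines_mentioning_key(text, key=KEY):
    """Return (line_number, stripped_line) for every line that contains the key."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if key in line
    ]


# Invented placeholder text; the real file layout may look different.
sample_text = (
    "port A:\n"
    "    unrelated setting = 1\n"
    "    " + KEY + " = <value>\n"
)

hits = lines_mentioning_key(sample_text)
for number, line in hits:
    print(f"line {number}: {line}")
```

To run it against a real file, read the file contents into a string first and pass that string to lines_mentioning_key.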

Environment

Impacted Version:

All versions before NSX 4.2.1

Cause

The logical port configuration contains two different values for the same configuration key (com.vmware.port.extraConfig.vdl2.nestedTNConfig). This discrepancy causes network traffic failures when workload VMs and the Edge VM are on the same host.

Resolution

Resolution:

This issue is resolved in NSX 4.2.1. The fix prevents logical ports from being updated with conflicting values for the same key.

Workaround: 

1. In the Manager UI, disconnect the Edge network interfaces from their current segment and temporarily connect them to a different segment.
2. In the Manager UI, reconnect the Edge network interfaces to the original segment.

Additional Information

If you suspect you are experiencing this issue and need assistance with validation, please open a support case with Broadcom.