Broadcast, Unknown Unicast and Multicast (BUM) traffic is not functioning from some or all hosts between clusters.

Article ID: 303200

Updated On: 09-20-2024

Products

VMware NSX

Issue/Introduction

Symptoms:

  • Broadcast, Unknown Unicast and Multicast (BUM) traffic (for example, ARP requests) between hosts in different clusters is not functional.
  • Some hosts may successfully exchange BUM traffic between clusters, while others may not.
  • Migrating a VM that can't communicate to another host in the same cluster may restore BUM traffic connectivity.
  • One or more hosts in the cluster are reporting a VXLAN configuration issue or are missing VTEPs in the NSX VXLAN transport view.
  • VMs on ESXi hosts in the same cluster are able to communicate without issue, for both unicast and BUM traffic.

Environment

NSX for vSphere 6.x

Cause

When using Unicast or Hybrid replication mode, an ESXi host sends Broadcast, Unknown Unicast and Multicast (BUM) traffic to a designated 'proxy VTEP' (also called a UTEP or MTEP) in the destination cluster. Traffic sent to the proxy VTEP has the 'replication flag' set, and that host is responsible for replicating the frames to all other VTEPs on the same network segment. If one or more hosts in the cluster are in a bad state from a VXLAN perspective, they may fail to replicate BUM traffic as expected.

Because each ESXi host in a cluster determines a proxy VTEP independently, it's possible that some hosts may communicate successfully while others, having selected a proxy VTEP that is not replicating, may not.
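The effect of independent proxy selection can be illustrated with a toy model. This is only a sketch, not NSX code: the function name, the host addresses and the assumption that a faulty proxy delivers the frame to itself but replicates nothing onward are all hypothetical.

```python
# Toy model of BUM delivery through proxy VTEPs.
# All names and addresses are illustrative, not taken from a real environment.

def deliver_bum(source, proxy_choice, segment_vteps, broken_proxies):
    """Return the set of remote VTEPs that receive a BUM frame from `source`.

    proxy_choice:   {source_vtep: proxy_vtep}, chosen independently per source
    segment_vteps:  all VTEPs in the remote network segment
    broken_proxies: proxies assumed to accept frames but not replicate them
    """
    proxy = proxy_choice[source]
    if proxy in broken_proxies:
        # Assumption: the faulty proxy gets the frame, nothing is replicated.
        return {proxy}
    # A healthy proxy replicates to every VTEP in its segment.
    return set(segment_vteps)


remote_segment = ["192.168.1.101", "192.168.1.102", "192.168.1.103"]
# Two source hosts that happened to pick different proxies.
proxy_choice = {"10.0.0.11": "192.168.1.102", "10.0.0.12": "192.168.1.103"}
broken = {"192.168.1.102"}

for src in proxy_choice:
    reached = deliver_bum(src, proxy_choice, remote_segment, broken)
    print(src, "reaches", sorted(reached))
```

In this model only the source that picked 192.168.1.102 loses BUM connectivity to the rest of the remote segment, which matches the symptom that some hosts work while others do not.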

To confirm that this is the case, determine which source/destination ESXi host pairs are failing, and check whether every failing BUM path uses the same UTEP/MTEP for replication. Always test within the same VXLAN, because a different UTEP/MTEP may be selected for each VXLAN.

This command displays the UTEP/MTEP selected by an ESXi host for a given VXLAN network (in this example, distributed switch DVSWITCH1 and VNI 5002).

# net-vdl2 -M vtep -s DVSWITCH1 -n 5002

VTEP count: 12
<snip>
Segment ID: 192.168.1.0
VTEP IP: 192.168.1.102
Flags: 1(MTEP)

Notes:

  • A flag of '1' identifies the UTEP/MTEP.
  • The host is using VTEP 192.168.1.102 as the proxy VTEP for the remote network 192.168.1.0/24.
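When the VTEP list is long, the proxy entries can be pulled out of saved command output with a short script. The following is a minimal sketch that assumes every entry consists of "Segment ID:", "VTEP IP:" and "Flags:" lines as in the excerpt above; the function name and the "Flags: 0" line in the sample are hypothetical, and the real output may contain additional fields.

```python
import re

def find_proxy_vteps(output):
    """Map each segment ID to the VTEP IPs whose flag value is 1 (UTEP/MTEP)."""
    proxies = {}
    segment = vtep_ip = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Segment ID:"):
            segment = line.split(":", 1)[1].strip()
        elif line.startswith("VTEP IP:"):
            vtep_ip = line.split(":", 1)[1].strip()
        elif line.startswith("Flags:"):
            flags = line.split(":", 1)[1].strip()
            match = re.match(r"\d+", flags)  # leading numeric flag value
            if match and match.group() == "1" and segment and vtep_ip:
                proxies.setdefault(segment, []).append(vtep_ip)
    return proxies


# Sample input: the first entry (Flags: 0) is a made-up non-proxy VTEP.
sample = """\
VTEP count: 2
Segment ID: 192.168.1.0
VTEP IP: 192.168.1.101
Flags: 0
Segment ID: 192.168.1.0
VTEP IP: 192.168.1.102
Flags: 1(MTEP)
"""

print(find_proxy_vteps(sample))  # {'192.168.1.0': ['192.168.1.102']}
```

Running this against output collected from each source host, for the same VXLAN, gives the proxy VTEP that each host selected.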

Resolution

If all problematic data paths use the same proxy VTEP, examine that host carefully to determine whether it is the source of the problem. Look for VXLAN configuration issues reported in the NSX UI, missing or duplicate VTEPs, and similar problems.
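The comparison of failing paths can be done by hand, or with a small helper such as the sketch below. It assumes the test results for a single VXLAN have already been recorded as (source host, proxy VTEP, result) tuples; the function name, host names and addresses are hypothetical.

```python
from collections import Counter

def suspect_proxies(tests):
    """Return proxy VTEPs for which every recorded BUM test failed.

    tests: iterable of (source_host, proxy_vtep, bum_ok) for one VXLAN.
    """
    failed = Counter(proxy for _, proxy, ok in tests if not ok)
    passed = Counter(proxy for _, proxy, ok in tests if ok)
    # A proxy is suspect when it has failures and no successes.
    return sorted(proxy for proxy in failed if passed[proxy] == 0)


# Illustrative results only.
results = [
    ("esx-a1", "192.168.1.102", False),
    ("esx-a2", "192.168.1.102", False),
    ("esx-a3", "192.168.1.103", True),
]

print(suspect_proxies(results))  # ['192.168.1.102']
```

A single address in the result suggests that the host owning that VTEP is the one to examine first.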

To work around the issue, put the host into maintenance mode and remove it from the cluster so that NSX can no longer use it as a proxy VTEP for that network segment.