Multicast not working after moving VM to CLOUD/Target location

Article ID: 428714

Products

VMware HCX, VMware NSX

Issue/Introduction

  • A VM that relies on multicast was moved onto an L2-extended network and no longer receives multicast traffic.
  • Packet captures show that the "IGMP Group Join" packet from the VM is dropped by "VSwitch_FwdPolicyCheck".
    • This can be observed with the command: pktcap-uw --switchport <vm_switchportID> --capture VnicTx,VnicRx --mac <multicast_endpoint_mac> --trace
       PktHandleID: ######=, Captured at PktFree point, Drop Reason 'VlanTag Mismatch'. Drop Function 'VSwitch_FwdPolicyCheck'. TSO not enabled, Checksum not offloaded and not verified, SourcePort <VM_Switchport>, VLAN tag <vLAN>, VLAN priority 0, QID 0, headroomlen 336, length 60.
  • In addition, the IGMP query from the on-prem VDS is dropped by physical routers before it reaches the peer NE-R appliance.
    • This IGMP query/report is only sent from a DVS whose "Multicast filtering mode" is set to "IGMP/MLD snooping".
  • This can be seen by running tcpdump on the source NE-I appliance. This example uses tapbr1, but you can use the v_Nic# interface that corresponds to the stretched network in question. The source NE-I appliance shows the IGMP query/report, but the target NE-R appliance does not. For more information on capturing packets on an HCX NE appliance, see KB 389265.
    root@###########-NE-I8 [ ~ ]# tcpdump -i tapbr1 host 224.0.0.1 or host 224.0.0.2 -ean
    tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
    listening on tapbr1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
    18:23:58.136503 ##:##:##:##:##:## > 01:00:5e:00:00:01, ethertype IPv4 (0x0800), length 60: #.#.#.# > 224.0.0.1: igmp query v2
  • The Target NE switchport is not tagged for "Multicast Membership". This can be verified by using the command:
    nsxdp-cli vswitch mcast_filter vswitch get --mode IGMP --dvs-alias <DVS-ALIAS>
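
When a trace covers many packets, it can help to tally the drop reasons rather than read each line. The following is a minimal shell sketch, assuming the output of the pktcap-uw command above was saved to a text file. The function name summarize_drops and the file name pktcap_trace.txt are hypothetical, and the pattern is based only on the sample drop line shown above.

```shell
# Tally "Drop Reason | Drop Function" pairs from saved pktcap-uw --trace output.
# Reads the trace text on stdin. The pattern matches the sample line format
# shown above; other pktcap-uw versions may format drops differently.
summarize_drops() {
  sed -n "s/.*Drop Reason '\([^']*\)'\. Drop Function '\([^']*\)'.*/\1 | \2/p" |
    sort | uniq -c | sort -rn
}

# Usage (pktcap_trace.txt is a hypothetical file name):
#   summarize_drops < pktcap_trace.txt
```

Each output line is a count followed by the reason and function, so a dominant "VlanTag Mismatch | VSwitch_FwdPolicyCheck" entry would match the symptom described here.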

Environment

VMware HCX 4.11.x

Cause

  • The IGMP group join from the VM does not reach the multicast endpoint, so the VM cannot register to receive updates from the multicast group.
  • Because the IGMP report packet does not reach the target NE-R, the NE-R's switchport is not added to the "Multicast Membership" list.
    • The IGMP query sent from the VDS has a very low IP TTL (usually 1 or 2). NE tunnels such as FOU, IPIP, and GRE copy the TTL of the encapsulated (inner) packet into the outer IP header.
    • As a result, the FOU/UDP packets have a TTL of 1 or 2. When there are more than 2 routing hops between NE-I and NE-R, the TTL is decremented to 0 and the packet is dropped by a router or other device along the path.
    • The encrypted NE workflow uses IPsec, which ignores the inner packet TTL and uses a much larger TTL. FOU then uses this larger TTL to forward the packets toward the peer NE-R appliance.
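
The TTL arithmetic behind this cause can be illustrated with a small shell sketch. It is a simplified model, not HCX code: the function name ttl_survives is hypothetical, the hop count of 3 and the TTL of 64 are illustrative values, and each hop is assumed to be a router that decrements the TTL by 1 and discards the packet when the TTL reaches 0.

```shell
# Walk a packet with the given starting TTL across a number of routed hops.
# Each router decrements the TTL and discards the packet when it reaches 0.
ttl_survives() {
  ttl=$1
  hops=$2
  hop=1
  while [ "$hop" -le "$hops" ]; do
    ttl=$((ttl - 1))
    if [ "$ttl" -le 0 ]; then
      echo "TTL $1: dropped at hop $hop of $hops"
      return 1
    fi
    hop=$((hop + 1))
  done
  echo "TTL $1: delivered with TTL $ttl after $hops hops"
}

ttl_survives 1 3 || true   # outer TTL inherited from the IGMP query
ttl_survives 2 3 || true   # still too small for 3 routed hops
ttl_survives 64 3          # larger outer TTL (64 is an illustrative value)
```

In this model, the packets with a TTL of 1 or 2 are discarded before the third hop, while the packet with the larger TTL arrives. This mirrors why enabling encryption lets the IGMP traffic reach the peer appliance.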

Resolution

This issue is resolved in VMware HCX 4.11.4, available at Broadcom downloads. Refer to the VMware HCX 4.11.4 Release Notes.
If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

Workaround 

This workaround involves two changes.

  1. Update the target ESX host advanced configuration to allow multicast packets to be forwarded to the uplink.
    • In vCenter, select the ESX host and go to Configure --> Advanced System Settings --> Edit, search for IGMP, and set the option Net.SendIGMPReportToUplink to 1.
  2. Enable "NE Encryption" on the HCX Service Mesh. For more information, see: System Services - HCX 4.11 User-Guide
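
For hosts managed from the command line, step 1 can likely be applied from a shell on the target ESX host as well. The sketch below assumes the vCenter option Net.SendIGMPReportToUplink corresponds to the esxcli option path /Net/SendIGMPReportToUplink, following the usual mapping between advanced setting names and esxcli paths. Confirm the path with the list command before changing the value.

```shell
# Assumption: Net.SendIGMPReportToUplink maps to this esxcli option path.
OPTION=/Net/SendIGMPReportToUplink

if command -v esxcli >/dev/null 2>&1; then
  # Show the current value, set it to 1, then show it again to confirm.
  esxcli system settings advanced list -o "$OPTION"
  esxcli system settings advanced set -o "$OPTION" -i 1
  esxcli system settings advanced list -o "$OPTION"
else
  echo "esxcli not found; run this in a shell on the target ESX host" >&2
fi
```

The guard only keeps the snippet from failing when it is run somewhere other than an ESX host. On the host itself, the three esxcli lines are the relevant part.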

Note:

Upgrading HCX to 4.11.4 resolves the encryption issue. After the upgrade, step 1 of the workaround above still needs to be implemented.