The VMDQ loopback feature used by Intel and Broadcom NICs can cause connectivity issues when duplicate MAC addresses are used in different VLANs. The MAC address that the VM is trying to reach in its local VLAN may be duplicated in the VMDQ table, in which case the packet is reflected back to the appliance that holds the duplicate instead of being sent to the intended one. The packet filter then drops those packets because of the VLAN mismatch.
The duplicate MAC entry can be verified by logging in to the ESXi host where the affected VM is running and issuing the command:
# netdbg vswitch mac-table get -dvs <dvsname> | grep -i <gateway's mac address>
Use the MAC address of the intended gateway's interface. If you see the gateway's MAC address on a port for a VM that is not the intended gateway, you have a duplicate MAC that results in this issue.
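The check above can be wrapped in a small helper so that the gateway MAC matches however it was typed (upper case, or dash-separated). This is only a sketch: `normalize_mac`, `mac_entries`, and the `DVS_NAME` / `GATEWAY_MAC` variables are hypothetical names introduced here, and the helper assumes the MAC table prints addresses in colon-separated form. It prints the matching lines unchanged; deciding whether a port belongs to the intended gateway is still a manual step.

```shell
#!/bin/sh
# Sketch only: helper names and environment variables are illustrative assumptions.

# Convert a MAC address to lowercase, colon-separated form.
normalize_mac() {
  printf '%s\n' "$1" | tr 'ABCDEF-' 'abcdef:'
}

# Read MAC-table text on stdin and print every line that mentions the given MAC.
mac_entries() {
  mac=$(normalize_mac "$1")
  grep -i -F -- "$mac"
}

# Run the lookup only on a host that has netdbg and when both inputs are set.
if command -v netdbg >/dev/null 2>&1 \
   && [ -n "${DVS_NAME:-}" ] && [ -n "${GATEWAY_MAC:-}" ]; then
  netdbg vswitch mac-table get -dvs "$DVS_NAME" | mac_entries "$GATEWAY_MAC"
fi
```

If more than one port is listed for the gateway's MAC, compare each port against the intended gateway as described above.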
For Intel network cards:
1. Update the Intel NIC firmware driver to 2.9.2.0
2. Disable VMDQ loopback on each vmnicX:
# esxcli intnet misc vmdqlb set -l 0 -n vmnicX
Note 1:
VMDQ loopback is disabled by default with i40en 2.9.2 or later and icen 1.14.2 or later. Refer to the driver release notes for more details.
Note 2:
The inbox driver does not provide an option to disable VMDQ loopback.
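Because the disable command has to be repeated for every uplink, it can be scripted. The following is a minimal sketch, assuming the `esxcli intnet` namespace is available on the host (see the note on inbox drivers above) and that the third column of `esxcli network nic list` is the driver name. The helper name `intel_vmnics` is hypothetical. Review the list of NICs it selects before running it on a production host.

```shell
#!/bin/sh
# Sketch only: intel_vmnics is an illustrative helper, not part of esxcli.

# Read "esxcli network nic list" output on stdin and print the names of
# NICs whose driver column is i40en or icen.
intel_vmnics() {
  awk '$3 == "i40en" || $3 == "icen" { print $1 }'
}

# Apply the setting only when esxcli is present (that is, on an ESXi host).
if command -v esxcli >/dev/null 2>&1; then
  for nic in $(esxcli network nic list | intel_vmnics); do
    echo "Disabling VMDQ loopback on $nic"
    esxcli intnet misc vmdqlb set -l 0 -n "$nic"
  done
fi
```

Filtering on the driver name keeps the command away from non-Intel uplinks, where the `intnet` setting does not apply.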
For Broadcom network cards:
This issue has been observed with driver version 229.0.146.0 and firmware 223.0.205.0 / pkg 22.31.13.70, but not with driver version 232.0.254.0 and its corresponding firmware. For more information on how to download and install the driver, please refer to the KB article: Download and install async drivers in VMware ESXi.
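To see where a host stands relative to the versions mentioned above, the installed driver version can be compared numerically. This sketch rests on several assumptions: the helper names `driver_version` and `version_ge` and the `VMNIC` variable are hypothetical, the `Version:` line under Driver Info in `esxcli network nic get` is assumed to hold a plain dotted version, and 232.0.254.0 is used as the threshold only because it is the version reported above as unaffected. Versions between 229.0.146.0 and 232.0.254.0 are not covered by this article.

```shell
#!/bin/sh
# Sketch only: helper names, VMNIC, and the threshold choice are assumptions.

# Read "esxcli network nic get -n <vmnic>" output on stdin and print the
# driver version (the "Version:" line, not "Firmware Version:").
driver_version() {
  awk '/^[ \t]*Version:/ { print $2; exit }'
}

# Succeed if dotted version $1 is greater than or equal to $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      xi = (i <= n) ? x[i] + 0 : 0   # missing fields count as 0
      yi = (i <= m) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Report on one uplink when run on an ESXi host with VMNIC set.
if command -v esxcli >/dev/null 2>&1 && [ -n "${VMNIC:-}" ]; then
  ver=$(esxcli network nic get -n "$VMNIC" | driver_version)
  if version_ge "$ver" "232.0.254.0"; then
    echo "$VMNIC driver $ver is at or above 232.0.254.0"
  else
    echo "$VMNIC driver $ver is below 232.0.254.0"
  fi
fi
```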
If SR-IOV is used, VMDQ should not be disabled, as it is required for SR-IOV to operate as intended.
If bridging (including HCX extensions) and MAC learning are enabled in the environment, you may encounter a similar issue. Please see "Intermittent packet loss may occur when bridging is configured on NSX or using HCX Network Extension" for details.