VMs are intermittently unable to reach default gateway, but can reach other VMs in the same VLAN.

Article ID: 408579

Products

VMware vSphere ESXi

Issue/Introduction

  • Some VMs are unable to reach the default gateway on the network.
  • The issue seems intermittent.
  • Multiple virtual appliances in the environment use the same MAC address for their VLAN interfaces, for example virtual firewall, virtual load balancer, or virtual router appliances.
  • A NIC driver with the VMDQ loopback feature is used on an Intel NIC (e.g., Intel X710, E810, XXV710)
  • NIC driver for Broadcom network cards is used (e.g., NetXtreme, BCM57412)

Environment

  • VMware vSphere ESXi 7.x
  • VMware vSphere ESXi 8.x

Cause

The VMDQ loopback feature used by Intel and Broadcom NICs can cause connectivity issues when the same MAC address is used in different VLANs.  The MAC address that the VM is trying to reach in its local VLAN may have a duplicate entry in the VMDQ table, in which case the packet is reflected back to the appliance that owns the duplicate instead of the intended one.  The packet filter then drops those packets because of the VLAN mismatch.

The duplicate MAC entry can be verified by logging in to the ESXi host where the affected VM is running and issuing the following command:
netdbg vswitch mac-table get -dvs <dvsname> | grep -i <gateway's mac address>

Use the MAC address of the intended gateway's interface.  If the gateway's MAC address appears on a port that belongs to a VM other than the intended gateway, a duplicate MAC exists and is causing this issue.
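
As a rough aid for the check above, the following sketch normalizes a MAC address and counts how many lines of piped-in output mention it. The function names normalize_mac and count_mac_entries are hypothetical helpers, not part of any VMware tool. The sketch assumes only that each table entry appears on its own line and makes no assumption about the column layout of the netdbg output.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ESXi): count MAC table lines that
# mention a given MAC address.

# Normalize a MAC address to lowercase, colon-separated form
# (accepts 00-50-56-AB-CD-EF as well as 00:50:56:ab:cd:ef).
normalize_mac() {
  printf '%s\n' "$1" | tr 'ABCDEF-' 'abcdef:'
}

# Read text on stdin and print the number of lines containing the MAC.
# The match is case-insensitive and treats the MAC as a fixed string.
count_mac_entries() {
  mac=$(normalize_mac "$1")
  # grep exits non-zero when the count is 0; "|| true" keeps the
  # function usable in scripts that run with "set -e".
  grep -i -c -F -- "$mac" || true
}

# Example use on the host (placeholders as in the command above):
#   netdbg vswitch mac-table get -dvs <dvsname> | count_mac_entries <gateway's mac address>
```

A count greater than 1 is only a hint that the MAC is present on more than one port. The ports still need to be inspected manually to confirm that one of them belongs to a VM other than the intended gateway.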

Resolution

Turn off the VMDQ loopback feature on the NICs of hosts where workloads are present as well as hosts where bridge Edge VMs are present.
 
To turn off the VMDQ loopback feature on an Intel NIC (on an ESXi 7.0 host), follow the steps below:

1. Install the Intel esxcli plug-in tool by following the KB article Intel esxcli plug-in for managing Intel(r) Ethernet Network Adapters (66772).

2. Run the following command in an SSH console.

   # esxcli intnet misc vmdqlb -e 0 -n vmnicX

Note: The above configuration to disable the VMDQ loopback feature does not persist across reboots. To make this setting persistent, add the command to rc.local by following the KB article Modifying the rc.local or local.sh file in ESX/ESXi to execute commands while booting (2043564).
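
A minimal sketch of what such a boot-time addition could look like is shown below. The helper name disable_vmdq_loopback and the APPLY variable are hypothetical and only serve the illustration. The esxcli command itself is the one given in step 2. By default the helper only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper for a boot script: apply the step 2 command to
# each uplink passed as an argument.
# With APPLY=1 the commands are executed; otherwise they are only printed.
disable_vmdq_loopback() {
  for nic in "$@"; do
    if [ "${APPLY:-0}" = "1" ]; then
      esxcli intnet misc vmdqlb -e 0 -n "$nic"
    else
      echo "esxcli intnet misc vmdqlb -e 0 -n $nic"
    fi
  done
}

# Example (uplink names are placeholders for the affected vmnics):
#   disable_vmdq_loopback vmnic0 vmnic1            # dry run, prints commands
#   APPLY=1 disable_vmdq_loopback vmnic0 vmnic1    # executes the commands
```

On ESXi the local.sh file is normally located at /etc/rc.local.d/local.sh, and additions belong before its final "exit 0" line. Follow the referenced KB article for the exact procedure.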
 
Note: For the Intel E810 NIC with the icen driver, the VMDQ loopback feature is only available in version 1.14 and later. Therefore, for this NIC, the only solution is to upgrade the driver to version 1.14 or higher.
 
 
To turn off the VMDQ loopback feature with the unified i40en VMware ESX Driver for Intel(R) Ethernet Controllers X710, XL710, XXV710, and X722 family (for a NIC on an ESXi 8.0 host):

 1. Update the Intel NIC firmware driver to 2.9.2.0

 2. Disable VMDQ loopback on each vmnicX:

   # esxcli intnet misc vmdqlb set -l 0 -n vmnicX

Note: The VMDQ loopback feature is disabled by default with i40en 2.9.2 or later and icen 1.14.2 or later. Refer to the release notes of the drivers for more details.
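
To check an installed driver version against these thresholds, a small comparison helper such as the sketch below may be useful. The function name version_ge is hypothetical, and the sketch assumes purely numeric, dot-separated version strings (any vendor or build suffix would have to be stripped first).

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) when dotted version $1
# is greater than or equal to dotted version $2.
# Missing fields count as 0, so 2.9.2 and 2.9.2.0 compare as equal.
version_ge() {
  a=$1
  b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}   # leading field of the first version
    y=${b%%.*}   # leading field of the second version
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    # Fields are equal: drop them and compare the next pair.
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

# Example with the thresholds from the note above:
#   version_ge "$installed_i40en_version" 2.9.2  && echo "loopback disabled by default"
#   version_ge "$installed_icen_version"  1.14.2 && echo "loopback disabled by default"
```

The installed driver name and version of an uplink are reported by "esxcli network nic get -n vmnicX". Extracting the version from that output is left out of the sketch.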

Note: The inbox driver does not provide an option to disable VMDQ loopback.

 

For Broadcom network cards:
This issue has been observed with driver version 229.0.146.0 and firmware 223.0.205.0 / pkg 22.31.13.70, but not with driver version 232.0.254.0 and its corresponding firmware. For more information on how to download and install the driver, please refer to the KB article: Download and install async drivers in VMware ESXi.  

Additional Information

If SR-IOV is used, VMDQ should not be disabled, as it is required for SR-IOV to operate as intended.

If bridging (including HCX extensions) and MAC learning are enabled in the environment, a similar issue may occur.  Please see the KB article Intermittent packet loss may occur when bridging is configured on NSX or using HCX Network Extension for details.