vDS Health Check reports unsupported VLANs for MTU and VLAN

Products

VMware vCenter Server
VMware vSphere ESXi

Issue/Introduction

When Network Health Check is enabled on a vSphere Distributed Switch (vDS), you may observe the following behavior:

Alarms are generated with messages about your VLAN and MTU configuration, for example:

vSphere Distributed Switch VLAN trunked status
vSphere Distributed Switch MTU supported status
vSphere Distributed Switch teaming matched status

The monitor page for the vDS Health Check reports that some VLANs are Supported and some are Not Supported for the Physical Network Adapter under either VLAN or MTU.
VLAN 0 shows as Not Supported

The load balancing policy on your vDS may be set to Route Based on IP Hash.

Environment

VMware vCenter Server
VMware vSphere ESXi

Cause

These alerts can trigger for a variety of reasons:

A vDS is connected to multiple uplinks with different VLANs permitted.

If the teaming/failover order is set on individual port groups to control which uplinks are used, and each port group has been limited to using VMNICs where that VLAN is permitted, this results in a working network configuration. However, Health Check does not distinguish between uplinks and tests each VLAN that is enabled for a vDS on each uplink. It subsequently reports a VLAN configured on the vDS as not supported for an uplink if it is not enabled on that particular network adapter. This affects the MTU, VLAN and Teaming & Failover test results.
The MTU configured for your vDS exceeds the MTU set on a switch or router upstream of your host.

The health check feature tests roundtrip connectivity which may transit multiple network devices. An MTU mismatch along the path can be detected even if the switch directly connected to your host has the correct MTU.
The same alarm can also occur if a port group is configured with the VLAN type set to None. In this case, VLAN 0 shows as not supported.
These alarms can trigger due to a combination of the Health Check protocol design and the Route Based on IP Hash load balancing algorithm. If the load balancing policy for the vDS switch port is configured as Route Based on IP Hash and EtherChannel is configured on the connected physical switch, the physical switch may, as a result of the load balancing algorithm, send the unicast frame to a different uplink of the host than the one on which the broadcast was sent. This is not a bug or a design flaw in the health check protocol, load balancing algorithm or switch, but reflects the intended behavior of EtherChannel.
You may receive alerts after restoring a distributed switch configuration, where the uplinks created initially are included in the vDS restore referencing previously allocated VLAN trunks that may no longer be in use.

If this is the case, the original vDS uplinks can be removed:

Resolution

This is the expected behavior with the Health Check feature in vSphere. The alarms inform you about configuration issues that you should be aware of.

To prevent the vSphere Distributed Switch Health Check from showing these reports, apply one of these options:

Disable health check on the vDS. For more information, see Enabling vSphere Distributed Switch health check in the vSphere Web Client (321305).
Use multiple vSphere Distributed Switches, so that every uplink connected to a particular vDS supports the VLANs of all port groups configured on it.
Allow every VLAN used by a port group on the vDS on every uplink, even if the Teaming/Failover order would prevent the port group from using that uplink.
Ensure that all network devices handling traffic originating from the host have an MTU that is equal to or greater than the MTU set on your vDS.
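
The VLAN and MTU options above can be checked on paper before changing anything. The following is a minimal Python sketch, not a VMware tool: it assumes you have already collected the VLAN IDs used by port groups on the vDS, the VLANs permitted on each uplink's physical switch port, and the MTU of each device on the path. The function names, uplink names, device names and values are all hypothetical and chosen only for illustration.

```python
# Illustrative model only: all names and values below are assumptions,
# not data read from vCenter or a physical switch.

def unsupported_vlans(vds_vlans, uplink_allowed):
    """Map each uplink to the vDS VLANs that its switch port does not permit.

    Health Check tests every VLAN on the vDS against every uplink, so any
    VLAN listed here would be expected to show as Not Supported.
    """
    report = {}
    for uplink, allowed in uplink_allowed.items():
        missing = sorted(set(vds_vlans) - set(allowed))
        if missing:
            report[uplink] = missing
    return report


def mtu_bottlenecks(vds_mtu, path_mtus):
    """Return the devices whose MTU is smaller than the MTU set on the vDS."""
    return [device for device, mtu in path_mtus.items() if mtu < vds_mtu]


# Hypothetical inventory.
vds_vlans = [10, 20, 30]
uplink_allowed = {
    "vmnic0": [10, 20, 30],
    "vmnic1": [10, 20],  # VLAN 30 is not trunked on this uplink
}
path_mtus = {"tor-switch": 9000, "core-router": 1500}

vlan_report = unsupported_vlans(vds_vlans, uplink_allowed)
mtu_report = mtu_bottlenecks(9000, path_mtus)
print(vlan_report)
print(mtu_report)
```

An empty result from both functions corresponds to a configuration in which every uplink permits every VLAN on the vDS and no device on the path has a smaller MTU than the vDS.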

To prevent alerts from occurring where a port group with no VLAN assigned (VLAN is set to None) is in use, apply one of these options:

Assign the port group to a VLAN that is configured on the physical network for communication between all ESXi hosts connected to the vDS.
Move the port group to a vSphere Standard Switch (vSS) instead of a vDS. A standard switch is not affected by the Health Check.

Additional Information

Limitations of Network Health Check

The distributed switch network health check for vSphere does not diagnose end-to-end, full-path problems. Using an echo-type L2 protocol, the health check only checks the health status of the ports to which the distributed switch connects. The check therefore reports good health status only if two or more correctly configured peers (uplinks) are present in the same L2 network.
The physical switch VLAN does not recognize the virtual networking in ESXi. If the physical switch is misconfigured, ESXi does not report warnings, resulting in networking failures until the health check feature is enabled and the new round check completes.
The distributed switch network MTU health check is designed to probe the runtime true Jumbo Frame capability of ports to which the distributed switch connects. However, the maximum VLAN MTU size determines the physical switch trunk port MTU size setting in all trunk VLANs for the port. The MTU health check feature "Supported/Not supported" status result displays whether or not the access port supports the distributed switch MTU setting. The "VLAN Trunk" status result field displays all the distributed port groups VLAN setting range in that physical switch trunk port.
The distributed switch network health check, including the VLAN, MTU, and teaming policy checks, may not function properly when there are hardware virtual NICs on the server platform.
In vSphere, the teaming health check does not work for LAG ports as the LACP protocol itself is capable of ensuring the health of the individual LAG ports. However, VLAN and MTU health check can still check LAG ports.
Ensure that all port groups in the virtual distributed switch with different VLANs have the same MTU in the physical switch, because ESXi does not detect MTU mismatches along the full path, and Jumbo Frame packets might be forwarded to other physical switch ports that are outside the virtual distributed switch. At those ports, there is a risk that the Jumbo Frame packets might be dropped if that port and VLAN do not enable Jumbo Frames.

Note: Depending on the options that are selected, the vSphere Distributed Switch Health Check can generate a significant number of MAC addresses for testing teaming policy, MTU size, and VLAN configuration, resulting in extra network traffic. The distributed switch network health check generates one MAC address for each uplink on a distributed switch for each VLAN, multiplied by the number of hosts in the distributed switch, to be added to the upstream physical switch MAC table. For example, for a vDS having 2 uplinks, with 35 VLANs across 60 hosts, the calculation is 2 * 35 * 60 = 4200 MAC table entries on the upstream physical switch. Ensure that the number of MAC addresses generated by the health check is less than the maximum MAC table size of the physical switch(es). Otherwise, there is a risk that the switches run out of memory, with subsequent network connectivity failures.
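
The calculation in the note above can be reproduced with a short Python sketch. The function names are hypothetical, and the MAC table capacity is an assumed placeholder; take the real limit from your physical switch documentation.

```python
def health_check_mac_entries(uplinks, vlans, hosts):
    """MAC table entries added by Health Check: uplinks * VLANs * hosts."""
    return uplinks * vlans * hosts


def fits_mac_table(entries, mac_table_capacity):
    """True if the generated entries stay below the switch's MAC table size."""
    return entries < mac_table_capacity


# Values from the example in the note: 2 uplinks, 35 VLANs, 60 hosts.
entries = health_check_mac_entries(uplinks=2, vlans=35, hosts=60)
print(entries)  # 4200

# 8192 is an assumed capacity for illustration only.
print(fits_mac_table(entries, mac_table_capacity=8192))
```

Because the MAC table also holds the addresses of regular workloads, treat this comparison as a lower bound on the headroom you need rather than a complete sizing exercise.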

After disabling the vSphere Distributed Switch Health Check, the generated MAC addresses age out of the physical network environment according to the network policy.

Impact/Risks:
There is no data path impact. However, the Health Check does not fully monitor the status of the vDS when one or more of the above configurations are used.