Troubleshooting VLAN Connectivity Issues

Article ID: 375097

Updated On: 01-06-2025

Products

VMware vSphere ESXi

Issue/Introduction

Pings or packet captures between VLANs show packet loss, even though both the physical switch and the ESXi hosts report the physical NICs (vmnics) as up.

Common examples:

An ESXi management vmkernel adapter (vmk) is unable to communicate over a specific VLAN.

  • e.g. vMotion is failing between ESXi hosts

VM connectivity is not working between VLANs, even though the VM network adapter (vNIC) shows no indication that it is disconnected. 

  • e.g. VM_A, which uses VLAN 245, cannot ping VM_B, which uses VLAN 500

Cause

There can be a few different causes for VLAN connectivity issues:

  • Incorrect configurations of the virtual portgroups and physical switch ports can cause connectivity issues between VLANs.  
  • Data path issues (e.g. the physical network uses multiple physical switches or a fabric interconnect between the virtual and physical network)
  • Inconsistent VLAN ID assignments across the physical and virtual network

Please note the list above does not account for all possibilities.

Resolution

Configuration Verification

In order for VLAN traffic to flow smoothly between the virtual network and the physical network, the configurations between the two need to be compatible. The configuration for the virtual network will depend on how the physical switch ports are set up. 

NOTE: Certain environment configurations may require a more in-depth setup (NSX, Virtual Guest Tagging, etc.). For more detailed information, refer to "VLAN configuration on virtual switches, physical switches, and virtual machines" and "Create an NSX Segment".

Common Examples: 

  • Physical switch ports are configured to be trunked for VLANs --> The ESXi virtual portgroups (on either a vSphere Distributed Switch (vDS) or a vSphere Standard Switch (vSS)) have the VLAN added to them.
    • In this instance, the ESXi host will send the packets from these portgroups out as tagged with the VLAN configured.

      • Example: The traffic going out of VM_A and being sent out to the physical network will have the VLAN tag 220 applied.


  • Physical switch ports are configured to be access or native for VLANs --> The ESXi virtual portgroups (on either a vSphere Distributed Switch (vDS) or a vSphere Standard Switch (vSS)) do not have the VLAN added to them.
    • In this instance, the physical switch performs the tagging for packets going to and from the ESXi host.
      • Example: Traffic leaving VM_A for the physical network has no VLAN tag until it reaches the physical switch, where the VLAN tag is applied.

If these packets are being dropped, there may be a compatibility issue between the physical switch port configuration and the virtual portgroup configuration. Below are some common scenarios:

  •  Physical switch ports are configured to be trunked for VLANs --> The ESXi portgroups do not have the VLAN tag and therefore the ESXi host is not sending out the packets with the expected VLAN tag --> the physical switch will drop these packets because it is expecting the packets to be received with the VLAN tag applied.

  • Physical switch ports are configured to be access or native for VLANs --> The ESXi virtual portgroups do have the VLAN added to them and therefore the ESXi host is sending out the packets with the VLAN tag applied --> the physical switch will drop these packets because it is not expecting to receive packets that already have the VLAN tag applied.
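
The pairing rules above can be sketched as a small shell check. This is only an illustration: `check_vlan_pairing` is a hypothetical helper, not an ESXi command, and it assumes the physical port mode is already known from the network team. It also assumes the usual portgroup convention that VLAN ID 0 means no tagging by the host and VLAN ID 4095 on a standard switch means Virtual Guest Tagging, which this sketch skips.

```shell
#!/bin/sh
# check_vlan_pairing is a hypothetical helper, not an ESXi tool.
#   $1 = physical switch port mode: trunk | access | native
#   $2 = VLAN ID set on the ESXi portgroup (0 = no VLAN, frames leave untagged)
check_vlan_pairing() {
    mode=$1
    pg_vlan=$2

    # VLAN 4095 on a standard switch portgroup means Virtual Guest Tagging,
    # which this sketch does not evaluate.
    if [ "$pg_vlan" -eq 4095 ]; then
        echo "skipped: VLAN 4095 (Virtual Guest Tagging) is outside this check"
        return 0
    fi

    case "$mode" in
        trunk)
            if [ "$pg_vlan" -ge 1 ] && [ "$pg_vlan" -le 4094 ]; then
                echo "ok: host tags frames with VLAN $pg_vlan and the trunk port expects tagged frames"
            else
                echo "mismatch: trunk port expects tagged frames but the portgroup sends untagged frames"
            fi
            ;;
        access|native)
            if [ "$pg_vlan" -eq 0 ]; then
                echo "ok: host sends untagged frames and the physical switch applies the VLAN"
            else
                echo "mismatch: portgroup tags frames with VLAN $pg_vlan but the port expects untagged frames"
            fi
            ;;
        *)
            echo "unknown port mode: $mode" >&2
            return 2
            ;;
    esac
}
```

For example, `check_vlan_pairing trunk 220` reports the pairing as ok, while `check_vlan_pairing access 220` reports a mismatch. On a standard switch, the VLAN ID of each portgroup can be read with `esxcli network vswitch standard portgroup list`.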

If the configuration is deemed to be the cause of the disconnect, please implement the supported configuration and ensure testing shows successful connectivity.

If the VLAN tagging configurations are correct and should be compatible between the physical and virtual network, proceed with the troubleshooting below:

Viewing VLAN stats:

To check whether any VLAN traffic has passed through a specific vmnic, follow the process below:

  1. Determine which vmnic is currently being used for communication either by the vmkernel or the VM.
    1. Open an SSH session (for example, with PuTTY) to the ESXi host that houses the VM and log in with the root credentials.
    2. Start esxtop using the command:
      • esxtop
    3. Change to the networking view by pressing 'n'
    4. From this screen, identify the vmnic (shown in the TEAM-PNIC column) that is currently associated with either the VM (blue box in the example screenshot) or the vmkernel adapter (green box in the example screenshot) that is experiencing the issue.
    5. After the vmnic is noted, press 'q' to quit esxtop.
  2. Review the VLAN traffic on the vmnic 
    1. To see which VLANs have passed through the vmnic, the stats need to be enabled. Run the command:

      esxcli network nic vlan stats set -e true -n vmnic#
       
      1. Note: The VLAN stats are designed for use as a troubleshooting tool and should not be left running permanently. Disable the stats once troubleshooting is complete by running the command in step 4.
    2. Wait about 10 seconds and then run the following command:

      esxcli network nic vlan stats get -n vmnic#

  3. The output of the command shows any VLAN-specific traffic that has used that vmnic since the stats were enabled. This covers all VLAN traffic on the vmnic: inbound and outbound, and packets tagged by either the ESXi host or the physical switch. The output looks similar to the example below:

    [root@host_name_here:~] esxcli network nic vlan stats get -n vmnic0
    VLAN 223
           Packets received: 12759
           Packets sent: 708
    [root@host_name_here:~]

    • It is important to note there may be multiple VLANs listed after running the command above and this is expected if multiple VLANs have used the specified vmnic after enabling the VLAN stats. The results of the command provide a general idea of which VLANs are able to use the vmnic.
      • Example: If the ESXi host sends out a packet that is tagged for VLAN 223, but no packets were received with the VLAN tag 223 the output would reflect this:
        VLAN 223
              Packets received: 0
              Packets sent: 1
  4. After reviewing the VLAN traffic seen on the vmnic, disable the stats by running the command:

    esxcli network nic vlan stats set -e false -n vmnic#

  5. All VLANs should show received traffic from upstream if the connectivity and configurations are working as expected. If no received traffic is reported for a VLAN, the upstream switch port is most likely not configured to allow that particular VLAN.
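
Steps 2 through 5 can be combined into a short script for the ESXi shell. The sketch below is an illustration only: the function names `vlans_without_rx` and `check_vmnic_vlans` are hypothetical, the parsing assumes the output layout shown in the example above, and the 10-second default wait simply mirrors the procedure. It uses only the `esxcli network nic vlan stats` commands already listed in this article and always disables the stats again before returning.

```shell
#!/bin/sh
# Print the VLAN IDs whose "Packets received" counter is 0.
# Reads the output of `esxcli network nic vlan stats get` on stdin.
vlans_without_rx() {
    awk '
        $1 == "VLAN" { vlan = $2 }                        # remember the current VLAN ID
        /Packets received:/ && $NF == 0 { print vlan }    # no frames seen from upstream
    '
}

# Enable VLAN stats on a vmnic, wait, read them, then disable them again.
#   $1 = vmnic name (for example vmnic0)
#   $2 = seconds to wait before reading the stats (default 10)
check_vmnic_vlans() {
    nic=$1
    wait_s=${2:-10}

    esxcli network nic vlan stats set -e true -n "$nic" || return 1
    sleep "$wait_s"
    stats=$(esxcli network nic vlan stats get -n "$nic")
    # Disable the stats before doing anything else so they are not left running.
    esxcli network nic vlan stats set -e false -n "$nic"

    echo "$stats"
    silent=$(echo "$stats" | vlans_without_rx)
    if [ -n "$silent" ]; then
        echo "VLANs with no received packets on $nic: $(echo $silent)"
    fi
}
```

Running `check_vmnic_vlans vmnic0` prints the raw stats followed by a summary line for any VLAN that sent packets but received none. A VLAN that never appears in the output at all saw no traffic in either direction during the wait, so generate test traffic (such as a ping) while the stats are enabled.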

 

Additional Information

If packet captures are necessary to analyze the packets, please refer to "Packet capture on ESXi using the pktcap-uw tool".
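
As a starting point, a VLAN-filtered capture on the uplink identified earlier might be assembled as below. This is a sketch under assumptions: `build_vlan_capture_cmd` is a hypothetical helper, the output path under /tmp and the default packet count of 100 are arbitrary choices, and the `--dir` values (0 for received, 1 for transmitted) should be confirmed against the pktcap-uw help for the ESXi version in use.

```shell
#!/bin/sh
# build_vlan_capture_cmd is a hypothetical helper that only prints a
# pktcap-uw command line; it does not start the capture itself.
#   $1 = vmnic name (for example vmnic0)
#   $2 = VLAN ID to filter on
#   $3 = number of packets to capture (default 100)
#   $4 = direction: 0 = received, 1 = transmitted (default 0)
build_vlan_capture_cmd() {
    nic=$1
    vlan=$2
    count=${3:-100}
    dir=${4:-0}
    echo "pktcap-uw --uplink $nic --dir $dir --vlan $vlan -c $count -o /tmp/${nic}_vlan${vlan}.pcap"
}
```

Calling `build_vlan_capture_cmd vmnic0 223` prints a command that can be reviewed and then run on the host. Capturing the received direction for a VLAN that shows no received packets in the VLAN stats helps confirm whether tagged frames for that VLAN arrive from the physical switch at all.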