vSAN Skyline Health reports errors: vSAN: MTU check (ping with large packet size)

Article ID: 391812

Updated On:

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • vSAN Skyline Health may report the following errors or symptoms:

    a) vSAN: MTU check (ping with large packet size)

    b) Cluster partition

    c) Object health

    d) Performance issues on the vSAN datastore due to the MTU mismatch

    e) Unable to browse the datastore or register VMs

Validation Step:

  • To identify the faulty VMkernel adapter (VMK) and host, click the Troubleshoot option for the vSAN: MTU check (ping with large packet size) error.

  • In this example, the health check details show that Host-01 cannot communicate with the other two hosts in the cluster through vmk1 at an MTU of 9000.

Environment

VMware vSAN 7.x
VMware vSAN 8.x

Cause

The hosts are unable to communicate with each other over vSAN traffic due to an MTU mismatch.

Cause validation:

Run the command "esxcli vsan network list" to identify the VMK used for vSAN traffic.
    esxcli vsan network list
    Interface
       VmkNic Name: vmk1
       IP Protocol: IP
       Interface UUID: yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyy
       Agent Group Multicast Address: xxx.x.x.x
       Agent Group IPv6 Multicast Address: xxxx::x:x:x
       Agent Group Multicast Port: zzzzz
       Master Group Multicast Address: xxx.x.x.x
       Master Group IPv6 Multicast Address: xxxx::x:x:x
       Master Group Multicast Port: zzzzz
       Host Unicast Channel Bound Port: zzzzz
       Data-in-Transit Encryption Key Exchange Port: 0
       Multicast TTL: 5
       Traffic Type: vsan

In the above example, it is confirmed that vmk1 is used for vSAN traffic.
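
When this check is repeated on several hosts, the vSAN VMkernel adapter name can be extracted from the command output instead of being read by eye. The following is a minimal sketch, assuming the output keeps the "VmkNic Name: <name>" layout shown above; the function name vsan_vmk_names is hypothetical and not part of any VMware tool.

```shell
# Print the name of every VMkernel adapter tagged for vSAN traffic.
# Reads "esxcli vsan network list" output on stdin and assumes each
# adapter appears on a line of the form "VmkNic Name: vmkN".
vsan_vmk_names() {
    awk -F': *' '/VmkNic Name/ { print $2 }'
}

# Example with captured output; on an ESXi host the equivalent would be:
#   esxcli vsan network list | vsan_vmk_names
printf 'Interface\n   VmkNic Name: vmk1\n   IP Protocol: IP\n' | vsan_vmk_names
```

With the sample input above, the function prints vmk1.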

  • Run the command "esxcfg-vswitch -l" to identify the vSwitch used for vSAN traffic and check the MTU configured on it.
    esxcfg-vswitch -l

    DVS Name         Num Ports   Used Ports   Configured Ports   MTU
    Switch name      2520        10           512                9000

    DVPort ID        In Use      Client
    512              1           vmnic1
    513              1           vmnic0
    514              0
    515              0
    0                1           vmk0
    128              1           vmk1
    256              1           vmk2


    In the above example, the vSAN VMkernel adapter (vmk1) is attached to this vSphere Distributed Switch (vDS), whose uplinks are vmnic1 and vmnic0. This confirms that vmnic1 and vmnic0 carry the vSAN traffic and that the switch is configured with an MTU of 9000.

  • Run the command "esxcfg-vmknic -l" to verify the MTU set on the VMkernel adapter (vmk).
    esxcfg-vmknic -l
    vmk1    128    IPv4                9000    65535    true    STATIC    DefaultTCPIPStack

    In the above example, it is confirmed that vmk1 is configured with MTU 9000. 
  • Run the command "esxcfg-nics -l" to confirm the MTU configured on the physical NICs (vmnics).
    esxcfg-nics -l
    Name     PCI            Driver   Link   Speed       Duplex   MAC Address         MTU    Description
    vmnic0   xxxx:xx:xx.x   vmxnet   Up     10000Mbps   Full     xx:xx:xx:xx:xx:xx   9000
    vmnic1   xxxx:xx:xx.x   vmxnet   Up     10000Mbps   Full     xx:xx:xx:xx:xx:xx   9000

    In the above example, it is confirmed that vmnics are configured with MTU 9000.

    Note: Repeat the above procedure on every host in the cluster and make sure the MTU is consistent across the entire network path, including the VMkernel interfaces, the vmnics, and the physical switch ports. MTU mismatches can also occur within vSphere itself, between a VMkernel adapter and its vmnics.
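
Once the MTU values have been collected from every host, the comparison itself is mechanical. The sketch below assumes the values were written down by hand, one per line, as "host component mtu"; that input format and the function name check_mtu_consistency are hypothetical and chosen only for illustration.

```shell
# Report every component whose MTU differs from the expected value.
# stdin: lines of "host component mtu" (a hypothetical, hand-collected format)
# $1:    the MTU the whole vSAN network path is supposed to use
# Returns non-zero if at least one mismatch was found.
check_mtu_consistency() {
    awk -v want="$1" '
        NF >= 3 && $3 != want {
            printf "MISMATCH %s %s mtu=%s (expected %s)\n", $1, $2, $3, want
            bad = 1
        }
        END { exit bad + 0 }
    '
}

# Illustrative values only, not taken from a real cluster.
check_mtu_consistency 9000 <<'EOF' || echo "MTU is not consistent"
host-01 vmk1 9000
host-01 vmnic0 9000
host-02 vmk1 1500
EOF
```

Values for the physical switch ports can be added to the same list, since the function only compares the third field against the expected MTU.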

  • A ping from a working host to the faulty host using a 9000-byte MTU fails:
    vmkping -I vmkx -d -s 8972 <IP address of faulty node>
    PING xx.xx.xxx.xx (xx.xx.xxx.xx): 8972 data bytes

    ---  xx.xx.xxx.xx ping statistics ---
    3 packets transmitted, 0 packets received, 100% packet loss

  • A ping from a working host to the faulty host using a 1500-byte MTU succeeds:
    vmkping -I vmkx -d -s 1472 <IP address of faulty node>

    PING xx.xx.xxx.xx (xx.xx.xxx.xx): 1472 data bytes
    1480 bytes from xx.xx.xxx.xx: icmp_seq=0 ttl=64 time=0.118 ms
    1480 bytes from xx.xx.xxx.xx: icmp_seq=1 ttl=64 time=0.116 ms
    1480 bytes from xx.xx.xxx.xx: icmp_seq=2 ttl=64 time=0.106 ms

    --- xx.xx.xxx.xx ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max = 0.106/0.113/0.118 ms

    These results show that vSAN communication between the healthy and faulty hosts fails at MTU 9000 but works at MTU 1500, which points to an MTU mismatch somewhere in the network path.

    In this example, the VMkernel adapter on the host has an MTU of 9000, but the physical switch enforces an MTU of 1500.
    Because the -d option tells the source not to fragment the packet, the physical switch drops the oversized packet instead of forwarding it.
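
The payload sizes 8972 and 1472 used above follow from the MTU: vmkping -s sets the ICMP payload size, and over IPv4 each packet also carries a 20-byte IP header (without options) and an 8-byte ICMP header, so the largest payload that fits in one unfragmented packet is the MTU minus 28. A small sketch of that arithmetic follows; the function names are hypothetical, the address 192.0.2.10 is a documentation placeholder, and vmkping_cmd only prints the command line for review rather than running it.

```shell
# Largest ICMP payload that fits in a single IPv4 packet of the given MTU:
# MTU - 20 (IPv4 header, no options) - 8 (ICMP header).
payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Print the vmkping command that tests a given MTU without fragmentation.
# $1: VMkernel interface, $2: target IP address, $3: MTU under test
vmkping_cmd() {
    echo "vmkping -I $1 -d -s $(payload_for_mtu "$3") $2"
}

payload_for_mtu 9000    # prints 8972
payload_for_mtu 1500    # prints 1472
vmkping_cmd vmk1 192.0.2.10 9000
```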

Resolution

  • If there is an MTU mismatch within vSphere between the VMkernel adapter and the vmnics, change the MTU so that both use the same value.

If the correct MTU for the environment is 1500 and the VMK is set to 9000, change the VMK to 1500 to allow the cluster to form.

To configure Jumbo Frames on a vDS in vSphere Web Client:

      1. Browse to a distributed switch in the vSphere Web Client navigator.
      2. Click the Actions tab, and click Settings > Edit Settings.
      3. Click Advanced and set the MTU property to a value greater than 1500 bytes.
        • You cannot set the MTU size to a value greater than 9000 bytes.
        • When changing the MTU size in a vDS, the attached uplinks (physical NICs) are brought down and up again. This causes a short network outage for virtual machines and/or services that are using the uplinks.
      4. Click OK.

 

To configure Jumbo Frames on a vSS in vSphere Web Client:

   1. In the vSphere Web Client, navigate to the host.
   2. On the Configure tab, click Virtual Switches.
   3. Navigate to the virtual switch, then click Edit.
   4. Set the MTU value to 9000.

 

To enable Jumbo Frames on a VMkernel port using the vSphere Web Client in vCenter Server:

   1. In the vSphere Web Client, navigate to the host.
   2. On the Configure tab, click VMkernel Adapters.
   3. Click Edit.
   4. Set the MTU value to 9000.

 

    Note: You can increase the MTU size up to 9000 bytes.
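
The standard vSwitch and VMkernel changes above can also be made from the ESXi shell with esxcli. The sketch below only prints the commands so that they can be reviewed before being run on a host; the helper name mtu_cmd and the names vSwitch0 and vmk1 are placeholders, not values from this environment. A distributed switch is managed from vCenter Server, so it is not covered here.

```shell
# Print the esxcli command that sets the MTU on a standard vSwitch ("vss")
# or on a VMkernel adapter ("vmk"). Nothing is executed.
# $1: vss | vmk, $2: switch or adapter name, $3: MTU in bytes
mtu_cmd() {
    # The article states that the MTU cannot exceed 9000 bytes.
    if [ "$3" -gt 9000 ]; then
        echo "MTU $3 is above the 9000-byte maximum" >&2
        return 1
    fi
    case "$1" in
        vss) echo "esxcli network vswitch standard set -v $2 -m $3" ;;
        vmk) echo "esxcli network ip interface set -i $2 -m $3" ;;
        *)   echo "unknown target type: $1" >&2; return 1 ;;
    esac
}

mtu_cmd vss vSwitch0 9000
mtu_cmd vmk vmk1 9000
```

After a change is applied, rerunning "esxcfg-vswitch -l" and "esxcfg-vmknic -l" from the Cause section confirms the new value.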

Refer to the relevant KB article for setting the MTU on VMkernel adapters and vmnics within vSphere.

  • If the MTU settings within vSphere are correct, engage the switch or network vendor to verify the MTU configuration on all external network components and ensure they are set to the correct value.