Multiple Alerts in Skyline Health Check Due to ESXi Host Network Partition

Article ID: 414083


Updated On:

Products

VMware vSAN

Issue/Introduction

Symptoms:

When an ESXi host enters a partitioned state, vSAN loses connectivity between nodes, triggering multiple alerts in Skyline Health. These alerts indicate that the cluster is experiencing communication issues that may affect object availability and VM accessibility.

Common alerts observed in Skyline Health include:

  • vSAN cluster partition — indicates that one or more hosts are isolated from the cluster

  • vSAN: Basic (unicast) connectivity — reports failure in unicast traffic between hosts

  • vSAN: MTU check (ping with large packet) — flags mismatched or insufficient MTU settings across the vSAN network

Additionally, vSAN object health may report errors such as:

  • Reduced availability with no rebuild

  • Inaccessible

 



  • The esxcli vsan cluster get command shows that the cluster is partitioned, and the affected node cannot ping the other data nodes over the vSAN VMkernel interface.
    NODE# esxcli vsan cluster get
    Cluster Information
       Enabled: true
       Current Local Time: 2019-09-03T07:02:40Z
       Local Node UUID: ########-####-####-####-########7c0e
       Local Node Type: NORMAL
       Local Node State: MASTER
       Local Node Health State: HEALTHY
       Sub-Cluster Master UUID: ########-####-####-####-########7c0e
       Sub-Cluster Backup UUID:
       Sub-Cluster UUID: ########-####-####-####-########2f67
       Sub-Cluster Membership Entry Revision: 0
       Sub-Cluster Member Count: 1
       Sub-Cluster Member UUIDs: ########-####-####-####-########7c0e
       Sub-Cluster Member HostNames: NODE2
       Sub-Cluster Membership UUID: ########-####-####-####-########7c0e
       Unicast Mode Enabled: true
       Maintenance Mode State: OFF
       Config Generation: ########-####-####-####-########9e01 12 2019-08-19T09:12:12.1

    NODE# vmkping -I vmk2 192.168.x.xxx -c 100 -i 0.005
    PING 192.168.x.xxx (192.168.x.xxx): 56 data bytes

    --- 192.168.x.xxx ping statistics ---
    100 packets transmitted, 0 packets received, 100% packet loss
     
  • When multiple hosts in a vSAN cluster fail or become unavailable, virtual machines may become inaccessible or unreachable, depending on the extent of the impact. This is typically due to the loss of quorum or the inability to access required components of the VM's storage objects.

    As a result, the health status of affected vSAN objects may transition from Healthy to one of the following states:

    • Reduced availability with no rebuild: Indicates that the object has lost redundancy and cannot initiate a rebuild due to insufficient resources or host availability.

    • Inaccessible: Signifies that the object is completely unavailable, and the VM depending on it may be powered off or unresponsive.

    These health states reflect a failure condition where vSAN is unable to reconstruct or repair the impacted components, potentially affecting VM availability.

    vSAN object health can be validated from the ESXi CLI of any host in the cluster using the following command:

    # localcli vsan debug object health summary get

Note: This command provides a summary of object health, including availability status and failure details, which can help in diagnosing and resolving cluster-level issues.
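
The partition symptom in the esxcli vsan cluster get output above (a Sub-Cluster Member Count lower than the number of hosts in the cluster) can be checked with a small script. The following is a minimal sketch, not a supported tool: the function names vsan_member_count and check_partition are hypothetical, and the expected host count must be supplied by the administrator.

```shell
# Sketch: flag a suspected vSAN partition by comparing the member count
# reported by `esxcli vsan cluster get` with the expected cluster size.

# Print the "Sub-Cluster Member Count" value from cluster-get output on stdin.
vsan_member_count() {
    awk -F': *' '/Sub-Cluster Member Count/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# check_partition EXPECTED_HOSTS   (reads cluster-get output on stdin)
# Returns 0 when the reported count matches, 1 otherwise.
check_partition() {
    expected="$1"
    actual="$(vsan_member_count)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $actual of $expected hosts in sub-cluster"
        return 0
    fi
    echo "PARTITION SUSPECTED: ${actual:-unknown} of $expected hosts in sub-cluster"
    return 1
}

# Usage on an ESXi host (4 is an assumed cluster size):
#   esxcli vsan cluster get | check_partition 4
```

Running the check on every host helps show which hosts have ended up in which sub-cluster, since each side of a partition reports only its own members.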

 

 

Environment

VMware vSAN 6.x
VMware vSAN 7.x
VMware vSAN 8.x

Cause

A network partition on one or more ESXi hosts can disrupt vSAN communication, leading to a loss of quorum. When quorum is lost, vSAN cannot access or rebuild affected objects, causing virtual machines and data to become inaccessible. This triggers multiple alerts in vSAN Skyline Health, such as “Cluster Partition” or “Reduced Availability with No Rebuild”.
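
To make the quorum condition concrete: a vSAN object remains accessible only on the side of a partition that can reach more than 50% of the object's votes. The sketch below is illustrative arithmetic only, not a vSAN command or API; the function name has_quorum and the vote counts are hypothetical.

```shell
# has_quorum REACHABLE_VOTES TOTAL_VOTES
# Returns 0 when strictly more than half of the votes are reachable.
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

# Assumed example: an object with 3 votes (two replicas plus a witness),
# one vote per host. An isolated host reaches 1 of 3 votes; the
# remaining hosts reach 2 of 3.
if has_quorum 1 3; then echo "isolated host: accessible"; else echo "isolated host: inaccessible"; fi
if has_quorum 2 3; then echo "majority side: accessible"; else echo "majority side: inaccessible"; fi
```

In this example the isolated host cannot serve the object, while the majority side can. Exactly half of the votes is not enough, which is why an even split leaves the object inaccessible on both sides.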

Resolution

Verify the physical network connectivity of the affected hosts, including switch port configurations, VLANs, NIC teaming settings, physical NICs, and MTU.
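
As a starting point for the MTU part of this check, the sketch below derives the largest ICMP payload that fits a given MTU (the MTU minus 28 bytes of IPv4 and ICMP headers) and pings a peer with fragmentation disabled. The interface vmk2 follows the example in the Symptoms section; the peer address is a placeholder, and the function names payload_for_mtu and test_vsan_mtu are hypothetical.

```shell
# payload_for_mtu MTU
# Largest ICMP payload for an IPv4 packet of the given MTU:
# MTU minus 20 bytes of IP header and 8 bytes of ICMP header.
payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# test_vsan_mtu VMK PEER_IP MTU
# Ping the peer with the don't-fragment bit set (-d) and a full-size payload (-s).
test_vsan_mtu() {
    vmkping -I "$1" -d -s "$(payload_for_mtu "$3")" -c 3 "$2"
}

# Usage on an ESXi host (replace <peer-vsan-ip> with a real vSAN peer address):
#   esxcli vsan network list            # which vmk interface carries vSAN traffic
#   esxcli network ip interface list    # configured MTU per vmk interface
#   test_vsan_mtu vmk2 <peer-vsan-ip> 9000
```

If the full-size ping fails while a default-size vmkping succeeds, an MTU mismatch somewhere along the path (VMkernel interface, virtual switch, or physical switch port) is a likely cause. If both fail, as in the Symptoms example, check cabling, VLANs, and NIC teaming first.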