All VMs in the vSAN cluster become inaccessible during a network maintenance activity

Article ID: 417289

Updated On:

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • All virtual machines running in the vSAN cluster become inaccessible during the network maintenance activity.
  • Once the activity completes, the VMs are automatically restarted by vSphere HA.
  • vSAN Skyline Health shows a 'Hosts with LACP issues' alert.

Environment

VMware vSAN 7.x

VMware vSAN 8.x

Cause

This is expected behaviour when a fault on the uplink vmnics results in a network partition. If one of the vmnics configured for vSAN traffic is already down and the other vmnic also goes down during the maintenance activity, the hosts lose vSAN connectivity, the cluster becomes partitioned, and the VMs become inaccessible.

The following command can be used to check the status of the vmnics:

esxcli network nic list

Name    PCI Device    Driver   Admin Status  Link Status  Speed  Duplex  
------  ------------  -------  ------------  -----------  -----  ------ 
vmnic0  0000:5d:00.0  ntg3     Up            Down             0  Half    
vmnic1  0000:5d:00.1  ntg3     Up            Down             0  Half   
vmnic2  0000:86:00.0  qedentv  Up            Up           25000  Full   
vmnic3  0000:86:00.1  qedentv  Up            Down             0  Half   
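
To review this output across several hosts, the check can be scripted. The following is a minimal sketch, not a VMware-provided tool: it assumes the command output has been captured as text and that the columns appear in the order shown above. The function names `parse_nic_list` and `links_down` are illustrative only.

```python
# Sketch: flag vmnics whose link is down in captured `esxcli network nic list` output.
# Assumption: columns are Name, PCI Device, Driver, Admin Status, Link Status, Speed, ...

SAMPLE = """\
Name    PCI Device    Driver   Admin Status  Link Status  Speed  Duplex
------  ------------  -------  ------------  -----------  -----  ------
vmnic0  0000:5d:00.0  ntg3     Up            Down             0  Half
vmnic1  0000:5d:00.1  ntg3     Up            Down             0  Half
vmnic2  0000:86:00.0  qedentv  Up            Up           25000  Full
vmnic3  0000:86:00.1  qedentv  Up            Down             0  Half
"""


def parse_nic_list(output):
    """Return one dict per vmnic row found in the captured text."""
    nics = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header, the separator row, and blank lines.
        if len(fields) < 6 or not fields[0].startswith("vmnic"):
            continue
        nics.append({
            "name": fields[0],
            "driver": fields[2],
            "admin_status": fields[3],
            "link_status": fields[4],
            "speed_mbps": int(fields[5]),
        })
    return nics


def links_down(nics):
    """Return the names of vmnics whose Link Status is not 'Up'."""
    return [nic["name"] for nic in nics if nic["link_status"] != "Up"]


print(links_down(parse_nic_list(SAMPLE)))
```

For the sample output above, only vmnic2 has an active link; vmnic0, vmnic1 and vmnic3 are reported as down.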

Resolution

Work with the hardware/network team to identify why the vmnics are down and fix the underlying issue.

Once the issue is fixed and network redundancy is restored, the VMs will not become inaccessible during later network maintenance tasks, because the remaining uplink continues to carry vSAN traffic.
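
Before scheduling further maintenance, it can help to confirm that redundancy is actually in place. The sketch below illustrates the reasoning only: it assumes the link status of each vmnic is already known, and the choice of vmnic2 and vmnic3 as the vSAN uplinks is hypothetical, since the uplinks in use depend on the host's vSAN VMkernel configuration.

```python
# Sketch: vSAN traffic survives the loss of one uplink only if at least two are up.
# Assumption: `link_status` maps vmnic name -> "Up"/"Down", and `vsan_uplinks`
# lists the vmnics backing the vSAN VMkernel port (hypothetical values below).

def has_uplink_redundancy(link_status, vsan_uplinks):
    """Return True if two or more of the vSAN uplinks currently have link."""
    active = [nic for nic in vsan_uplinks if link_status.get(nic) == "Up"]
    return len(active) >= 2


vsan_uplinks = ["vmnic2", "vmnic3"]  # hypothetical assignment

before_fix = {"vmnic2": "Up", "vmnic3": "Down"}
after_fix = {"vmnic2": "Up", "vmnic3": "Up"}

print(has_uplink_redundancy(before_fix, vsan_uplinks))  # one uplink left: no redundancy
print(has_uplink_redundancy(after_fix, vsan_uplinks))   # both uplinks up: redundant
```

With a single active uplink, maintenance that touches that uplink partitions the host; with both uplinks active, one can go down without interrupting vSAN traffic.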

Additional Information

Reference articles:

vSAN Health Service - Network Health - vSAN Cluster Partition

vSAN Health Service - vSAN Network Health – Hosts with LACP issues