vSAN -- During or after vSAN Hosts are patched/rebooted, vCenter and other Production VMs are not accessible

Article ID: 415211

Products

VMware vSAN

Issue/Introduction

During or after the reboot of one or more vSAN Hosts of the same vSAN Cluster, you experience one or more of the following symptoms:

  • Virtual Machine(s) cannot be accessed, e.g. via Remote Desktop (RDP).
  • The Operating System inside the Virtual Machine(s) reports I/O errors.
  • Virtual Machine(s) show as inaccessible in the vSphere Client.
  • Virtual Machine(s) show as invalid in the Host Client.

Environment

vSphere ESXi - All Versions

Cause

More vSAN Hosts are unavailable to the Cluster than the configured Data Redundancy allows.
As a result, the vSAN Data of one or more VMs becomes inaccessible, which leaves the affected VM(s) inaccessible and/or reporting errors.

A vSAN Host is unavailable to the Cluster when it is offline, in Maintenance Mode, or unable to communicate with the other vSAN Hosts over the network.
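
The definition of an unavailable Host can be written down as a small check. The following is a minimal sketch only: the `HostState` record, its field names and the sample Host names are hypothetical and are not part of any vSAN or vSphere API. In practice the three flags would have to be filled in from your own inventory or monitoring data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostState:
    """Hypothetical snapshot of one vSAN Host (not a vSphere API object)."""
    name: str
    powered_on: bool
    in_maintenance_mode: bool
    vsan_network_reachable: bool


def is_unavailable(host: HostState) -> bool:
    # A Host counts as unavailable if it is offline, in Maintenance Mode,
    # or cannot reach the other vSAN Hosts over the network.
    return (
        not host.powered_on
        or host.in_maintenance_mode
        or not host.vsan_network_reachable
    )


def unavailable_hosts(hosts: List[HostState]) -> List[str]:
    # Return the names of all Hosts that are unavailable to the Cluster.
    return [h.name for h in hosts if is_unavailable(h)]


# Illustrative sample data (assumed, not taken from a real Cluster).
cluster = [
    HostState("esx01", powered_on=True, in_maintenance_mode=False, vsan_network_reachable=True),
    HostState("esx02", powered_on=True, in_maintenance_mode=True, vsan_network_reachable=True),
    HostState("esx03", powered_on=False, in_maintenance_mode=False, vsan_network_reachable=False),
    HostState("esx04", powered_on=True, in_maintenance_mode=False, vsan_network_reachable=False),
]

print(unavailable_hosts(cluster))  # ['esx02', 'esx03', 'esx04']
```

Note that a Host that is powered on and healthy still counts as unavailable when it is in Maintenance Mode or network-isolated, which is why three of the four sample Hosts are reported.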
 
 
 
 

Resolution

Data Redundancy is configured in the Storage Policy assigned to the affected VM(s), via the "Failures to tolerate" (FTT) setting and the "Site disaster tolerance" setting.
 
Depending on the configured setting, ensure that the number of unavailable vSAN Hosts never exceeds the maximum listed below:
 
FTT=0: No Host can be unavailable unless it was first placed in Maintenance Mode. Only one Host at a time can be in Maintenance Mode, and only with the option "Ensure Accessibility" or "Full Data Migration".
FTT=1: At most one Host can be unavailable to the Cluster at any one time.
FTT=2: At most two Hosts can be unavailable to the Cluster at any one time.
FTT=3: At most three Hosts can be unavailable to the Cluster at any one time.
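
As a pre-check before patching or rebooting, the limits above can be expressed as a simple rule. This is a hedged sketch, not a VMware tool: the function names and the `evacuating_maintenance_mode` parameter are hypothetical, and the logic only mirrors the FTT list above. It does not inspect actual object health or component placement.

```python
def max_unavailable_hosts(ftt: int) -> int:
    # Per the list above, the number of Hosts that may be unavailable
    # at the same time equals the configured FTT value (0 to 3).
    if ftt not in (0, 1, 2, 3):
        raise ValueError(f"unsupported FTT value: {ftt}")
    return ftt


def is_safe_to_take_down(
    ftt: int,
    unavailable_now: int,
    evacuating_maintenance_mode: bool = False,
) -> bool:
    """Return True if ONE more Host may become unavailable.

    unavailable_now: Hosts already offline, in Maintenance Mode or isolated.
    evacuating_maintenance_mode: the next Host enters Maintenance Mode with
        "Ensure Accessibility" or "Full Data Migration" (hypothetical flag).
    """
    limit = max_unavailable_hosts(ftt)
    if ftt == 0:
        # FTT=0: only one Host at a time, and only via Maintenance Mode
        # with one of the two data-preserving options.
        return evacuating_maintenance_mode and unavailable_now == 0
    return unavailable_now + 1 <= limit


print(is_safe_to_take_down(ftt=1, unavailable_now=0))  # True
print(is_safe_to_take_down(ftt=1, unavailable_now=1))  # False
print(is_safe_to_take_down(ftt=0, unavailable_now=0))  # False
print(is_safe_to_take_down(ftt=0, unavailable_now=0, evacuating_maintenance_mode=True))  # True
```

For a rolling patch cycle, the second call is the case described in this article: with FTT=1, rebooting a second Host before the first one has rejoined the Cluster exceeds the limit.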
 
[Screenshot: editing or creating a Storage Policy]
 
Using "Site disaster tolerance: None (standard cluster)" is not recommended, since this setting can increase the risk of Data unavailability during outages or Maintenance activities.
Using "Site disaster tolerance: Site mirroring - stretched cluster" is recommended so that Data remains available even if an entire site goes down.
 
[Screenshot: editing or creating a Storage Policy]

Additional Information