VMs on vSAN show as "Invalid" after power outage or completion of maintenance task

Article ID: 394799

Updated On:

Products

VMware vSAN, VMware vSAN 7.x, VMware vSAN 8.x

Issue/Introduction

After ESXi hosts are brought back up following a power outage or a maintenance task, VMs still show as "unknown" or "invalid" in the client view.

The vSAN Datastore view may not show any VMs.

 
Note: The vCenter VM might be located on the affected vSAN Datastore. If it is, vCenter may not be accessible.

 

Characteristics of the issue:

  • The vSAN production VMs are down after a maintenance activity, which might have included a vSAN cluster shutdown and restart.
  • VMs on the vSAN cluster are displayed as "invalid" or "unknown" in the ESXi host client.
  • Inaccessible objects are reported in the output of the esxcli vsan debug object health summary get command.
  • One or more vSAN nodes report "Maintenance Mode State: On" in the output of the esxcli vsan cluster get command.
  • vSAN might show multiple files missing.
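
As a rough sketch, the inaccessible-object count can be pulled out of the health summary with a small filter. The helper name inaccessible_count is hypothetical, and the two-column layout (health status, then number of objects) is an assumption about the command output that may differ between versions.

```shell
# Hypothetical helper: reads the output of
#   esxcli vsan debug object health summary get
# on stdin and prints the count from the "inaccessible" row.
# Assumes each row looks like "<health status> <number of objects>".
inaccessible_count() {
  awk '$1 == "inaccessible" { print $NF }'
}

# Usage on a host (not run here):
#   esxcli vsan debug object health summary get | inaccessible_count
```

A non-zero result matches the symptom described above. Running the same check after the Resolution steps is one way to see whether the objects have become accessible again.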

 

Environment

VMware vSAN (All Versions)

Cause

One or more vSAN hosts are in vSAN Decom state. See the KB article "vSAN Host Maintenance Mode is not in sync with vSAN Node Decommission State".

To check, open an SSH session (for example with PuTTY) to each host and run:
 
esxcli vsan cluster get
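
To look only at the relevant field, the output can be filtered as in the sketch below. The helper name vsan_mm_state is hypothetical. The only assumption about the output is that it contains a "Maintenance Mode State: <value>" line, as quoted in the Issue/Introduction section.

```shell
# Hypothetical helper: reads `esxcli vsan cluster get` output on stdin
# and prints the value of the "Maintenance Mode State" field.
vsan_mm_state() {
  awk -F': *' '/Maintenance Mode State/ {
    sub(/[ \t\r]+$/, "", $2)   # drop trailing whitespace, if any
    print $2
  }'
}

# Usage on a host (not run here):
#   esxcli vsan cluster get | vsan_mm_state
```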

Alternatively, run the following script from any one host in the cluster to list which hosts are in vSAN Decom state:
echo "hostname,decomState,decomJobType"
for host in $(cmmds-tool find -t HOSTNAME -f json | grep -B2 Healthy | grep uuid | awk -F \" '{print $4}'); do
  hostName=$(cmmds-tool find -t HOSTNAME -f json -u $host | grep content | awk -F \" '{print $6}')
  decomInfo=$(cmmds-tool find -t NODE_DECOM_STATE -f json -u $host | grep content | awk '{print $3 $5}' | sed 's/,$//')
  echo "$hostName,$decomInfo"
done | sort

Sample output:
hostname,decomState,decomJobType
esxi-1.example.com,0,0
esxi-2.example.com,0,0
esxi-3.example.com,0,0

A value other than 0 means that host is in vSAN Decom state, as with esxi-3 in this example:
hostname,decomState,decomJobType
esxi-1.example.com,0,0
esxi-2.example.com,0,0
esxi-3.example.com,6,0 <---
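
In a larger cluster, the CSV output can be filtered down to the affected hosts. The following is a minimal sketch. The helper name decom_hosts is hypothetical, and it assumes the three-column format shown in the sample output, with decomState in the second column.

```shell
# Hypothetical helper: reads "hostname,decomState,decomJobType" lines on
# stdin and prints the hostname of every host whose decomState is not 0.
decom_hosts() {
  # Skip the header row, then compare the second column numerically.
  awk -F, '$2 != "decomState" && $2 + 0 != 0 { print $1 }'
}

# Usage (not run here): pipe the script output into the helper, e.g.
#   <script from above> | decom_hosts
```

With the second sample output above, the helper would print only esxi-3.example.com.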

Resolution

  1. Place one of the vSAN hosts found in vSAN Decom state into Maintenance Mode using the "No Action" or "No Data Migration" option (the option name depends on the version).
  2. Once the task completes, take the host out of Maintenance Mode.
  3. Repeat this process for any other hosts in the cluster that are still in vSAN Decom state.
  4. Check whether the symptoms described in the Issue/Introduction section still exist.
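
Steps 1 and 2 can also be performed from the ESXi shell instead of the UI. The sketch below is an assumption rather than part of the documented procedure. It uses the esxcli system maintenanceMode namespace with the noAction vSAN mode, and the helper name decom_cycle and the RUN variable are hypothetical. By default it only prints the commands so they can be reviewed first.

```shell
# Hypothetical helper: one enter/exit Maintenance Mode cycle on the local host.
# Prints the commands by default; set RUN=1 to execute them instead.
decom_cycle() {
  for cmd in \
    'esxcli system maintenanceMode set --enable true --vsanmode noAction' \
    'esxcli system maintenanceMode set --enable false'
  do
    if [ "${RUN:-0}" = "1" ]; then
      $cmd || return 1   # stop if entering or exiting Maintenance Mode fails
    else
      echo "$cmd"
    fi
  done
}

# Usage on an affected host (not run here):
#   decom_cycle          # review the commands
#   RUN=1 decom_cycle    # execute them
```

Run it on one affected host at a time, then re-check the vSAN Decom state before moving to the next host.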

If assistance is required, please open a Ticket with VMware by Broadcom Support.

For more information, please refer to: A node is in vSAN Maintenance Mode, even though the UI shows the ESXi host is not.