Virtual Machines Down with Unknown/Invalid State after maintenance mode task failures.

Article ID: 393502

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • Multiple objects and virtual machines become inaccessible after recent maintenance mode task failures.
  • The following example shows objects in an inaccessible state:

        [root@e1:~] esxcli vsan debug object health summary get
        Health Status                                              Number Of Objects
        ---------------------------------------------------------  -----------------
        inaccessible                                                              34

When objects lose access to multiple components, they become inaccessible. This can also put objects into a reduced-availability state and reduce available storage space.

Environment

VMware vSAN 7.x

VMware vSAN 8.x

VMware vSAN 9.x

Cause

  • When reviewing the vSAN cluster status across various hosts, you may encounter hosts in a DECOMMISSIONED state (vSAN maintenance mode). This can lead to data unavailability and result in issues when powering on virtual machines.

Run the following command to list the decomState of each host:

    for i in $(cmmds-tool find -f json -t HOSTNAME | grep -iE "uuid" | awk '{print $2}'| sed "s/\"//g" |sed "s/\,//g"); do cmmds-tool find -u $i -f json | grep -E "decom|hostname" | awk '{print $2,$3}' ;done | sed 'N;s/\n/ /'

In the output, some hosts show a decomState value other than 0:

{"decomState": 0, {"hostname": "Host 1"},
{"decomState": 0, {"hostname": "Host 2"},
{"decomState": 6, {"hostname": "Host 3"},
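On a larger cluster it can help to print only the hosts that need attention. The following is a minimal sketch, assuming the summary lines keep the `{"decomState": N, {"hostname": "..."}` shape shown above. `flag_decom_hosts` is a hypothetical helper name introduced here for illustration and is not part of any VMware tool.

```shell
# Hypothetical helper (not a VMware tool): reads the per-host summary lines
# on stdin and prints the hostname of every host whose decomState is not 0.
# Assumes each line looks like: {"decomState": 6, {"hostname": "Host 3"},
flag_decom_hosts() {
    # Reduce each matching line to "<decomState> <hostname>" ...
    sed -n 's/.*"decomState": *\([0-9][0-9]*\).*"hostname": *"\([^"]*\)".*/\1 \2/p' |
        # ... then keep only non-zero states and drop the leading number.
        awk '$1 != 0 { sub(/^[0-9]+ /, ""); print }'
}
```

To use it, pipe the output of the cmmds-tool loop above into `flag_decom_hosts`. An empty result means every host reported a decomState of 0.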

  • This can also be verified on each host using the command below:

    [root@e3:~] esxcli vsan cluster get
   Cluster Information
   Enabled: true
   Current Local Time: 2025-04-08T00:56:44Z
   Local Node UUID: 60374a##-5cce-95##-11##-############
   Local Node Type: NORMAL
   Local Node State: BACKUP
   Local Node Health State: HEALTHY
   Sub-Cluster Master UUID: 60374c##-9c##37##-bd##############
   Sub-Cluster Backup UUID: 60374a##-5c##-95##-11##-############
   Sub-Cluster UUID: 52fca5##-5f##a6##-a8##-############
   Sub-Cluster Membership Entry Revision: 3
   Sub-Cluster Member Count: 4
   Sub-Cluster Member UUIDs: 60374c##-9c##-37##-bd##-############, 60374###-5c##-95##-1140-############, 603749##-84##-dc##-27##-############, 60374a##-7d##-7a##-fb4a-############
   Sub-Cluster Member HostNames: e2, e3, e1 ,e4
   Sub-Cluster Membership UUID: 8df9f3##-30##-b29##-8d##-############
   Unicast Mode Enabled: true
   Maintenance Mode State: ON

  • In the example above, the host's Maintenance Mode State is ON, which indicates that this host is not contributing to the cluster because the active components on it are unavailable.
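When checking several hosts, only the last field of that output matters. The following is a minimal sketch, assuming the `Maintenance Mode State: ON` line format shown in the example above. `vsan_mm_state` is a hypothetical helper name and not an esxcli subcommand.

```shell
# Hypothetical helper: reads `esxcli vsan cluster get` output on stdin and
# prints only the value of the "Maintenance Mode State" field.
vsan_mm_state() {
    awk -F': *' '/Maintenance Mode State/ { print $2 }'
}
```

On a host, this could be run as `esxcli vsan cluster get | vsan_mm_state`.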

Resolution

  • Fix the maintenance mode issues on the affected nodes.
  • To resolve the maintenance mode issue on a node that shows Maintenance Mode State: ON, put the host into maintenance mode with no action and then take it out of maintenance mode. This exits the host from both regular maintenance mode and vSAN maintenance mode.
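The same enter-and-exit cycle can be sketched from the ESXi shell. This assumes the `esxcli system maintenanceMode set` command with its `--enable` and `--vsanmode noAction` options is available on the host, and it is an illustration rather than the article's prescribed procedure. The `ESXCLI` variable and the `cycle_maintenance_mode` function are hypothetical names. Setting `ESXCLI=echo` previews the commands without running them.

```shell
# Assumption: esxcli is on the PATH. Override with ESXCLI=echo for a dry run.
ESXCLI="${ESXCLI:-esxcli}"

# Hypothetical wrapper around the enter/exit cycle described above.
cycle_maintenance_mode() {
    # Enter maintenance mode with the vSAN "no action" data option.
    $ESXCLI system maintenanceMode set --enable true --vsanmode noAction || return 1
    # Exit maintenance mode again.
    $ESXCLI system maintenanceMode set --enable false || return 1
    # Report the resulting maintenance mode status of the host.
    $ESXCLI system maintenanceMode get
}
```

After calling `cycle_maintenance_mode` on the affected host, rerun `esxcli vsan cluster get` and check the Maintenance Mode State field.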

Additional Information

  • See the following KB article for reference:

         vSAN objects unhealthy due to DECOM state