Impact/Risks: Any host stuck in DecomState: 6 does not contribute storage or compute resources to the cluster, because vSAN believes the host is still in Maintenance Mode.
Symptoms:
Virtual machines become inaccessible or orphaned in vSphere.
All drives in all hosts are mounted in CMMDS, and none of the hosts are network partitioned.
In vSphere, no hosts appear to be in Maintenance Mode.
Checking each host from the command line with:
# esxcli system maintenanceMode get
returns "Disabled" on every host.
However, running:
# localcli vsan cluster get
on an affected host shows the Maintenance Mode State as Enabled or "ON" (example below).
Cluster Information:
   Enabled: true
   Current Local Time: 2018-09-14T18:51:57Z
   Local Node UUID: ########-####-####-####-########a270
   Local Node Type: NORMAL
   Local Node State: AGENT
   Local Node Health State: HEALTHY
   Sub-Cluster Master UUID: ########-####-####-####-########a4d0
   Sub-Cluster Backup UUID: ########-####-####-####-########a390
   Sub-Cluster UUID: ########-####-####-####-########f8c7
   Sub-Cluster Membership Entry Revision: 3
   Sub-Cluster Member Count: 4
   Sub-Cluster Member UUIDs: ########-####-####-####-########a390, ########-####-####-####-########a4d0, ########-####-####-####-########c7b0, ########-####-####-####-########a270
   Sub-Cluster Membership UUID: ########-####-####-####-########a4d0
   Unicast Mode Enabled: true
   Maintenance Mode State: ON   <<---This node is in Decom state according to vSAN.
   Config Generation: ########-####-####-####-########112e 5 2018-09-14T17:21:34.629
This means that the host is in a vSAN decom state: maintenance mode did not cancel or exit cleanly, so vSAN still considers the host to be in maintenance mode.
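The mismatch between the two commands can also be checked programmatically. The following is a minimal sketch, not an official tool: it assumes the output of "localcli vsan cluster get" keeps the "Maintenance Mode State:" label shown in the example, and that "esxcli system maintenanceMode get" prints a single word ("Enabled" or "Disabled"). The function names are hypothetical.

```python
def vsan_maintenance_state(cluster_get_output):
    """Return the 'Maintenance Mode State' value (e.g. 'ON'), or None if absent.

    Assumption: the input is the plain-text output of
    `localcli vsan cluster get`, one "Label: value" pair per line.
    """
    for line in cluster_get_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Maintenance Mode State":
            # Keep only the first token so trailing annotations are ignored.
            tokens = value.split()
            return tokens[0] if tokens else None
    return None


def is_stuck_in_decom(cluster_get_output, maintenance_mode_get_output):
    """True when vSAN reports maintenance mode ON but the host reports Disabled."""
    vsan_on = vsan_maintenance_state(cluster_get_output) == "ON"
    # Assumption: `esxcli system maintenanceMode get` prints Enabled/Disabled.
    host_in_mm = maintenance_mode_get_output.strip().lower() == "enabled"
    return vsan_on and not host_in_mm
```

Passing the captured output of both commands from the same host to is_stuck_in_decom() flags the condition described in this article; a host that is genuinely in maintenance mode is not flagged.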
Environment
VMware vSAN (All Versions)
Cause
This issue is frequently the result of issuing a maintenance mode task in vCenter, quickly followed by a cancellation of the maintenance mode task. The host may end up stuck in vSAN Decom State: 6 where vSAN considers the host to still be in Maintenance Mode.
Resolution
Identify which hosts are in this state; an affected host reports a value of "decomState": 6.
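One possible way to list such hosts is sketched below, under stated assumptions. It assumes the decommission state is published in CMMDS as NODE_DECOM_STATE entries that can be dumped with "cmmds-tool find -t NODE_DECOM_STATE -f json", and that the JSON has a top-level "entries" list whose items carry a "uuid" and a "content" object containing "decomState". Verify this layout against real output from your hosts before relying on it. The function name is hypothetical.

```python
import json

# Value this article associates with a host stuck in vSAN maintenance mode.
STUCK_DECOM_STATE = 6


def hosts_in_decom_state(cmmds_json_text, state=STUCK_DECOM_STATE):
    """Return the node UUIDs whose decomState equals `state`.

    Assumption: `cmmds_json_text` is JSON shaped like
    {"entries": [{"uuid": "...", "content": {"decomState": 6, ...}}, ...]}.
    """
    data = json.loads(cmmds_json_text)
    stuck = []
    for entry in data.get("entries", []):
        content = entry.get("content", {})
        # Skip entries whose content is not a JSON object.
        if isinstance(content, dict) and content.get("decomState") == state:
            stuck.append(entry.get("uuid"))
    return stuck
```

The returned UUIDs can be matched against the Sub-Cluster Member UUIDs in the "localcli vsan cluster get" output to identify the affected hosts.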
To clear this state, do the following:
1) Place the affected host into maintenance mode with No Action or No Data Migration, depending on the version.
2) Remove the host from maintenance mode.
3) Verify object health.
4) Sometimes the host will not exit the DECOM_STATE even after a reboot. In this situation, shut down the host completely (check dependencies beforehand with the debug precheck command, e.g. esxcli vsan debug evacuation precheck -e "HOSTNAME/UUID"). Power off the host, wait 20 seconds, and power it on again.
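Steps 1 to 3 can also be performed from the host's command line. The sketch below is illustrative only: it assumes the installed release accepts "esxcli system maintenanceMode set" with the --enable and --vsanmode options (noAction being the CLI counterpart of No Action / No Data Migration) and provides "esxcli vsan debug object health summary get". The function names are hypothetical, and the default is a dry run that only prints the commands.

```python
import subprocess


def recovery_commands():
    """Ordered commands for steps 1-3, to be run on the affected host."""
    return [
        # Step 1: enter maintenance mode without moving any vSAN data.
        ["esxcli", "system", "maintenanceMode", "set",
         "--enable", "true", "--vsanmode", "noAction"],
        # Step 2: exit maintenance mode again.
        ["esxcli", "system", "maintenanceMode", "set", "--enable", "false"],
        # Step 3: summarize vSAN object health.
        ["esxcli", "vsan", "debug", "object", "health", "summary", "get"],
    ]


def run_recovery(dry_run=True):
    """Print (dry run) or execute each command in order; return the list."""
    handled = []
    for cmd in recovery_commands():
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)
        handled.append(cmd)
    return handled
```

Calling run_recovery() prints the three commands for review; run_recovery(dry_run=False) executes them and should only be used on the affected host itself. Step 4 (a full power cycle) is deliberately left out of the sketch.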