vSAN Health Service -- Host Maintenance Mode is in sync with vSAN Node Decommission State

Article ID: 318411

Products

VMware vSAN

Issue/Introduction

This article explains the health check that verifies whether a host's maintenance mode is in sync with its vSAN node decommission state.

Environment

VMware vSAN (All Versions)

Resolution

What does the check - Host Maintenance Mode is in sync with vSAN Node Decommission State - do?

Once a vSAN host is in the vSAN decommission state, it does not accept any new object creation, and all components on the host are absent until the host is back in service. When a host in the vSAN cluster is in maintenance mode, it is expected to be in the vSAN decommission state. When the host exits maintenance mode, it should no longer be in the vSAN decommission state. This health check verifies that the two states are in sync.

What does it mean when it is in an error state?

If a host is in maintenance mode using any of the vSAN options but is not in vSAN decommission state, or a host exits maintenance mode but is still in vSAN decommission state, the check triggers a warning message.
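
The rule described above amounts to comparing two yes/no flags. The following is a minimal sketch of that comparison, not the health service's actual implementation; the function name `states_in_sync` and its yes/no arguments are hypothetical and exist only for illustration.

```shell
# Illustrative only: mirrors the rule described above, not the real health check.
# $1 = host is in maintenance mode (yes/no)
# $2 = host is in vSAN decommission state (yes/no)
states_in_sync() {
  if [ "$1" = "$2" ]; then
    echo "in sync"
  else
    echo "out of sync"
  fi
}
```

A host that has exited maintenance mode but is still decommissioned corresponds to `states_in_sync no yes`, which reports the mismatch that raises the warning.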


 

What action can be performed to resolve this?

If a host is decommissioned but the host is not in maintenance mode, do the following:
  1. Identify the relevant host as per the vSAN Health Service alarm. Make sure to use the host where "vSAN Node in Decommission State" is "Yes".
  2. Place the same host into maintenance mode with No Action:
    Right-click the host > select Maintenance Mode > No Action
  3. Exit maintenance mode immediately afterward.
This resets the vSAN decommission state and makes it consistent with the vSphere maintenance mode.
 
In some cases, this can happen on multiple vSAN hosts at the same time. This can cause vCenter (if it resides on the vSAN datastore) and potentially other VMs to become inaccessible. In that case, you can perform the same steps from the command line.
 
  1. Run the following script on an ESXi host in the cluster to check whether any hosts are in the vSAN decommission state:
    echo "hostname,decomState,decomJobType"
    for host in $(cmmds-tool find -t HOSTNAME -f json | grep -B2 Healthy | grep uuid | awk -F \" '{print $4}'); do
      hostName=$(cmmds-tool find -t HOSTNAME -f json -u $host | grep content | awk -F \" '{print $6}')
      decomInfo=$(cmmds-tool find -t NODE_DECOM_STATE -f json -u $host | grep content | awk '{print $3 $5}' | sed 's/,$//')
      echo "$hostName,$decomInfo"
    done | sort

    Example output:
    hostname,decomState,decomJobType
    host-21,0,0
    host-22,0,0
    host-23,6,0
    host-24,0,0
    host-25,0,0

    A decomState of 6 means the host is in the vSAN decommission state.


  2. To check whether the host is in maintenance mode, run: esxcli system maintenanceMode get
  3. If the script in step 1 shows decomState 6 for the host but this command returns Disabled, the host is in the vSAN decommission state but the ESXi host is NOT in maintenance mode.
  4. First, check whether any virtual machines are running on the host: esxcli vm process list. Any powered-on virtual machines must be powered off, or the host will not enter maintenance mode.
  5. Use one of the following commands to attempt to power them off safely: esxcli vm process kill -w <world_id> -t soft or vim-cmd vmsvc/power.shutdown <vm id>
  6. Then run this command to place the ESXi host in maintenance mode with the vSAN option for no data migration: esxcli system maintenanceMode set -e true -m noAction
  7. Once the prompt returns, the host should be in maintenance mode. Then run this command to exit maintenance mode: esxcli system maintenanceMode set -e false
  8. Repeat these steps on any other hosts where vSAN is decommissioned but the host is not in maintenance mode.
    Note: You can also use the Host UI to check for any powered-on VMs, place the host into maintenance mode with the no data migration option, and then take the host back out of maintenance mode.
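
The command-line steps above can be sketched as two small shell helpers. This is a hedged illustration, not a supported tool: the function names `hosts_in_decom` and `reset_decom_state` and the `ESXCLI` override variable are hypothetical. The sketch assumes that `esxcli system maintenanceMode get` prints Enabled or Disabled, and that `esxcli vm process list` prints nothing when no virtual machines are running.

```shell
# Hypothetical helpers illustrating the steps above; review before use.

# Reads the CSV from the step 1 script on stdin and prints the hosts
# whose decomState is 6 (the header line is skipped).
hosts_in_decom() {
  awk -F, 'NR > 1 && $2 == 6 { print $1 }'
}

# ESXCLI is an assumed override hook so the flow can be dry-run with a stub;
# it defaults to the real esxcli command.
ESXCLI="${ESXCLI:-esxcli}"

# Steps 2 to 7 for the local host. Run only on a host reported with decomState 6.
reset_decom_state() {
  # Steps 2-3: proceed only if the host is NOT in maintenance mode.
  mm_state=$($ESXCLI system maintenanceMode get)
  if [ "$mm_state" != "Disabled" ]; then
    echo "Maintenance mode is '$mm_state'; nothing to reset." >&2
    return 1
  fi

  # Step 4: stop if any virtual machines are still running (see step 5).
  if [ -n "$($ESXCLI vm process list)" ]; then
    echo "Powered-on VMs found; power them off first." >&2
    return 2
  fi

  # Steps 6-7: enter maintenance mode with no data migration, then exit.
  $ESXCLI system maintenanceMode set -e true -m noAction &&
    $ESXCLI system maintenanceMode set -e false
}
```

Piping the example output from step 1 through `hosts_in_decom` prints host-23. Running `reset_decom_state` on that host performs steps 6 and 7 only when both checks pass, and it deliberately does not power off any virtual machines itself.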

Additional Information