Preparation steps:
- Before making any changes, record the host configuration, such as the IP addresses, VLANs, and subnets for the vSAN, vMotion, and Management VMkernel adapters.
- Note which drives are in use for vSAN Disk Groups or Storage Pool. These should not need to be touched, but care should be taken not to select one of these drives for the ESXi install.
- The host should be in maintenance mode. Use 'Ensure accessibility', and consider extending the repair delay time to avoid a resync. If that is not possible, use the 'Full data evacuation' maintenance mode option instead, but note that it is a time-consuming operation.
- The vSAN objects should not be inaccessible, in reduced availability, or in any other failure state. If they are, correct this before starting.
- If encryption is used, the KMS must be online and accessible in order to decrypt the vSAN drives after reinstall. A reboot of the host will be required and if it cannot obtain the key encryption key from the KMS, it will not be able to decrypt the disk groups.
- The host should not be in the vSAN cluster at the time of the reinstall, because it will need to be re-added from the datacenter object after the re-image and reconfiguration. This ensures that the vSAN configuration is pushed to the existing vSAN cluster.
- If the host had custom certificates with vCenter in Custom mode (meaning the ESXi machine certificate was replaced), you must temporarily set the certificate mode to 'VMCA' to allow the host into the inventory. After this, you can replace the ESXi certificate and change the certificate mode back to Custom. For details, see Change the ESXi Certificate Mode and Configuring CA signed certificates for ESXi hosts.
Note: If you are reinstalling ESXi because of failed boot media or OS corruption, some of the preparation steps above will not apply. The key points are to avoid selecting one of the vSAN disks for the ESXi install and to know the previous configuration of the host.
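The first two preparation items can be captured from the ESXi shell while the host still boots. The following is a minimal sketch, not an official procedure: the function name `save_host_config` and the output file names are hypothetical, and it covers only standard-switch port groups and vSAN OSA disk groups (distributed port group VLANs are held in vCenter and are not recorded here).

```shell
#!/bin/sh
# Sketch: save the host's network and vSAN disk layout before the reinstall.
# save_host_config and the file names are hypothetical helpers for illustration.

save_host_config() {
    out_dir="$1"
    mkdir -p "$out_dir" || return 1

    # VMkernel adapters and the port groups they are attached to
    esxcli network ip interface list > "$out_dir/vmk-interfaces.txt" 2>&1
    # IPv4 address and netmask of each VMkernel adapter
    esxcli network ip interface ipv4 get > "$out_dir/vmk-ipv4.txt" 2>&1
    # Standard-switch port groups with their VLAN IDs
    esxcli network vswitch standard portgroup list > "$out_dir/portgroups.txt" 2>&1
    # VMkernel adapters tagged for vSAN traffic
    esxcli vsan network list > "$out_dir/vsan-network.txt" 2>&1
    # Devices claimed by vSAN disk groups (OSA); never install ESXi on these
    esxcli vsan storage list > "$out_dir/vsan-storage.txt" 2>&1

    # Show what was written so it can be copied off the host
    ls "$out_dir"
}
```

For example, `save_host_config /tmp/pre-reinstall` writes five text files. Copy the directory to another machine afterwards, because nothing on the boot device should be assumed to survive the reinstall.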
To re-image and rejoin the ESXi host to the vSAN cluster:
- Remove the old ESXi host entry from vCenter
- The host needs to be disassociated from the VDS if it was part of one. If the host is properly disconnected, vCenter removes it from the VDS.
- Validate that the ESXi host is in a disconnected state. If it is not, right-click the host in the inventory pane and select "Disconnect" from the pop-up menu.
- Right-click the ESXi host you want to remove from the inventory and select "Remove from Inventory".
- Install the same ESXi version (matching the build number) as the remaining hosts in the cluster, ensuring that you preserve the vSAN disk partitions. See Build numbers and versions of VMware ESXi/ESX.
- Add the host to the Datacenter object in the vCenter inventory. See Creating a datacenter and adding an ESXi host to the vCenter Server Inventory using vCenter Server appliance.
- The initial host management network may need to be configured from the DCUI. See Configuring VMware ESXi Management Network from the direct console.
- Configure the vSAN VMkernel port group on the host and verify with vmkping that it can ping and be pinged on the vSAN VMkernel port. For more information, see How to configure vSAN VMkernel networking and Testing VMkernel network connectivity with the vmkping command.
- Drag the host into the vSAN cluster object. This enables vSAN clustering on the host and triggers a vSAN cluster update that pushes the correct unicast table list to all hosts.
- Validate cluster health using vSAN Skyline Health.
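The vmkping verification in the steps above can be repeated for every peer before the host is moved into the cluster. This is a minimal sketch under stated assumptions: `check_vsan_peers` is a hypothetical helper, and `vmk2` and the addresses in the usage line are placeholders for your own vSAN VMkernel adapter and peer vSAN IPs.

```shell
#!/bin/sh
# Sketch: ping each peer's vSAN VMkernel IP from the rebuilt host.
# check_vsan_peers is a hypothetical helper, not a VMware tool.

check_vsan_peers() {
    vmk="$1"
    shift
    failed=0
    for peer in "$@"; do
        # -I selects the outgoing VMkernel interface, -c sets the packet count
        if vmkping -I "$vmk" -c 3 "$peer" > /dev/null 2>&1; then
            echo "OK   $peer"
        else
            echo "FAIL $peer"
            failed=1
        fi
    done
    # Non-zero exit status if any peer was unreachable
    return "$failed"
}

# Usage (placeholder values):
#   check_vsan_peers vmk2 192.0.2.11 192.0.2.12 192.0.2.13
```

Running the same check from an existing host toward the rebuilt host covers the "be pinged" half of the verification.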
If the process above does not correct the issue, follow the steps below.
- Connect to one of the remaining vSAN cluster hosts using SSH.
- Identify the vSAN Sub-Cluster UUID using this command:
# esxcli vsan cluster get
You see output similar to:
Cluster Information
Enabled: true
Current Local Time: YYYY-MM-DDTHH:MM:SS
Local Node UUID: ########-####-####-####-########826f
Local Node Type: NORMAL
Local Node State: AGENT
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: ########-####-####-####-########f17d
Sub-Cluster Backup UUID: ########-####-####-####-########dd93
Sub-Cluster UUID: ########-####-####-####-########9e45
Sub-Cluster Membership Entry Revision: 2
Sub-Cluster Member Count: 3
Sub-Cluster Member UUIDs: ########-####-####-####-########f17d, ########-####-####-####-########dd93, ########-####-####-####-########826f
Sub-Cluster Member HostNames: esxi3.########, esxi2.########, esxi1.########
Sub-Cluster Membership UUID: ########-####-####-####-########f17d
Unicast Mode Enabled: true
Maintenance Mode State: OFF
Config Generation: ########-####-####-####-########d2c2 3 YYYY-MM-DDTHH:MM:SS
Mode: REGULAR
vSAN ESA Enabled: false
- Run one of the commands below on the newly rebuilt ESXi host, using the Sub-Cluster UUID identified in step 2:
- For vSAN OSA:
# esxcli vsan cluster join -u sub_cluster_UUID
For example:
# esxcli vsan cluster join -u ########-####-####-####-########9e45
- For vSAN ESA:
# esxcli vsan cluster join -x -u sub_cluster_UUID
For example:
# esxcli vsan cluster join -x -u ########-####-####-####-########9e45
- Verify that the host is now a part of the vSAN cluster by running the command:
# esxcli vsan cluster get
You see output similar to:
Cluster Information
Enabled: true
Current Local Time: YYYY-MM-DDTHH:MM:SS
Local Node UUID: ########-####-####-####-########965e
Local Node Type: NORMAL
Local Node State: AGENT
Local Node Health State: HEALTHY
Sub-Cluster Master UUID: ########-####-####-####-########f17d
Sub-Cluster Backup UUID: ########-####-####-####-########dd93
Sub-Cluster UUID: ########-####-####-####-########9e45
Sub-Cluster Membership Entry Revision: 3
Sub-Cluster Member Count: 4
Sub-Cluster Member UUIDs: ########-####-####-####-########f17d, ########-####-####-####-########dd93, ########-####-####-####-########826f, ########-####-####-####-########965e
Sub-Cluster Member HostNames: esxi3.########, esxi2.########, esxi1.########, esxi4.########
Sub-Cluster Membership UUID: ########-####-####-####-########f17d
Unicast Mode Enabled: true
Maintenance Mode State: OFF
Config Generation: ########-####-####-####-########d2c2 4 YYYY-MM-DDTHH:MM:SS
Mode: REGULAR
vSAN ESA Enabled: false
- In the vCenter Server, refresh the vSAN status view. All hosts now report the status as Healthy.
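Reading the UUID and the membership list by eye is error-prone, so the two lookups in the steps above can be scripted. The helpers below are a sketch: both function names are hypothetical, and they assume the `esxcli vsan cluster get` output uses the `Field: value` layout shown in the sample output.

```shell
#!/bin/sh
# Sketch: helpers that read `esxcli vsan cluster get` output on stdin.
# Both function names are hypothetical.

# Print the value of the "Sub-Cluster UUID:" field.
sub_cluster_uuid() {
    awk -F': ' '/^[[:space:]]*Sub-Cluster UUID:/ { print $2; exit }'
}

# Succeed only if the Local Node UUID appears in Sub-Cluster Member UUIDs.
local_node_is_member() {
    awk -F': ' '
        /^[[:space:]]*Local Node UUID:/          { node = $2 }
        /^[[:space:]]*Sub-Cluster Member UUIDs:/ { members = $2 }
        END {
            n = split(members, list, /, */)
            for (i = 1; i <= n; i++)
                if (node != "" && list[i] == node) exit 0
            exit 1
        }'
}

# Usage:
#   esxcli vsan cluster get | sub_cluster_uuid        # on an existing member
#   esxcli vsan cluster get | local_node_is_member    # on the rebuilt host
```

The first helper matches only the exact field name, so the Master, Backup, and Membership UUID lines are ignored.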
If Quickstart was used to create the vSAN cluster, see the KB article Add a Host to an existing vSAN cluster using Quickstart.