Seeing Inaccessible vSAN Objects and Network Partition

Article ID: 380704


Products

VMware vSAN

Issue/Introduction

In vCenter, vSAN Skyline Health reports errors for the vSAN cluster partition and vSAN object health checks. For a detailed description of the error definitions, see: vSAN Health Service - Physical Disk Health - Operation Health.

 

Environment

vSAN 8 (ESA)

vSAN 9 (ESA)

Cause

The vSAN ESA cluster was previously using data-at-rest encryption, which has since been removed.

 

In vmkernel.log, you may see log entries referring to the host RDT network and failures to rekey:

 

YYYY-MM-DDTHH:MM:SS.###Z In(182) vmkernel: cpu3:24961570)RDT: RDT_VmklinkAuthProposeKey:403: RDT_VmklinkAuthProposeKey done. Key uuid: ##############################
YYYY-MM-DDTHH:MM:SS.###Z Wa(180) vmkwarning: cpu3:24961570)WARNING: RDT: RDTEncrBuildNodeCtx:1846: Failed to propose the key to remote node ########-####-####-####-############: Failure
YYYY-MM-DDTHH:MM:SS.###Z In(182) vmkernel: cpu36:2100263)osfs: OSFS_GetMountPointList:3748: mountPoints[0] inUse pid [    vsan], cid ##############################
YYYY-MM-DDTHH:MM:SS.###Z In(182) vmkernel: cpu38:24961909)osfs: OSFS_GetMountPointList:3748: mountPoints[0] inUse pid [    vsan], cid ##############################
YYYY-MM-DDTHH:MM:SS.###Z In(182) vmkernel: cpu34:2098788)RDT: RDTKeyMaintainerWorld:2796: Rekey client key for node ########-####-####-####-############ 
YYYY-MM-DDTHH:MM:SS.###Z In(182) vmkernel: cpu34:2098788)RDT: RDTKeyMaintainerWorld:2796: Rekey client key for node ########-####-####-####-############ 
YYYY-MM-DDTHH:MM:SS.###Z In(182) vmkernel: cpu5:24962097)RDT: RDTEncrRekeyWorldMain:2440: Rekey helper world to negotiate client key for remote node ########-####-####-####-############
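To check whether a host shows the same signature, the warning in the excerpt above can be searched for in a copy of vmkernel.log. The following is a minimal sketch, not a supported tool: the function name and the command-line handling are illustrative assumptions, and the pattern only assumes the message text shown above (the source line number after `RDTEncrBuildNodeCtx` may differ between builds, so it is matched loosely).

```python
import re
import sys
from collections import Counter

# Matches the RDT warning from the excerpt and captures the remote node UUID.
# '#' is allowed in the UUID so that masked excerpts also match.
PROPOSE_FAIL = re.compile(
    r"RDTEncrBuildNodeCtx:\d+: Failed to propose the key to remote node "
    r"(?P<node>[0-9a-fA-F#-]+): Failure"
)


def count_key_proposal_failures(lines):
    """Return a Counter mapping remote node UUID -> number of failed key proposals."""
    failures = Counter()
    for line in lines:
        match = PROPOSE_FAIL.search(line)
        if match:
            failures[match.group("node")] += 1
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is supplied by the caller, for example a file from a log bundle.
    with open(sys.argv[1], errors="replace") as log:
        for node, count in count_key_proposal_failures(log).most_common():
            print(f"{node}: {count} failed key proposal(s)")
```

A remote node UUID that appears repeatedly in the output is a candidate for the partitioned host, which should still be confirmed in Skyline Health as described in the Resolution.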

Resolution

Identify the host causing the partition by clicking Troubleshoot on the vSAN cluster partition error. This shows a screen similar to the following:

 

In the image above, the host causing the partition is identified by the value listed in the Partition column. The hosts listed with 1 are in network group 1, while the host listed with 2 is in network group 2. The third host in this example is partitioned.
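In a larger cluster, the same reading of the Partition column can be expressed as a small grouping step. The sketch below is illustrative only: the host names are hypothetical, the partition numbers are assumed to be copied by hand from the Partition column, and treating the largest group as the healthy one is an assumption that needs manual confirmation when groups are of equal size.

```python
from collections import defaultdict


def find_partitioned_hosts(partition_by_host):
    """Return the hosts that are not in the largest partition group.

    partition_by_host maps a host name to the number shown in the
    Partition column for that host.
    """
    groups = defaultdict(list)
    for host, partition in partition_by_host.items():
        groups[partition].append(host)

    if len(groups) <= 1:
        # Every host reports the same partition number: no network partition.
        return []

    # Assumption: the largest group is the healthy majority. With equally
    # sized groups the first one seen is picked, so review the result manually.
    majority = max(groups, key=lambda p: len(groups[p]))
    return sorted(
        host
        for partition, hosts in groups.items()
        if partition != majority
        for host in hosts
    )


# Hypothetical three-host cluster matching the example described above.
example = {"esx01": 1, "esx02": 1, "esx03": 2}
print(find_partitioned_hosts(example))
```

For the three-host example, the function returns only the host listed with partition 2.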

 

Once you have identified the partitioned host, place it into maintenance mode using the Ensure accessibility option. If this task fails, retry with the No data migration option. Only after the host has successfully entered maintenance mode, move it out of the cluster to the Datacenter level in the inventory, either by dragging and dropping it or by right-clicking the host and clicking Move. If the host does not successfully enter maintenance mode, collect vCenter and ESXi host logs and open a support request.

 

After moving the host, allow the vSAN cluster to finish updating its configuration, as shown in Recent Tasks. Once the vSAN configuration updates have completed, add the host back to the cluster. After the cluster updates its configuration again, take the host out of maintenance mode.