vVols datastores do not automatically recover or reconnect to PDL-affected hosts

Article ID: 321021

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

This article provides the steps required to recover a vVols datastore after a Permanent Device Loss (PDL) event when recovery is blocked by vCLS VMs placed on that datastore.

Symptoms:
  • vVols datastores do not automatically recover (or reconnect) on affected hosts after the PDL event is remediated.
  • After a storage issue such as a PDL or All Paths Down (APD) affects a vVols datastore on some hosts, the datastore does not automatically reconnect to a few ESXi hosts once the event is resolved.
  • The Protocol Endpoint (PE) becomes inaccessible in the vCenter Server UI for a few hosts in the cluster.
  • Users cannot deploy new VMs on, or migrate VMs to, the affected vVols datastore on a few ESXi hosts.


Environment

VMware vSphere ESXi 7.0.0

Cause

This issue occurs when a storage problem, for example a Permanent Device Loss (PDL) or All Paths Down (APD) condition, affects a vVols datastore on which vCLS VMs reside. The vCLS VMs fail to terminate even if the advanced option VMkernel.Boot.terminateVMOnPDL is set on the hosts.

In this scenario, the vCLS VMs might appear as powered on in the vSphere Client even though the underlying disk is inaccessible. This prevents the datastore from unregistering from the host, which in turn prevents the datastore from automatically re-registering to the host after the PDL condition is remediated.

Resolution

To resolve this issue, enable Retreat Mode so that the vVols datastore reconnects to the ESXi hosts. This powers off the vCLS VMs in the cluster. Once the vCLS VMs are powered off, the datastore automatically re-registers to the host.
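
This article does not say how Retreat Mode is toggled. As a minimal sketch, assuming the mechanism described in VMware's Retreat Mode documentation (a vCenter Server advanced setting named config.vcls.clusters.<cluster MoID>.enabled, set to false to enable Retreat Mode), the following Python helper builds the key/value pair for one cluster. The function name and the example MoID domain-c8 are hypothetical; verify the setting name against the documentation for your vCenter Server version before applying it.

```python
import re
from typing import Tuple

# Assumption: cluster managed object IDs look like "domain-c<number>",
# as shown in the vSphere Client URL when the cluster is selected.
_CLUSTER_MOID = re.compile(r"domain-c\d+")


def retreat_mode_setting(cluster_moid: str, retreat: bool) -> Tuple[str, str]:
    """Return the (key, value) advanced setting for one cluster.

    Assumption (not stated in this article): Retreat Mode is enabled by
    setting config.vcls.clusters.<moid>.enabled to "false" and disabled
    again by setting it back to "true".
    """
    if not _CLUSTER_MOID.fullmatch(cluster_moid):
        raise ValueError(f"unexpected cluster MoID: {cluster_moid!r}")
    key = f"config.vcls.clusters.{cluster_moid}.enabled"
    value = "false" if retreat else "true"
    return key, value


if __name__ == "__main__":
    # Hypothetical cluster MoID used only for illustration.
    print(retreat_mode_setting("domain-c8", retreat=True))
```

The helper only computes the setting. Applying it (through the vSphere Client advanced settings page or an automation tool) is left to the administrator.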

Retreat Mode removes the vSphere Cluster Services (vCLS) VMs from a cluster. First, enable Retreat Mode on the cluster so that all of its vCLS VMs are deleted. If the same datastore is shared between multiple clusters and vCLS VMs from different clusters are placed on it, enable Retreat Mode on every cluster that has vCLS VMs on this datastore. While Retreat Mode is enabled, DRS in those clusters is non-functional until Retreat Mode is disabled again. Once the datastore has re-registered successfully, disable Retreat Mode on every cluster where it was enabled so that the vCLS VMs are redeployed and DRS functionality is restored.
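
The multi-cluster rule above can be sketched in Python. This is an illustration only: the VclsPlacement record, the function names, and the sample cluster and datastore names are hypothetical, and the placement data is assumed to have been collected separately (for example, from the vCenter Server inventory).

```python
from typing import Iterable, List, NamedTuple


class VclsPlacement(NamedTuple):
    """Hypothetical record: where one vCLS VM currently resides."""
    cluster: str
    vm_name: str
    datastore: str


def clusters_needing_retreat(placements: Iterable[VclsPlacement],
                             affected_datastore: str) -> List[str]:
    """Every cluster with at least one vCLS VM on the affected datastore."""
    return sorted({p.cluster for p in placements
                   if p.datastore == affected_datastore})


def recovery_plan(placements: Iterable[VclsPlacement],
                  affected_datastore: str) -> List[str]:
    """Ordered steps: enable everywhere, wait for re-registration, disable."""
    clusters = clusters_needing_retreat(placements, affected_datastore)
    steps = [f"Enable Retreat Mode on {c}" for c in clusters]
    steps.append(f"Wait for {affected_datastore} to re-register on the hosts")
    steps += [f"Disable Retreat Mode on {c}" for c in clusters]
    return steps


if __name__ == "__main__":
    # Sample data for illustration only.
    sample = [
        VclsPlacement("Cluster-A", "vCLS-1", "vvol-ds-01"),
        VclsPlacement("Cluster-B", "vCLS-2", "vvol-ds-01"),
        VclsPlacement("Cluster-C", "vCLS-3", "vmfs-ds-02"),
    ]
    for step in recovery_plan(sample, "vvol-ds-01"):
        print(step)
```

Disabling Retreat Mode only after the datastore has re-registered keeps the vCLS VMs from being redeployed while the datastore is still in the stale state.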