The Persistent Volume Claim (PVC) and Pod in the workload cluster are deleted, but the Persistent Volume (PV) is stuck in a deleting state with "Released" status, even though the reclaim policy is set to "Delete".
For example, the database cluster status will show up as follows (i.e., the persistent volume still exists when it should have been removed):
status:
  alertLevel: WARNING
  conditions:
  - lastTransitionTime: "2025-07-29T23:53:07Z"
    message: ""
    observedGeneration: 2
    reason: Deleting
    status: "False"
    type: Ready
  - lastTransitionTime: "2025-07-29T23:54:23Z"
    message: |-
      waiting for volumes to be removed
      All attempts fail:
      #1: persistent volume pvc-########-####-####-9454-############ still exists when it should have been removed
      #2: persistent volume pvc-########-####-####-9454-############ still exists when it should have been removed
      #3: persistent volume pvc-########-####-####-9454-############ still exists when it should have been removed
      #4: persistent volume pvc-########-####-####-9454-############ still exists when it should have been removed
      #5: persistent volume pvc-########-####-####-9454-############ still exists when it should have been removed
    observedGeneration: 2
    reason: Failed
    status: "False"
    type: Provisioning
  - lastTransitionTime: "2025-07-29T23:31:53Z"
In the workload cluster, the PVC is gone, but the PV still exists:
kubectl get pv

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                              STORAGECLASS      VOLUMEATTRIBUTESCLASS   REASON   AGE
persistentvolume/pvc-########-####-####-9454-############   20Gi       RWO            Delete           Released   new-namespace/#####-######-######-#####-######-#   <storage class>   <unset>                          13h
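If needed, the PV's events often show why the deletion is not progressing. A quick diagnostic check (the PV name below is the masked example from this article; substitute your own):

kubectl describe pv pvc-########-####-####-9454-############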
VMware Data Services Manager 9.x
A PV in a workload cluster has a corresponding PVC in the Supervisor, and deleting the PV depends on the corresponding Supervisor PVC being deleted first. In some rare cases, the Supervisor PVC fails to respond to the deletion event for its workload cluster PV.
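You can confirm this state from the Supervisor: the PVC whose name matches the workload PV's volumeHandle (found in step 1 below) will still be listed. A quick check, assuming <supervisor-namespace> is the vSphere Namespace that backs the workload cluster:

kubectl get pvc -n <supervisor-namespace>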
We need to execute different commands in three different environments:
(Please ensure the correct command is executed in the correct environment)
1) In the workload cluster:
We need to find the Supervisor PVC that corresponds to the stuck PV. Execute the following `kubectl get` command against the workload cluster. In the output, the `volumeHandle` field is what we are looking for: it is the name of the corresponding Supervisor PVC.
kubectl get persistentvolume/pvc-########-####-####-9454-############ -o yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: csi.vsphere.vmware.com
    volume.kubernetes.io/provisioner-deletion-secret-name: ""
    volume.kubernetes.io/provisioner-deletion-secret-namespace: ""
  creationTimestamp: "2025-07-29T23:41:34Z"
  finalizers:
  - kubernetes.io/pv-protection
  - external-attacher/csi-vsphere-vmware-com
  name: pvc-########-####-####-9454-############
  resourceVersion: "5508"
  uid: ########-####-####-####-############
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 20Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: #####-######-######-#####-######-#
    namespace: <namespaceName>
    resourceVersion: "3153"
    uid: ########-####-####-9454-############
  csi:
    driver: csi.vsphere.vmware.com
    fsType: ext4
    volumeAttributes:
      storage.kubernetes.io/csiProvisionerIdentity: 1753832251571-754-csi.vsphere.vmware.com
      type: vSphere CNS Block Volume
    volumeHandle: -########-####-####-b3f4-####################-####-####-9454-############
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - domain-c10
  persistentVolumeReclaimPolicy: Delete
  storageClassName: dsm-test-latebinding
  volumeMode: Filesystem
status:
  lastPhaseTransitionTime: "2025-07-29T23:53:12Z"
  phase: Released
In this example, the `volumeHandle` is `-########-####-####-b3f4-####################-####-####-9454-############`.
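As a shortcut, the `volumeHandle` can also be extracted directly with a JSONPath query (same masked PV name as above):

kubectl get pv pvc-########-####-####-9454-############ -o jsonpath='{.spec.csi.volumeHandle}'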
2) In the Supervisor:
Then we need to switch to the Supervisor environment and delete the PVC from there.
(Using the example name from above; change the PVC name to match your environment.)

kubectl delete pvc ########-####-####-b3f4-############-########-####-####-9454-############
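To verify the deletion went through, confirm the PVC no longer appears in the Supervisor. A quick check, assuming <supervisor-pvc-name> is the PVC name used in the delete command above:

kubectl get pvc -A | grep <supervisor-pvc-name>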
3) In the workload cluster:
The next step is to restart the vsphere-csi-controller. Execute the following command in the workload cluster again.
kubectl rollout restart deployment/vsphere-csi-controller -n vmware-system-csi
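Optionally, confirm the restart has completed before moving on:

kubectl rollout status deployment/vsphere-csi-controller -n vmware-system-csi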
4) In the DSM Provider VM:
The provisioner running in the Provider VM keeps retrying the reconciliation. After finishing the above steps, if you wait for a certain period of time (such as 10-20 minutes), you should see the database cluster cleaned up completely.
If not, SSH into the Provider VM and restart the provisioner process with the command below. That will trigger the reconciliation immediately.
systemctl restart dsm-tsql-provisioner
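You can confirm the service came back up with a standard systemd check (the unit name is taken from the restart command above):

systemctl status dsm-tsql-provisioner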
After the above steps, the PV that was stuck in the deleting state should be cleaned up completely.
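As a final verification, run the following in the workload cluster (masked example PV name again); once cleanup completes, the second command should return a NotFound error:

kubectl get pv
kubectl get pv pvc-########-####-####-9454-############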