VolumeSnapshot deletion stuck with error "Snapshot is being used to restore a PVC" on the Supervisor after a failed backup

Article ID: 424781

Products

VMware vCenter Server VMware vSphere Kubernetes Service

Issue/Introduction

  • A failed backup operation left a stale VolumeSnapshot on the Guest Cluster.

  • Deleting the VolumeSnapshot on the Guest Cluster hangs indefinitely.

  • The VolumeSnapshotContent on the Guest Cluster displays the generic warning: Warning SnapshotDeleteError ... csi-snapshotter csi.vsphere.vmware.com Failed to delete snapshot

  • There are no PVCs on the Guest Cluster that currently reference the stuck VolumeSnapshot.

  • Inspecting the corresponding VolumeSnapshot on the vSphere Supervisor Cluster reveals the following warning event: Warning SnapshotDeletePending ... snapshot-controller Snapshot is being used to restore a PVC

Environment

VKS Guest Cluster with Commvault Backup

Cause

The VolumeSnapshot cannot be deleted because a stale, pending PersistentVolumeClaim (PVC) on the vSphere Supervisor Cluster is still referencing the snapshot as its data source.
This typically occurs when a backup operation (for example, one run by Commvault) times out, leaving the temporary PVC behind on the Supervisor even though the corresponding resources were cleaned up on the Guest Cluster.

Resolution

Follow these steps to identify the specific resource ID and remove the stale dependency on the vSphere Supervisor Cluster.

1. Retrieve the Snapshot Handle from the Guest Cluster

On the Guest Cluster, identify the VolumeSnapshotContent object that is stuck in deletion. Run the following command to extract the Snapshot Handle.
This ID represents the actual name of the VolumeSnapshot resource on the Supervisor Cluster.

# Replace <snapcontent-name> with the name from your error logs
kubectl get volumesnapshotcontent <snapcontent-name> -o jsonpath='{.status.snapshotHandle}'

Note the output ID. It will look similar to this format: ########-####-####-####-############-########-####-####-####-############
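As a quick sanity check before moving on, the handle can be captured in a shell variable and compared against the two-UUID pattern shown above. The following is a minimal sketch: the function name looks_like_snapshot_handle and the variable SNAPSHOT_HANDLE are hypothetical, and the regular expression assumes that each # stands for a hexadecimal digit, which this article does not state explicitly.

```shell
# Hypothetical helper: succeeds when $1 has the shape <uuid>-<uuid>.
# Assumption: every '#' in the documented format is a hexadecimal digit.
looks_like_snapshot_handle() {
  uuid='[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'
  printf '%s\n' "$1" | grep -Eq "^${uuid}-${uuid}\$"
}

# Example usage on the Guest Cluster (placeholders as in the command above):
#   SNAPSHOT_HANDLE=$(kubectl get volumesnapshotcontent <snapcontent-name> \
#     -o jsonpath='{.status.snapshotHandle}')
#   looks_like_snapshot_handle "$SNAPSHOT_HANDLE" || echo "unexpected handle format" >&2
```

An empty or unexpectedly shaped value usually means the wrong VolumeSnapshotContent was queried, so it is worth checking before the handle is used on the Supervisor.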

2. Verify the Lock on Supervisor VolumeSnapshot

Log in to the vSphere Supervisor Cluster.
Use the ID obtained in Step 1 to inspect the snapshot resource.

kubectl describe volumesnapshot -n <supervisor-namespace> <snapshot-handle-id>

Confirm the presence of the warning event: Warning SnapshotDeletePending ... Snapshot is being used to restore a PVC
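When the describe output is long, the check for this event can be scripted. The sketch below is illustrative only: has_restore_lock is a hypothetical helper name, and it assumes the event reason and message appear on a single line of the describe output, as in the warning quoted above.

```shell
# Hypothetical helper: reads `kubectl describe` output on stdin and succeeds
# if the SnapshotDeletePending event with the restore message is present.
# Assumption: reason and message are printed on the same line.
has_restore_lock() {
  grep -q 'SnapshotDeletePending.*Snapshot is being used to restore a PVC'
}

# Example usage on the Supervisor (placeholders as in the command above):
#   kubectl describe volumesnapshot -n <supervisor-namespace> <snapshot-handle-id> \
#     | has_restore_lock && echo "snapshot is locked by a restore PVC"
```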

3. Identify the Stale PVC on the Supervisor Cluster


You must find the specific Pending PVC that is referencing this snapshot handle.
Run the following command to dump the PVCs and search for the handle ID.

kubectl get pvc -n <supervisor-namespace> -o yaml | grep -B 15 -A 5 <snapshot-handle-id>

Alternatively, you can open the full list in a pager:

kubectl get pvc -n <supervisor-namespace> -o yaml | less

Look for a PVC entry where status.phase is Pending and the spec.dataSource.name matches your snapshot handle:

spec:
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    # The name below will match the Snapshot Handle ID found in Step 1
    name: ########-####-####-####-############-########-####-####-####-############ 
status:
  phase: Pending
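Instead of reading the YAML by eye, the same match can be narrowed down with kubectl's custom-columns output and a small awk filter. This is a sketch under assumptions: find_stale_pvcs is a hypothetical helper name, and it expects whitespace-separated rows in the column order NAME, PHASE, KIND, SOURCE produced by the commented command. The fields it inspects (status.phase, spec.dataSource.kind, spec.dataSource.name) are the ones shown in the example entry above.

```shell
# Hypothetical helper: reads "NAME PHASE KIND SOURCE" rows on stdin and prints
# the names of Pending PVCs whose VolumeSnapshot data source equals $1.
find_stale_pvcs() {
  awk -v handle="$1" \
    '$2 == "Pending" && $3 == "VolumeSnapshot" && $4 == handle { print $1 }'
}

# Example usage on the Supervisor (placeholders as in the commands above):
#   kubectl get pvc -n <supervisor-namespace> --no-headers \
#     -o custom-columns='NAME:.metadata.name,PHASE:.status.phase,KIND:.spec.dataSource.kind,SOURCE:.spec.dataSource.name' \
#     | find_stale_pvcs <snapshot-handle-id>
```

PVCs without a data source show <none> in the KIND and SOURCE columns, so they never match the filter. Review the printed names before deleting anything.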

4. Delete the Pending Snapshot

Delete the pending snapshot with the command below to retrigger a snapshot consolidation action.

Note: Consolidation takes some time to complete, and the larger the VolumeSnapshot, the longer it takes. A snapshot deletion event is also triggered in vCenter.

kubectl delete volumesnapshot -n <namespace> <guest-cluster-snapshot-name>

5. Delete the Pending PVC

Delete the stale PVCs identified in Step 3.

kubectl delete pvc -n <supervisor-namespace> <pvc-name>

6. Verification

Once the PVC is deleted, the Supervisor will automatically release the lock on the snapshot after a timeout.
The VolumeSnapshot will disappear from the Supervisor, and subsequently, the stuck VolumeSnapshotContent on the Guest Cluster will be successfully deleted.
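Because the release happens only after a timeout, it can be convenient to poll rather than re-run the check by hand. The following is a minimal sketch, not a prescribed procedure: wait_until_gone is a hypothetical helper, and the attempt count and delay in the usage example are arbitrary values, since this article does not state how long the timeout is.

```shell
# Hypothetical helper: wait_until_gone <max-attempts> <delay-seconds> <command...>
# Re-runs the command until it fails, which is taken to mean the resource no
# longer exists. Returns 0 once that happens and 1 if the attempts run out.
# Caveat: any failure (for example an expired login) also counts as "gone".
wait_until_gone() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if ! "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example usage on the Supervisor (30 attempts, 10 seconds apart; values are arbitrary):
#   wait_until_gone 30 10 kubectl get volumesnapshot -n <supervisor-namespace> <snapshot-handle-id>
# The same call can be pointed at the VolumeSnapshotContent on the Guest Cluster.
```

If the helper reports success, confirm with a plain kubectl get that the resource is really missing and that the command did not fail for another reason.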

Additional Information

Commvault Documentation:
Backup Process for Kubernetes