Error "FailedPrecondition desc = volume: <volume-id> with existing snapshots […] can’t be expanded" during PVC expansion

Article ID: 434396

Products

Tanzu Kubernetes Runtime

Issue/Introduction

  • Kubernetes Persistent Volume Claim (PVC) expansion or resize operations fail with an event similar to the following:

    Warning VolumeResizeFailed persistentvolumeclaim/… resize volume "<volume-id>" by resizer "csi.vsphere.vmware.com" failed: rpc error: code = FailedPrecondition desc = volume: <volume-id> with existing snapshots […] can’t be expanded. Please delete snapshots before expanding the volume
  • When checking the Kubernetes environment using kubectl get volumesnapshots -A, no volume snapshots are present.
  • However, querying the vCenter Server backend datastore reveals that First Class Disk (FCD) snapshots still exist for the affected volumes.
  • This issue frequently occurs in environments that use third-party Kubernetes backup solutions such as Velero or Commvault.
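
To find affected PVCs across the cluster, the resize failure events can be filtered by reason and message. The following is a minimal sketch, assuming the events are still retained by the API server; the function name find_snapshot_blocked_resizes is hypothetical and not part of any tool.

```shell
# Sketch: list resize failures that were blocked by existing snapshots.
# find_snapshot_blocked_resizes is a hypothetical helper name.
find_snapshot_blocked_resizes() {
  # Events can be filtered server-side by their reason field.
  kubectl get events -A --field-selector reason=VolumeResizeFailed |
    # Keep only the failures caused by the snapshot precondition.
    grep 'with existing snapshots'
}
```

Calling find_snapshot_blocked_resizes prints one line per matching event; no output means no matching events are currently stored.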

Environment

 VMware Tanzu Kubernetes Grid (TKG) 2.x

Cause

  • The issue is caused by a disconnect between the Kubernetes custom resource state and the vCenter Cloud Native Storage (CNS) API, typically triggered by an interrupted backup workflow.
  • During a backup, the backup tool requests a VolumeSnapshot, which creates a physical FCD snapshot on the vSphere datastore and a corresponding VolumeSnapshotContent object in Kubernetes. To protect the snapshot during active data movement, the backup controller may temporarily patch the VolumeSnapshotContent deletion policy to Retain.
  • If the backup pod crashes, times out, or the vSphere CSI controller loses communication during the cleanup phase, Kubernetes may purge its local VolumeSnapshot object before the vCenter API receives the call to delete the physical backend snapshot.
  • The vSphere CSI driver enforces a precondition that actively blocks volume expansion if any physical snapshots exist on the underlying volume, causing the PVC resize to fail.
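
To check whether any VolumeSnapshotContent objects were left with the Retain deletion policy, their policy and backend snapshot handle can be listed. The following is a minimal sketch, assuming the snapshot.storage.k8s.io/v1 API is installed; the function name list_retained_snapshot_contents is hypothetical. A Retain policy alone does not prove that a snapshot is orphaned, so treat the output as a starting point for investigation.

```shell
# Sketch: print name, deletion policy, and snapshot handle of every
# VolumeSnapshotContent whose deletion policy is Retain.
# list_retained_snapshot_contents is a hypothetical helper name.
list_retained_snapshot_contents() {
  kubectl get volumesnapshotcontents \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.deletionPolicy}{"\t"}{.status.snapshotHandle}{"\n"}{end}' |
    # Column 2 is the deletion policy; keep only Retain entries.
    awk -F '\t' '$2 == "Retain"'
}
```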

Resolution

The orphaned snapshots must be manually removed from the vSphere datastore using the govc CLI.
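
govc reads its vCenter connection settings from the GOVC_URL, GOVC_USERNAME, and GOVC_PASSWORD environment variables. The following is a minimal sketch of a pre-flight check for step 2 below; the function name require_govc_env and the placeholder values are hypothetical.

```shell
# Sketch: fail early if the govc connection variables are not set.
# require_govc_env is a hypothetical helper name.
require_govc_env() {
  for name in GOVC_URL GOVC_USERNAME GOVC_PASSWORD; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing $name" >&2
      return 1
    fi
  done
}

# Example usage (placeholder values are hypothetical):
#   export GOVC_URL='<vcenter-fqdn>'
#   export GOVC_USERNAME='<username>'
#   export GOVC_PASSWORD='<password>'
#   require_govc_env && govc about
```

Running govc about after the check is one way to confirm that the session can reach vCenter before any snapshot command is issued.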

  1. Identify the CSI volume handle (UUID) of the affected PersistentVolume. This value is the <volume-id> used in the commands that follow:

    kubectl get pv <pv-name> -o jsonpath='{.spec.csi.volumeHandle}'

  2. Authenticate the govc CLI tool with the underlying vCenter Server.

  3. List the existing orphaned snapshots for the volume on the vSphere datastore to verify their presence:

    govc disk.snapshot.ls -dc="<datacenter-name>" -ds="<datastore-name>" "<volume-id>"

  4. Remove the orphaned snapshot(s) from the backend datastore:

    govc disk.snapshot.rm -dc="<datacenter-name>" -ds="<datastore-name>" "<volume-id>" "<snapshot-id>"

    Note: To remove multiple snapshots at once, you can pipe the output of disk.snapshot.ls into a loop:

    govc disk.snapshot.ls -dc="<datacenter-name>" -ds="<datastore-name>" "<volume-id>" | awk '{print $1}' | while read -r snapshot; do govc disk.snapshot.rm -dc="<datacenter-name>" -ds="<datastore-name>" "<volume-id>" "$snapshot"; done

  5. Once the vCenter backend confirms the snapshots are deleted, the vSphere CSI controller will detect the change, the precondition will pass, and the pending PVC expansion will automatically proceed. Verify the PVC status:

    kubectl describe pvc <pvc-name>
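
Because snapshot removal is irreversible, it can help to preview the deletions first. The following is a minimal sketch of steps 3 and 4 with a dry-run default, assuming the first column of the disk.snapshot.ls output is the snapshot ID (as the one-liner in the note also assumes). The function names and the DRY_RUN variable are hypothetical, not govc features.

```shell
# Sketch: print the first column (assumed to be the snapshot ID)
# of each non-empty line read from stdin.
snapshot_ids() {
  awk 'NF { print $1 }'
}

# Sketch: remove every snapshot of one volume.
# Arguments: datacenter name, datastore name, volume ID.
# With DRY_RUN unset or set to 1, only print what would be removed.
remove_orphaned_snapshots() {
  dc=$1
  ds=$2
  volume_id=$3
  govc disk.snapshot.ls -dc="$dc" -ds="$ds" "$volume_id" | snapshot_ids |
    while read -r snapshot_id; do
      if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: govc disk.snapshot.rm -dc=$dc -ds=$ds $volume_id $snapshot_id"
      else
        # Redirect stdin so the command cannot consume the loop input.
        govc disk.snapshot.rm -dc="$dc" -ds="$ds" "$volume_id" "$snapshot_id" < /dev/null
      fi
    done
}
```

Run remove_orphaned_snapshots "<datacenter-name>" "<datastore-name>" "<volume-id>" to preview the deletions, then repeat the call with DRY_RUN=0 set in the environment to perform them.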