Failed to expand volume because the disk that backs it has snapshots

Article ID: 313099

Products

  • VMware vSphere ESXi
  • VMware vSphere Kubernetes Service
  • VMware Tanzu Kubernetes Grid Integrated Edition
  • VMware Tanzu Kubernetes Grid
  • VMware Tanzu Kubernetes Grid Management

Issue/Introduction

  • Expanding a PersistentVolumeClaim fails, and the events for the PVC report a VolumeResizeFailed warning:

7s (x601 over 41h)  Warning  VolumeResizeFailed  PersistentVolumeClaim/data-postgresql-0  resize volume "pvc-########-####-####-0000-############" by resizer "csi.vsphere.vmware.com" failed: rpc error: code = Internal desc = failed to expand volume ########-####-####-####-############-########-####-####-0000-############ in namespace prodns of supervisor cluster. Error: supervisor persistentVolumeClaim ########-####-####-0000-############-########-####-####-0000-############ in namespace prodns not in "FileSystemResizePending" condition within 240 seconds

  • Logs from csi-attacher using "kubectl logs csi-controller-<pod> -n vmware-system-csi -c csi-attacher":

{"level":"error","time":"yyyy-mm-ddThh:mm:ssZ","caller":"wcpguest/controller.go:1237","msg":"failed to update supervisor PVC \"<pvcId>\" in \"<namespace>\" namespace. Error: admission webhook \"validation.csi.vsphere.vmware.com\" denied the request: Expanding volume with snapshots is not allowed","TraceId":"<trace_id>","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service/wcpguest.

  • The disk that backs a volume has snapshots that have been taken by a snapshot-based backup solution.

$ govc disk.snapshot.ls -k -dc="Datacenter-1" -ds="Tanzu-vmfssan" -l "########-####-####-0000-############"
########-####-####-1111-############ kanister.fcd.description:########-####-####-2222-############ MM DD HH:MM:SS
########-####-####-3333-############ kanister.fcd.description:########-####-####-4444-############ MM DD HH:MM:SS
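To check whether a failing resize matches this article, the controller logs can be searched for the webhook denial message quoted above. The following is a minimal sketch: the function name find_snapshot_denials is hypothetical, and it only filters the log text passed to it on standard input.

```shell
# Hypothetical helper: print only the log lines that contain the webhook
# denial message quoted in this article. Reads log text from stdin.
# Exits non-zero when no matching line is found (standard grep behavior).
find_snapshot_denials() {
  grep -F 'Expanding volume with snapshots is not allowed'
}
```

The output of the kubectl logs command shown above can be piped into it, for example: kubectl logs csi-controller-<pod> -n vmware-system-csi -c csi-attacher | find_snapshot_denials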

 

Environment

  • VMware vSphere 7.0 with Tanzu
  • TKGi/TKGm: 2.5.1
  • ESXi: 8.0U3
 

Cause

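A volume cannot be expanded while the virtual disk that backs it has snapshots. The CSI admission webhook rejects the resize request with the message "Expanding volume with snapshots is not allowed".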
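This rule can be expressed as a simple pre-check: if the snapshot listing for the disk is not empty, the expansion will be rejected. A minimal sketch, assuming the listing is the output of govc disk.snapshot.ls (one snapshot per line) passed on standard input; expansion_blocked is a hypothetical name.

```shell
# Hypothetical pre-check: succeeds (exit 0) when the listing on stdin
# contains at least one non-blank line, meaning the disk still has
# snapshots and a resize would be rejected.
expansion_blocked() {
  grep -q '[^[:space:]]'
}
```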

Resolution

  1. Get the Volume ID of the problematic volume

    $ kubectl get pv pvc-f865d13c-c0c4-46fc-828b-aeebc4a649fe -o json | jq .spec.csi.volumeHandle

  2. List and delete snapshots

For Linux:

    1. Download and extract the govc binary on a Linux server that can reach the vCenter Server.

      $ wget https://github.com/vmware/govmomi/releases/download/v0.32.0/govc_Linux_x86_64.tar.gz

      $ tar -zxf govc_Linux_x86_64.tar.gz

      Note: The latest release is available at govmomi/releases.

    2. Move the govc binary to /usr/local/bin so that it is on the PATH.

      $ sudo mv govc /usr/local/bin/

    3. Validate the govc tool installation.

      $ which govc

      $ govc version

    4. Define the environment variables used to connect to the vCenter Server.

      $ export GOVC_URL=<vcenterFqdn>

      $ export GOVC_USERNAME=<adminUser>

      $ export GOVC_PASSWORD=<adminPassword>

      $ export GOVC_INSECURE=true

    5. List all snapshots of the volume by its volume ID.

      $ govc disk.snapshot.ls -k -dc=<datacenterName> -ds=<datastoreName> -l <volumeId>

      Example:

      $ govc disk.snapshot.ls -k -dc="Datacenter-1" -ds="Tanzu-vmfssan" -l "########-####-####-0000-############"

      ########-####-####-1111-############ kanister.fcd.description:########-####-####-2222-############ MM DD HH:MM:SS

      ########-####-####-3333-############ kanister.fcd.description:########-####-####-4444-############ MM DD HH:MM:SS

    6. Delete the snapshots. They can be deleted one after another in any order.

      $ govc disk.snapshot.rm -dc=<datacenterName> -ds=<datastoreName> <volumeId> <snapshotId>

      Example:

      $ govc disk.snapshot.rm -dc="Datacenter-1" -ds="Tanzu-vmfssan" "########-####-####-0000-############" "########-####-####-1111-############"

      [DD-MM-YY HH:MM:SS] Deleting ########-####-####-1111-############..OK
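The volume ID lookup and the snapshot listing from the Linux steps can be chained with two small helpers. This is a sketch under stated assumptions: the function names volume_id_for_pv and snapshot_ids are hypothetical, kubectl's -o jsonpath output is used in place of jq so that the handle is printed without JSON quotes, and govc is assumed to be configured through the GOVC_* environment variables exported earlier.

```shell
# Hypothetical helpers, not part of govc or kubectl.

# Print the CSI volume handle of a PersistentVolume without JSON quotes.
# Argument: PersistentVolume name.
volume_id_for_pv() {
  kubectl get pv "$1" -o jsonpath='{.spec.csi.volumeHandle}'
}

# Print the snapshot IDs (first column of the listing), one per line.
# Arguments: datacenter name, datastore name, volume ID.
snapshot_ids() {
  govc disk.snapshot.ls -k -dc="$1" -ds="$2" "$3" | awk 'NF {print $1}'
}
```

For example, vol=$(volume_id_for_pv <pvName>) followed by snapshot_ids <datacenterName> <datastoreName> "$vol" prints the IDs that would be passed to govc disk.snapshot.rm.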

For Windows:

    1. Download the govc_Windows archive that matches the system architecture from govmomi/releases and extract it to obtain the govc.exe file.

    2. Launch PowerShell and navigate to the extracted folder.

    3. List all the snapshots with the <volumeId>

      .\govc.exe disk.snapshot.ls -k -u <adminUser>:<adminPassword>@<vcenterFqdn> -dc=<datacenterName> -ds=<datastoreName> -l <volumeId>

      -k : Skip verification of the server certificate

      -u : vCenter Server or ESXi URL, including the username and password used for the connection

      -dc : Datacenter name from the vCenter inventory

      -ds : Name of the datastore in use

      -l : Long listing format

      Example:

      .\govc.exe disk.snapshot.ls -k -u <adminUser>:####@vCenter-FQDN -dc="Datacenter-1" -ds="Tanzu-vmfssan" -l "########-####-####-0000-############"

      ########-####-####-1111-############ kanister.fcd.description:########-####-####-2222-############  MM DD HH:MM:SS

      ########-####-####-3333-############  kanister.fcd.description:########-####-####-4444-############  MM DD HH:MM:SS

    4. Delete the snapshots. They can be deleted one after another in any order.

      .\govc.exe disk.snapshot.rm -k -u <adminUser>:####@vCenter-FQDN -dc="Datacenter-1" -ds="Tanzu-vmfssan" "########-####-####-0000-############" "########-####-####-1111-############"

      [DD-MM-YY HH:MM:SS] Deleting ########-####-####-1111-############..OK

Additional Information

Bulk snapshot delete

Note: Manually deleting snapshots from the vSphere side before they have been deleted in the backup solution may cause discrepancies in the backup solution's database.

  • Confirm that the snapshots are orphaned and have already been deleted on the backup solution side.
  • If help is needed to confirm this, engage the backup solution vendor.

$ govc disk.snapshot.ls -dc="<datacenterName>" -ds="<datastoreName>" "<cns-volumeUuid>" | awk '{print $1}' | while read -r snapShot ; do govc disk.snapshot.rm -dc "<datacenterName>" -ds "<datastoreName>" "<cns-volumeUuid>" "$snapShot" ; done

Example:

$ govc disk.snapshot.ls -dc="Datacenter-1" -ds="Tanzu_vmfssan" "########-####-####-0000-############" | awk '{print $1}' | while read -r snapShot ; do govc disk.snapshot.rm -dc "Datacenter-1" -ds "Tanzu_vmfssan" "########-####-####-0000-############" "$snapShot" ; done
[18-10-23 09:28:12] Deleting ########-####-####-1111-############...OK
[18-10-23 09:45:33] Deleting ########-####-####-2222-############...OK
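
Because a bulk delete cannot be undone, it can help to review what would be removed before running it. The following is a minimal sketch of the one-liner above with a dry-run mode. The function name bulk_snapshot_rm and the DRY_RUN variable are hypothetical and are not govc features, and govc is assumed to be configured through the GOVC_* environment variables.

```shell
# Hypothetical wrapper around the bulk delete shown above.
# Arguments: datacenter name, datastore name, CNS volume UUID.
# With DRY_RUN=1 (the default) it only prints the commands it would run;
# set DRY_RUN=0 to actually delete the snapshots.
bulk_snapshot_rm() {
  dc=$1 ds=$2 vol=$3
  govc disk.snapshot.ls -dc="$dc" -ds="$ds" "$vol" | awk 'NF {print $1}' |
  while read -r snap; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run: govc disk.snapshot.rm -dc=$dc -ds=$ds $vol $snap"
    else
      # </dev/null keeps the command from consuming the loop's input.
      govc disk.snapshot.rm -dc="$dc" -ds="$ds" "$vol" "$snap" </dev/null
    fi
  done
}
```

Running bulk_snapshot_rm "<datacenterName>" "<datastoreName>" "<cns-volumeUuid>" first prints the planned deletions; repeating the call with DRY_RUN=0 set performs them.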