This KB describes how to manually delete orphaned snapshots that are attached to a First Class Disk (FCD).
Symptoms:
- Volume expansion on a guest cluster fails.
7s (x601 over 41h) Warning VolumeResizeFailed PersistentVolumeClaim/data-postgresql-0 resize volume "pvc-3b0cf013-2ec9-4177-924c-f71ce25f91a6" by resizer "csi.vsphere.vmware.com" failed: rpc error: code = Internal desc = failed to expand volume b83ce058-e082-4cdb-bfc3-6219bda0fdc4-3b0cf013-2ec9-4177-924c-f71ce25f91a6 in namespace prodns of supervisor cluster. Error: supervisor persistentVolumeClaim b83ce058-e082-4cdb-bfc3-6219bda0fdc4-3b0cf013-2ec9-4177-924c-f71ce25f91a6 in namespace prodns not in "FileSystemResizePending" condition within 240 seconds
- The disk that backs a volume has snapshots that have been taken by a snapshot-based backup solution.
$ govc disk.snapshot.ls -k -dc="Datacenter-1" -ds="Tanzu-vmfssan" -l "008d3b91-1756-49cd-9db3-fbf4699887fd"
9fc48a39-3be2-4183-b645-ebbf4efb2467 kanister.fcd.description:6c254147-66e6-11ee-af2f-e6b8f1567dd3 MM DD HH:MM:SS
4b81583d-ccea-44e2-a577-4305e82ad02a kanister.fcd.description:e7634189-5fd3-11ee-af2f-e6b8f1567dd3 MM DD HH:MM:SS
If a virtual disk that backs a volume has snapshots, it cannot be resized.
Get the volume ID of the problematic volume, as described in the first point of the Resolution section of KB 305322.
Note: The steps below are also valid for TKGi/TKGm. To get the volume ID, retrieve the PV in JSON format and read its volumeHandle field:
$ kubectl get pv pvc-f865d13c-c0c4-46fc-828b-aeebc4a649fe -o json | jq .spec.csi.volumeHandle
For Linux:
$ wget https://github.com/vmware/govmomi/releases/download/v0.32.0/govc_Linux_x86_64.tar.gz
$ tar -zxf govc_Linux_x86_64.tar.gz
Note: you can get the latest govc release from the following page:
https://github.com/vmware/govmomi/releases
$ sudo mv govc /usr/local/bin/
$ which govc
$ govc version
$ export GOVC_URL=<vCenter_FQDN>
$ export GOVC_USERNAME=<administrator_username>
$ export GOVC_PASSWORD=<administrator_password>
$ export GOVC_INSECURE=true
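Before running any govc command, it can help to confirm that the connection variables above are actually set in the current shell. The following is a minimal sketch, not part of govc itself; the function name `check_govc_env` is hypothetical and it only checks that the three variables are non-empty, not that the credentials are valid.

```shell
# Hypothetical helper: report which govc connection variables are unset or empty.
# Prints "ok" when all are set, otherwise "missing: <names>" and returns 1.
check_govc_env() {
  missing=""
  for name in GOVC_URL GOVC_USERNAME GOVC_PASSWORD; do
    # Indirect variable lookup that also works in POSIX sh
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}

# Usage: check_govc_env && govc version
```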
$ govc disk.snapshot.ls -k -dc=<datacenter-name> -ds=<datastore-name> -l <volume-id>
Example:
$ govc disk.snapshot.ls -k -dc="Datacenter-1" -ds="Tanzu-vmfssan" -l "008d3b91-1756-49cd-9db3-fbf4699887fd"
9fc48a39-3be2-4183-b645-ebbf4efb2467 kanister.fcd.description:6c254147-66e6-11ee-af2f-e6b8f1567dd3 MM DD HH:MM:SS
4b81583d-ccea-44e2-a577-4305e82ad02a kanister.fcd.description:e7634189-5fd3-11ee-af2f-e6b8f1567dd3 MM DD HH:MM:SS
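When a disk has many snapshots, the snapshot IDs can be pulled out of the listing instead of being copied by hand. The sketch below assumes the long listing is whitespace-separated with the snapshot ID in the first column and the description in the second, as in the example output above; the function name `snapshot_ids` is hypothetical.

```shell
# Hypothetical helper: read a "govc disk.snapshot.ls -l" listing on stdin and
# print the snapshot ID (column 1) of every line whose description (column 2)
# starts with the prefix passed as the first argument.
snapshot_ids() {
  awk -v prefix="$1" 'index($2, prefix) == 1 { print $1 }'
}

# Usage:
#   govc disk.snapshot.ls -k -dc=<datacenter-name> -ds=<datastore-name> -l <volume-id> \
#     | snapshot_ids "kanister.fcd.description:"
```

Filtering on the description prefix keeps snapshots created by other tools out of the result; the IDs printed still need to be confirmed as orphaned before anything is deleted.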
$ govc disk.snapshot.rm -dc <datacenter-name> -ds <datastore-name> <volume-id> <snapshot-id>
Example:
$ govc disk.snapshot.rm -dc="Datacenter-1" -ds="Tanzu-vmfssan" "008d3b91-1756-49cd-9db3-fbf4699887fd" "9fc48a39-3be2-4183-b645-ebbf4efb2467"
[DD-MM-YY HH:MM:SS] Deleting 9fc48a39-3be2-4183-b645-ebbf4efb2467...OK
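If several snapshots have to be removed from the same disk, the delete command can be wrapped in a loop. This is a rough sketch under stated assumptions: the function name `delete_snapshots` and the `DRY_RUN` variable are hypothetical, and the input is expected to be a list of snapshot IDs, one per line, that have already been confirmed as orphaned. By default it only prints the commands it would run.

```shell
# Hypothetical helper: delete each snapshot ID read from stdin (one per line)
# from the given volume. With DRY_RUN=1 (the default in this sketch) the govc
# command is only printed; set DRY_RUN=0 to actually delete.
delete_snapshots() {
  dc="$1"; ds="$2"; volume_id="$3"
  while IFS= read -r sid; do
    [ -n "$sid" ] || continue   # skip blank lines
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "govc disk.snapshot.rm -dc=$dc -ds=$ds $volume_id $sid"
    else
      # stdin is redirected so govc cannot consume the remaining IDs
      govc disk.snapshot.rm -dc="$dc" -ds="$ds" "$volume_id" "$sid" < /dev/null
    fi
  done
}

# Usage (orphaned-ids.txt is a hypothetical file of confirmed snapshot IDs):
#   delete_snapshots "Datacenter-1" "Tanzu-vmfssan" "<volume-id>" < orphaned-ids.txt
```

Defaulting to a dry run is a deliberate choice here, since a deleted FCD snapshot cannot be restored from the vSphere side.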
For Windows:
$ .\govc.exe disk.snapshot.ls -k -u user:password@host -dc=<datacenter-name> -ds=<datastore-name> -l <volume-id>
-k : Skip verification of server certificate
-u : vCenter or ESXi URL to be specified with username and password to be used for connection
-dc : Datacenter name from the vCenter inventory
-ds : Name of the datastore in use
$ .\govc.exe disk.snapshot.ls -k -u <username>:<password>@vCenter-FQDN -dc='example-datacenter' -ds='example-datastore-name-1' -l 'volume-id-from step-1'
$ .\govc.exe disk.snapshot.rm -k -u <username>:<password>@vCenter-FQDN -dc='example-datacenter' -ds='datastore-name-01' <volume-id> 3c5394ca-4592-4384-ae7f-a162fce93fb8
[DD-MM-YY HH:MM:SS] Deleting 3c5394ca-4592-4384-ae7f-a162fce93fb8...OK
Deleting snapshots manually from the vSphere side before they have been deleted from the backup solution side may cause discrepancies in the backup solution's database. The customer must therefore confirm that the snapshots are orphaned and have already been deleted from the backup solution side. If help is needed to confirm this, engage the backup solution vendor.