After enabling the Velero Operator service, provisioning of PVCs in guest clusters fails
Article ID: 323412
Products
VMware vSphere ESXi
Issue/Introduction
Symptoms: The following can be observed after the Velero Operator is enabled and a second guest cluster is deployed:
kubectl get events
LAST SEEN   TYPE     REASON                 OBJECT                         MESSAGE
3s          Normal   ExternalProvisioning   persistentvolumeclaim/my-pvc   waiting for a volume to be created, either by external provisioner "csi.vsphere.vmware.com" or manually created by system administrator

kubectl get pvc
NAME     STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS              AGE
my-pvc   Pending                                      tanzu-k8s-custom-policy   41s

kubectl get sc
NAME                                PROVISIONER              RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
tanzu-k8s-custom-policy (default)   csi.vsphere.vmware.com   Delete          Immediate           true                   23h
In the vmware-system-csi namespace, the vsphere-csi-controller-* pod is stuck in ContainerCreating.
kubectl get all -n vmware-system-csi
NAME                                          READY   STATUS              RESTARTS   AGE
pod/vsphere-csi-controller-66b875d646-95bq5   0/6     ContainerCreating   0          23h
pod/vsphere-csi-node-458ds                    3/3     Running             0          23h
pod/vsphere-csi-node-t82vt                    3/3     Running             0          23h

NAME                              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/vsphere-csi-node   2         2         2       2            2           <none>          23h

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/vsphere-csi-controller   0/1     1            0           23h

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/vsphere-csi-controller-66b875d646   1         1         0       23h
In the events:
kubectl get events -n vmware-system-csi
LAST SEEN   TYPE      REASON        OBJECT                                        MESSAGE
38s         Warning   FailedMount   pod/vsphere-csi-controller-66b875d646-95bq5   MountVolume.SetUp failed for volume "pvcsi-provider-volume" : secret "pvcsi-provider-creds" not found
25m         Warning   FailedMount   pod/vsphere-csi-controller-66b875d646-95bq5   Unable to attach or mount volumes: unmounted volumes=[pvcsi-provider-volume], unattached volumes=[socket-dir vsphere-csi-controller-token-jqqln pvcsi-provider-volume pvcsi-config-volume]: timed out waiting for the condition
45m         Warning   FailedMount   pod/vsphere-csi-controller-66b875d646-95bq5   Unable to attach or mount volumes: unmounted volumes=[pvcsi-provider-volume], unattached volumes=[pvcsi-config-volume socket-dir vsphere-csi-controller-token-jqqln pvcsi-provider-volume]: timed out waiting for the condition
29m         Warning   FailedMount   pod/vsphere-csi-controller-66b875d646-95bq5   Unable to attach or mount volumes: unmounted volumes=[pvcsi-provider-volume], unattached volumes=[vsphere-csi-controller-token-jqqln pvcsi-provider-volume pvcsi-config-volume socket-dir]: timed out waiting for the condition
4m58s       Warning   FailedMount   pod/vsphere-csi-controller-66b875d646-95bq5   Unable to attach or mount volumes: unmounted volumes=[pvcsi-provider-volume], unattached volumes=[pvcsi-provider-volume pvcsi-config-volume socket-dir vsphere-csi-controller-token-jqqln]: timed out waiting for the condition
Note: The preceding log excerpts are only examples. Dates, times, and environment-specific values may vary depending on your environment.
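To confirm the symptom from inside an affected guest cluster, a check along the following lines may help. This is only a sketch: the namespace and secret names are taken from the FailedMount events above, while the function name check_pvcsi_secret is hypothetical and kubectl is assumed to already point at the guest cluster.

```shell
#!/usr/bin/env bash
# Sketch: report whether the secret named in the FailedMount events exists.
# Names below come from the events shown in this article.
CSI_NS="vmware-system-csi"
CSI_SECRET="pvcsi-provider-creds"

# check_pvcsi_secret (hypothetical helper name)
# Returns 0 if the secret exists, 1 if it is missing.
check_pvcsi_secret() {
  if kubectl -n "$CSI_NS" get secret "$CSI_SECRET" >/dev/null 2>&1; then
    echo "secret $CSI_SECRET present in $CSI_NS"
  else
    echo "secret $CSI_SECRET missing in $CSI_NS"
    return 1
  fi
}

# Example:
# check_pvcsi_secret
```

If the secret is reported as missing while the vsphere-csi-controller pod is in ContainerCreating, the guest cluster matches the symptoms described here.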
Environment
VMware vSphere 7.0.x
Cause
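The log /var/log/pods/vmware-system-tkg_vmware-system-tkg-controller-manager-xxxx/manager/x.log shows that the provider service account controller cannot sync the secret because the velero-vsphere-plugin-backupdriver namespace does not exist: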
2021-04-20T01:47:20.021506556Z stderr F E0420 01:47:20.021328 1 serviceaccount_controller.go:147] vmware-system-tkg-controller-manager/provider-serviceaccount-controller/development/test-velero "msg"="Error ensuring provider serviceaccounts" "error"="unable to sync secret for provider serviceaccount test-velero-pvbackupdriver: namespaces \"velero-vsphere-plugin-backupdriver\" not found"
2021-04-20T01:47:20.022474396Z stderr F E0420 01:47:20.022104 1 controller.go:257] controller-runtime/controller "msg"="Reconciler error" "error"="unable to sync secret for provider serviceaccount test-velero-pvbackupdriver: namespaces \"velero-vsphere-plugin-backupdriver\" not found" "controller"="provider-serviceaccount-controller" "name"="test-velero" "namespace"="development"
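As a rough way to pull the missing namespace name out of log lines like these, the sketch below filters one or more log files for the "unable to sync secret" error. The function name missing_namespaces is hypothetical, and the log file path must be supplied by the caller, since the exact pod and file names differ per environment.

```shell
#!/usr/bin/env bash
# Sketch: list the namespaces that the provider serviceaccount controller
# reports as "not found" in the given log files.

# missing_namespaces LOGFILE... (hypothetical helper name)
missing_namespaces() {
  grep -h 'unable to sync secret for provider serviceaccount' "$@" \
    | sed -n 's/.*namespaces \\*"\([^\\"]*\)\\*" not found.*/\1/p' \
    | sort -u
}

# Example (substitute the actual log file path from your environment):
# missing_namespaces /path/to/manager/log
```

Each distinct namespace is printed once, even when the error repeats on every reconcile attempt.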
Resolution
VMware is working on a fix, to be delivered in a future release (7.0U3 p03).
Workaround: Either install the Velero plugin in each guest cluster, or create the `velero-vsphere-plugin-backupdriver` namespace in each guest cluster. After that, the controller will eventually create all the secrets.
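For the second option, creating the namespace, a minimal sketch is shown below. It assumes each guest cluster is reachable through its own kubeconfig context. The context names in the example and the function name ensure_backupdriver_ns are placeholders, not values from this article.

```shell
#!/usr/bin/env bash
# Sketch: create the namespace named in the controller error in each
# guest cluster, skipping clusters where it already exists.
NS="velero-vsphere-plugin-backupdriver"

# ensure_backupdriver_ns CONTEXT... (hypothetical helper name)
# Each CONTEXT is a kubeconfig context for one guest cluster.
ensure_backupdriver_ns() {
  local ctx
  for ctx in "$@"; do
    if kubectl --context "$ctx" get namespace "$NS" >/dev/null 2>&1; then
      echo "$ctx: namespace $NS already exists"
    else
      kubectl --context "$ctx" create namespace "$NS"
    fi
  done
}

# Example (context names are placeholders):
# ensure_backupdriver_ns guest-cluster-1 guest-cluster-2
```

Because the secrets are created eventually rather than immediately, allow some time before checking that the vsphere-csi-controller pod has left ContainerCreating, for example with kubectl get pods -n vmware-system-csi.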
Impact/Risks: After the Velero Operator is enabled, PVCs cannot be provisioned. Every newly deployed TKG cluster hits the same issue from then on.