The vsphere-csi-controller, vsphere-syncer, and liveness-probe containers are running in the vmware-system-csi namespace.
The csi-attacher, csi-provisioner, csi-resizer, and csi-snapshotter containers are stuck in CrashLoopBackOff.
The csi-attacher and csi-resizer logs contain entries similar to:
YYYY-MM-DDTHH:MM:SS.MSZ stderr F I0410 HH:MM:SS.MS 1 connection.go:253] "Still connecting" address="unix:///csi/csi.sock"
YYYY-MM-DDTHH:MM:SS.MSZ stderr F E0410 HH:MM:SS.MS 1 main.go:156] "Failed to connect to the CSI driver" err="context deadline exceeded" csiAddress="/csi/csi.sock"
vsphere-csi-controller logs contain entries similar to:
YYYY-MM-DDTHH:MM:SS.MSZ stderr F E0410 HH:MM:SS.MS 1 reflector.go:205] "Failed to watch" err="failed to list topology.tanzu.vmware.com/v1alpha1, Resource=zones: zones.topology.tanzu.vmware.com is forbidden: User \"system:serviceaccount:namespacename:XXX-XXX-pvcsi\" cannot list resource \"zones\" in API group \"topology.tanzu.vmware.com\" in the namespace \"XXXX-NS\"" logger="UnhandledError" reflector="pkg/mod/k8s.io/client-go@vX.Y.Z/tools/cache/reflector.go:290" type="topology.tanzu.vmware.com/v1alpha1, Resource=zones"
vSphere with Tanzu 8.0
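The pvCSI service account used by the vsphere-csi-controller is missing the Role and RoleBinding that allow it to list and watch the zones resource (API group topology.tanzu.vmware.com) in the Supervisor namespace.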
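To confirm this cause before changing anything, one option is to ask the Supervisor API server directly whether the service account may list zones. The sketch below wraps kubectl auth can-i with impersonation. The function name check_zones_access is hypothetical, the namespace and service account are the placeholders used in this article, and it assumes your own user is allowed to impersonate service accounts on the Supervisor Cluster.

```shell
#!/bin/sh
# Hypothetical helper: ask the API server whether the pvCSI service account
# may perform a verb on zones.topology.tanzu.vmware.com in a namespace.
# Usage: check_zones_access <namespace> <serviceaccount> [verb]
check_zones_access() {
    ns="$1"
    sa="$2"
    verb="${3:-list}"   # defaults to the verb reported in the error log
    kubectl auth can-i "$verb" zones.topology.tanzu.vmware.com \
        --as="system:serviceaccount:${ns}:${sa}" \
        -n "$ns"
}
```

Running check_zones_access namespacename XXX-XXX-pvcsi against the Supervisor Cluster should print no while the permission is missing, and yes once a Role and RoleBinding granting it are in place.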
Execute the following steps against the Supervisor Cluster to grant the required permissions to the CSI service account.
1. Apply RBAC Patch: Create a Role and RoleBinding to grant the missing topology permissions within the target Supervisor namespace.
Save the following as pvcsi-rbac-patch.yaml:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pvcsi-topology-reader
  namespace: namespacename
rules:
- apiGroups: ["topology.tanzu.vmware.com"]
  resources: ["zones"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pvcsi-topology-reader-binding
  namespace: namespacename
subjects:
- kind: ServiceAccount
  name: XXX-XXX-pvcsi
  namespace: namespacename
roleRef:
  kind: Role
  name: pvcsi-topology-reader
  apiGroup: rbac.authorization.k8s.io
2. Apply the configuration to the Supervisor Cluster:
kubectl apply -f pvcsi-rbac-patch.yaml
3. Restart the Component: Switch context to the Guest Cluster and force the CSI controller pods to restart.
This clears the active error loop and makes the new pods retry the watch with the updated RBAC permissions.
kubectl rollout restart deployment vsphere-csi-controller -n vmware-system-csi
4. Verify Resolution: Monitor the logs of the newly spawned vsphere-csi-controller pods in the Guest Cluster to confirm the watch operation succeeds.
kubectl logs -l app=vsphere-csi-controller -c vsphere-csi-controller -n vmware-system-csi --tail=50
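The restart and verification steps above can also be scripted. The following is a minimal sketch, not an official procedure: the function name verify_csi_controller and the 180s default timeout are assumptions, and it reads logs from all containers so that it does not depend on a particular container name. It waits for the rollout to finish and then checks recent logs for the "is forbidden" error shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: run against the Guest Cluster after the restart.
# Usage: verify_csi_controller [timeout]   (timeout defaults to 180s)
verify_csi_controller() {
    ns="vmware-system-csi"
    # Wait until the restarted deployment reports its replicas as ready.
    kubectl rollout status deployment/vsphere-csi-controller \
        -n "$ns" --timeout="${1:-180s}" || return 1
    # Fail if recent logs from any container still show the RBAC error.
    if kubectl logs -l app=vsphere-csi-controller --all-containers \
        -n "$ns" --tail=200 | grep -q 'is forbidden'; then
        echo "zones watch is still forbidden"
        return 1
    fi
    echo "no forbidden errors in recent logs"
}
```

A non-zero return suggests that either the rollout did not complete in time or the service account still lacks access to zones, in which case the Role and RoleBinding on the Supervisor Cluster are worth re-checking.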