Newly created vSphere CSI node pods in a VKS cluster are stuck in the CrashLoopBackOff (CLBO) state with the following errors:
failed to retrieve topology information for Node: "<node-name>". Error: "failed to fetch topology information for the worker node \"<node-name>\". Error: failed to get VirtualMachines for the node: \"<node-name>\". Error: Unauthorized",}
Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: rpc error: code = Internal desc = failed to retrieve topology information for Node: "<node-name>". Error: "failed to fetch topology information for the worker node \"<node-name>\". Error: failed to get VirtualMachines for the node: \"<node-name>\". Error: Unauthorized", restarting registration container.
vSphere Kubernetes Service
The CSI controller deployment in the VKS cluster connects to the Supervisor cluster using a token stored in the secret pvcsi-provider-creds on the VKS cluster. If CSI cannot load the latest version of this secret with the correct credentials, all connections from CSI in the VKS cluster to the Supervisor cluster fail with an Unauthorized error.
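To check whether this cause applies, a minimal sketch along the following lines may help. It assumes the secret pvcsi-provider-creds lives in the vmware-system-csi namespace (the namespace used by the restart command in this article) and that kubectl is pointed at the VKS cluster. The function name check_pvcsi_creds is hypothetical.

```shell
# Assumption: the secret is in the same namespace as the CSI deployment.
NS="vmware-system-csi"

# Hypothetical helper: confirm the secret exists, then count Unauthorized log lines.
check_pvcsi_creds() {
  # Show the secret and its age; fail early if it is missing.
  kubectl get secret pvcsi-provider-creds -n "$NS" || return 1
  # Count recent controller log lines that mention "unauthorized" (case-insensitive).
  kubectl logs deployment/vsphere-csi-controller -n "$NS" --all-containers --tail=200 \
    | grep -ci "unauthorized"
}
```

Running check_pvcsi_creds prints the secret followed by the number of matching log lines. A non-zero count alongside a recently updated secret is consistent with the controller holding stale credentials.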
Workaround:
Restart the vSphere CSI driver deployment in the VKS cluster and wait for the pods to reach the Running state.
kubectl rollout restart deployment vsphere-csi-controller -n vmware-system-csi
The vsphere-csi-node pods stuck in CLBO will also return to the Running state after the controller restarts.
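Recovery can be confirmed with something like the sketch below. It uses the deployment name and namespace from the restart command above. The function name check_csi_recovery and the 300-second timeout are illustrative choices, not values from the product documentation.

```shell
# Namespace taken from the restart command in this article.
NS="vmware-system-csi"

# Hypothetical helper: wait for the controller rollout, then list CSI pod status.
check_csi_recovery() {
  # Blocks until the rollout completes or the (arbitrary) timeout expires.
  kubectl rollout status deployment/vsphere-csi-controller -n "$NS" --timeout=300s || return 1
  # List all CSI pods; vsphere-csi-node pods should move from CrashLoopBackOff to Running.
  kubectl get pods -n "$NS" -o wide
}
```

Run check_csi_recovery after the restart. If the node pods are still restarting, rerun the pod listing after a few minutes, since the back-off delay can postpone their next start attempt.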