vSphere CSI Node pods on VKS cluster stuck in CLBO state with "Unauthorized" error

Article ID: 411871

Products

VMware vSphere Kubernetes Service

Issue/Introduction

Newly created vSphere CSI node pods in a VKS cluster are stuck in the CrashLoopBackOff (CLBO) state and log the following errors:

failed to retrieve topology information for Node: "<node-name>". Error: "failed to fetch topology information for the worker node \"<node-name>\". Error: failed to get VirtualMachines for the node: \"<node-name>\". Error: Unauthorized",}

Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: rpc error: code = Internal desc = failed to retrieve topology information for Node: "<node-name>". Error: "failed to fetch topology information for the worker node \"<node-name>\". Error: failed to get VirtualMachines for the node: \"<node-name>\". Error: Unauthorized", restarting registration container.
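To check whether a cluster shows this symptom, the pod state and recent container logs can be inspected with standard kubectl commands. The following is a minimal sketch, not an official procedure: the helper names csi_pods and csi_pod_logs are hypothetical, and the namespace is taken from the workaround command in this article.

```shell
# Namespace taken from the workaround command in this article.
CSI_NS=vmware-system-csi

# List the CSI pods with their status, restart count, and node.
csi_pods() {
  kubectl get pods -n "$CSI_NS" -o wide
}

# Print the last 50 log lines from every container of one pod.
# $1 is a pod name copied from the csi_pods output.
csi_pod_logs() {
  kubectl logs "$1" -n "$CSI_NS" --all-containers --tail=50
}
```

Run csi_pods to find a vsphere-csi-node pod in CrashLoopBackOff, then run csi_pod_logs with that pod name and look for the "Unauthorized" messages shown above.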

Environment

vSphere Kubernetes Service 

Cause

The CSI controller deployment in the VKS cluster connects to the Supervisor cluster using a token stored in the secret pvcsi-provider-creds on the VKS cluster. If the CSI driver fails to load the latest version of this secret with the correct credentials, all connections from CSI in the VKS cluster to the Supervisor cluster fail with an Unauthorized error.
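As a rough check of this condition, the secret and the start time of the CSI pods can be viewed side by side. This is a sketch under assumptions: it assumes the secret pvcsi-provider-creds lives in the vmware-system-csi namespace, and the helper name csi_secret_check is hypothetical. A controller pod that has been running for a long time is only a hint that it may hold stale credentials, not proof.

```shell
# Assumption: the secret is stored in the same namespace as the CSI driver.
CSI_NS=vmware-system-csi

# Show that the secret exists (with its age), then list the start time
# of each CSI pod for comparison.
csi_secret_check() {
  kubectl get secret pvcsi-provider-creds -n "$CSI_NS" &&
    kubectl get pods -n "$CSI_NS" \
      -o custom-columns=NAME:.metadata.name,STARTED:.status.startTime
}
```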

Resolution

Workaround:

Restart the vSphere CSI controller deployment in the VKS cluster and wait for its pods to reach the Running state.

kubectl rollout restart deployment vsphere-csi-controller -n vmware-system-csi

The vsphere-csi-node pods stuck in CLBO will eventually return to the Running state as well.
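To confirm that the restart completed, the rollout can be watched and the pod list rechecked. The sketch below uses the deployment name and namespace from the restart command; the helper name csi_verify_restart and the 5-minute timeout are illustrative choices, not values from this article.

```shell
# Namespace taken from the restart command in this article.
CSI_NS=vmware-system-csi

# Wait for the restarted controller deployment to finish rolling out,
# then list all CSI pods so the node pods can be checked as well.
csi_verify_restart() {
  kubectl rollout status deployment vsphere-csi-controller -n "$CSI_NS" --timeout=5m &&
    kubectl get pods -n "$CSI_NS"
}
```

If the vsphere-csi-node pods still show CrashLoopBackOff right after the rollout finishes, rerun the pod listing after a few minutes, since each pod recovers only at its next restart attempt.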