Updating Configured Datacenter in CSI for a Class-Based TKGm 2.3.x Workload Cluster

Article ID: 375924

Updated On:

Products

VMware Tanzu Kubernetes Grid Management

Issue/Introduction

After a datacenter is renamed in vSphere, or the datacenter name is manually updated on a TKGm cluster, modifications to the CSI secrets that contain the datacenter value may not be reconciled.

You may encounter an error similar to the one below in the node-driver-registrar container of the vsphere-csi-node pod on the affected cluster:

E0821 09:26:04.655989 1 main.go:123] Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: rpc error: code = Internal desc = failed to retrieve topology information for Node: "test-cluster-k7uic-7dd856kl9jsgxk-qdayj". Error: "failed to retrieve nodeVM \"424kdll2-4h51-8751-a129-7fjk8oips6as\" using the node manager. Error: datacenter '/datacenter0' not found", restarting registration container

All pods in the vsphere-csi-node DaemonSet will also be in the CrashLoopBackOff state.
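To confirm the symptom from the workload cluster context, a check along the following lines may help. This is a minimal sketch, not part of the original procedure: the function names are hypothetical, and it assumes that pod names begin with vsphere-csi-node and that `kubectl get pods` prints its default columns (NAME, READY, STATUS, ...).

```shell
# Print the names of vsphere-csi-node pods whose STATUS column is CrashLoopBackOff.
# Reads `kubectl get pods --no-headers` output on stdin.
# Assumption: default column order NAME READY STATUS RESTARTS AGE.
crashlooping_csi_nodes() {
  awk '$1 ~ /^vsphere-csi-node/ && $3 == "CrashLoopBackOff" { print $1 }'
}

# Run in the workload cluster context; defined here but not invoked automatically.
list_crashlooping_csi_nodes() {
  kubectl get pods -n kube-system --no-headers | crashlooping_csi_nodes
}

# Show the node-driver-registrar log of one pod (pod name passed as $1).
show_registrar_log() {
  kubectl logs -n kube-system "$1" -c node-driver-registrar
}
```

Calling list_crashlooping_csi_nodes and then show_registrar_log with one of the printed pod names should surface the "datacenter ... not found" error if the cluster is affected.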

Cause

Due to a bug in TKGm 2.3.x, modifications to the <cluster-name>-vsphere-csi-data-values secret on the management cluster are not propagated to the CSI secrets on the workload cluster, including vsphere-config-secret in the kube-system namespace. As a result, the CSI pods remain in the CrashLoopBackOff state even after the secret is changed.
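To see which datacenter the workload cluster is currently configured with, the contents of vsphere-config-secret can be decoded and inspected. The sketch below is illustrative only. It assumes the secret stores its configuration under a key named csi-vsphere.conf and that the file uses a line of the form datacenters = "...", as in the upstream vSphere CSI driver; verify both against your cluster. The function names are hypothetical.

```shell
# Extract the value of every `datacenters = "..."` line from a CSI config on stdin.
# Assumption: INI-style csi-vsphere.conf as used by the upstream vSphere CSI driver.
extract_datacenters() {
  sed -n -E 's/^[[:space:]]*datacenters[[:space:]]*=[[:space:]]*"?([^"]*)"?[[:space:]]*$/\1/p'
}

# Run in the workload cluster context; defined here but not invoked automatically.
# Assumption: the secret key is named csi-vsphere.conf.
show_workload_datacenter() {
  kubectl get secret vsphere-config-secret -n kube-system \
    -o jsonpath='{.data.csi-vsphere\.conf}' | base64 -d | extract_datacenters
}
```

If show_workload_datacenter still prints the old datacenter path after the data-values secret was changed on the management cluster, the cluster is likely affected by this issue.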

Resolution

1. In the management cluster context, list the vsphereCSIConfig objects in the default namespace:

kubectl get vsphereCSIConfig

2. Edit the vsphereCSIConfig associated with the workload cluster where the issue is observed:

kubectl edit vsphereCSIConfig test-cluster-1

3. Set the "datacenter" field to the correct name and save:

spec:
  vsphereCSI:
    config:
      datacenter: /new-datacenter-id

After completing this step, all secrets should be updated to reflect the newly configured datacenter. Ensure that you follow steps 4 and 5 to apply these changes to the pods.

4. In the workload cluster context, restart all vsphere-csi-controller pods:

kubectl rollout restart deployment vsphere-csi-controller -n kube-system

5. Restart the vsphere-csi-node DaemonSet:

kubectl rollout restart daemonset -n kube-system vsphere-csi-node

All vsphere-csi-node pods should now be in the Running state and use the updated datacenter configuration.
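Where an interactive edit is inconvenient, steps 3 to 5 can be approximated non-interactively. The following is a sketch under stated assumptions rather than the documented procedure: it replaces kubectl edit with a JSON merge patch against the same spec.vsphereCSI.config.datacenter field, the function names are hypothetical, and the 5m timeout is an arbitrary illustrative value.

```shell
# Build a JSON merge patch that sets spec.vsphereCSI.config.datacenter.
# Assumption: the datacenter path contains no characters that need JSON escaping.
datacenter_patch() {
  printf '{"spec":{"vsphereCSI":{"config":{"datacenter":"%s"}}}}' "$1"
}

# Step 3, management cluster context: patch instead of editing interactively.
# $1 = vsphereCSIConfig name (e.g. test-cluster-1), $2 = new datacenter path.
patch_csi_datacenter() {
  kubectl patch vsphereCSIConfig "$1" --type merge -p "$(datacenter_patch "$2")"
}

# Steps 4 and 5, workload cluster context: restart both workloads, then wait
# for each rollout to finish (or time out).
restart_csi() {
  kubectl rollout restart deployment vsphere-csi-controller -n kube-system &&
  kubectl rollout restart daemonset vsphere-csi-node -n kube-system &&
  kubectl rollout status deployment vsphere-csi-controller -n kube-system --timeout=5m &&
  kubectl rollout status daemonset vsphere-csi-node -n kube-system --timeout=5m
}
```

For example, patch_csi_datacenter test-cluster-1 /new-datacenter-id would be run against the management cluster, followed by restart_csi against the workload cluster. The functions only wrap kubectl, so switching contexts between the two calls remains a manual step.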
