vSphere CSI Driver Integration doc - https://docs.pivotal.io/tkgi/1-12/vsphere-cns.html#uninstall-csi.
After enabling the vSphere CSI Driver Integration on the TKGI tile and applying the changes successfully, a cluster upgrade fails on a worker node because csi-node-registrar is down. The csi-node-registrar stderr log shows the following error:
I0721 08:29:18.240065 10505 main.go:113] Version: v2.1.0-0-g80d42f2
I0721 08:29:18.240719 10505 main.go:137] Attempting to open a gRPC connection with: "/var/vcap/data/kubelet/plugins/csi.vsphere.vmware.com/csi.sock"
I0721 08:29:18.240737 10505 connection.go:153] Connecting to unix:///var/vcap/data/kubelet/plugins/csi.vsphere.vmware.com/csi.sock
I0721 08:29:18.241191 10505 main.go:144] Calling CSI driver to discover driver name
I0721 08:29:18.241213 10505 connection.go:182] GRPC call: /csi.v1.Identity/GetPluginInfo
I0721 08:29:18.241219 10505 connection.go:183] GRPC request: {}
I0721 08:29:18.243326 10505 connection.go:185] GRPC response: {"name":"csi.vsphere.vmware.com","vendor_version":"v2.3.0"}
I0721 08:29:18.243381 10505 connection.go:186] GRPC error: <nil>
I0721 08:29:18.243389 10505 main.go:154] CSI driver name: "csi.vsphere.vmware.com"
I0721 08:29:18.243462 10505 node_register.go:52] Starting Registration Server at: /var/vcap/data/kubelet/plugins_registry/csi.vsphere.vmware.com-reg.sock
I0721 08:29:18.243607 10505 node_register.go:61] Registration Server started at: /var/vcap/data/kubelet/plugins_registry/csi.vsphere.vmware.com-reg.sock
I0721 08:29:18.243660 10505 node_register.go:86] Starting healthz server at HTTP endpoint: :9809
F0721 08:29:18.248887 10505 node_register.go:105] listen tcp :9809: bind: address already in use
goroutine 4 [running]:
If a manually installed CSI driver is in use, it occupies port 9809, which is also the default healthz port for csi-node-registrar. The error "listen tcp :9809: bind: address already in use" is therefore expected: the manual CSI installation conflicts with the automatic CSI installation.
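The conflict described above can be reproduced in isolation. The following is a minimal Python sketch, not part of TKGI or the CSI driver: it binds one listener to a free local port and then attempts a second bind on the same port, which is the situation csi-node-registrar runs into on 9809. The function name `second_bind_errno` is hypothetical, and the sketch uses an ephemeral port rather than 9809 so it can run anywhere.

```python
import errno
import socket
from typing import Optional


def second_bind_errno(host: str = "127.0.0.1") -> Optional[int]:
    """Bind two TCP sockets to the same port; return the errno of the second bind.

    Returns None if the second bind unexpectedly succeeds.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port, standing in for 9809.
        first.bind((host, 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Same address and port as the first listener.
            second.bind((host, port))
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    err = second_bind_errno()
    print("address already in use:", err == errno.EADDRINUSE)
```

Whichever process binds the port first keeps it, so the second process fails at startup instead of sharing the port.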
Follow the steps below to work around this issue:
1. Change the port 9809 to a different port value (such as 9909) in the manual CSI YAML file: https://github.com/kubernetes-sigs/vsphere-csi-driver/blob/v2.3.1/manifests/vanilla/vsphere-csi-driver.yaml
   - name: liveness-probe
     image: quay.io/k8scsi/livenessprobe:v2.2.0
     args:
       - "--v=4"
       - "--csi-address=/csi/csi.sock"
     volumeMounts:
       - name: plugin-dir
         mountPath: /csi
3. Apply the manifest after making the above changes.
4. Switch the manual CSI installation to the automatic CSI installation per the guide: https://docs.pivotal.io/tkgi/1-12/vsphere-cns.html#uninstall-csi
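The port change in step 1 can also be scripted. The sketch below is one possible approach, assuming the port appears in the manifest as the literal number 9809; the helper name `replace_port` and the sample fragment are hypothetical and are not copied from the upstream manifest. Review the result before applying it.

```python
import re
from typing import Tuple


def replace_port(manifest_text: str, old: int = 9809, new: int = 9909) -> Tuple[str, int]:
    """Replace standalone occurrences of `old` with `new` in manifest text.

    Returns the rewritten text and the number of replacements made.
    Digits that are part of a longer number (e.g. 19809) are left alone.
    """
    pattern = rf"(?<!\d){old}(?!\d)"
    return re.subn(pattern, str(new), manifest_text)


if __name__ == "__main__":
    # Illustrative fragment only; the real manifest layout may differ.
    sample = (
        'args:\n'
        '  - "--health-port=9809"\n'
        'ports:\n'
        '  - containerPort: 9809\n'
    )
    updated, count = replace_port(sample)
    print(f"replaced {count} occurrence(s)")
    print(updated)
```

A replacement count of zero would suggest the manifest expresses the port differently than assumed here, in which case the file should be edited by hand.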
The port conflict prevents the csi-node-registrar process from starting. The node status shows as "Failing", and the upgrade cannot complete.