When you SSH into a worker node and run monit summary,
you see:
Process 'containerd' running
Process 'kubelet' running
Process 'kube-proxy' running
Process 'disk-pressure-watch' running
Process 'csi-node-registrar' Does not exist
Process 'csi-node' running
Process 'csi-livenessprobe' running
Process 'blackbox' running
Process 'nsx-node-agent' running
Process 'ovsdb-server' running
Process 'ovs-vswitchd' running
Process 'nsx-kube-proxy' running
Process 'telegraf' running
Process 'node_exporter' running
Process 'bosh-dns' running
Process 'bosh-dns-resolvconf' running
Process 'bosh-dns-healthcheck' running
Process 'system-metrics-agent' running
In the log /var/vcap/sys/log/csi-node-service/csi-node-driver-registrar.stderr.log
you see this error:
I0127 20:07:29.138316 338074 main.go:121] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:false,Error:RegisterPlugin error -- plugin registration failed with err: rpc error: code = Internal desc = failed to fetch node object with name "######-####-####-####-#########". Error: nodes "######-####-####-####-#########" not found,}
E0127 20:07:29.138333 338074 main.go:123] Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: rpc error: code = Internal desc = failed to fetch node object with name "######-####-####-####-#########". Error: nodes "######-####-####-####-#########" not found, restarting registration container.
TKGI: 1.19.x
Network disconnects between nodes might cause this condition.
Run monit restart csi-node
on the affected worker node, and the csi-node-registrar process should come back up.
If it does not come back up, please open an SR with the Broadcom support team.
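To confirm the result of the restart, the monit summary output can be filtered for any process that is not in the running state. The following is a minimal sketch, not an official tool: the helper name not_running is hypothetical, and it assumes the output format matches the summary shown above (Process 'name' status). On TKGI worker nodes monit typically has to be run as root.

```shell
# Hypothetical helper: print the names of monit processes whose status
# is anything other than "running". Reads `monit summary` text on stdin.
# Assumes lines of the form: Process 'name' status
not_running() {
  awk -F"'" '/^Process / {
    status = $3
    sub(/^[ \t]+/, "", status)   # trim leading whitespace
    sub(/[ \t]+$/, "", status)   # trim trailing whitespace
    if (status != "running") print $2
  }'
}

# Usage on the worker node (as root), before and after the restart:
#   monit summary | not_running
#   monit restart csi-node
#   monit summary | not_running
```

With the summary shown above, the first check would list csi-node-registrar. Empty output after the restart means every process reports running; allow the restart some time to complete before the second check.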