infraclassifier or other job stuck in Pending state and vsphere-csi-controller in CrashLoopBackOff
k get pods -A | egrep -vi "runn|compl"
NAMESPACE NAME READY STATUS RESTARTS AGE
nsxi-platform anomalydetectionstreamingjob-3bf6e796f3c9518d-exec-1 0/1 Pending 0 6d3h
nsxi-platform infraclassifier-8646eb9646a0565e-exec-1 0/1 Pending 0 39d
nsxi-platform infraclassifier-8646eb9646a0565e-exec-2 0/1 Pending 0 39d
nsxi-platform infraclassifier-8646eb9646a0565e-exec-3 0/1 Pending 0 39d
nsxi-platform infraclassifier-8646eb9646a0565e-exec-4 0/1 Pending 0 39d
vmware-system-csi vsphere-csi-controller-75f8894c79-sdhck 6/7 CrashLoopBackOff 14105 (4m33s ago) 54d
The infraclassifier pod shows the following error:
k describe pod infraclassifier-8646eb9646a0565e-exec-1 -n nsxi-platform
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 105s (x11490 over 39d) default-scheduler 0/8 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/8 nodes are available: 8 Preemption is not helpful for scheduling.
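The scheduler message points at PersistentVolumeClaims that never bound, so listing the claims in the namespace is a natural next check. The following is a minimal sketch, not part of the original procedure: `unbound_pvcs` is a hypothetical helper name, and it assumes the default `kubectl get pvc -n <namespace>` table layout, where STATUS is the second column.

```shell
# unbound_pvcs: read `kubectl get pvc` output on stdin and keep the header
# line plus every claim whose STATUS column (field 2) is not "Bound".
unbound_pvcs() {
  awk 'NR == 1 || $2 != "Bound"'
}

# Usage (needs cluster access, so it is left commented out here):
#   kubectl get pvc -n nsxi-platform | unbound_pvcs
#   kubectl describe pvc <claim-name> -n nsxi-platform
```

Claims that stay in Pending are consistent with a CSI controller that cannot provision volumes, which is what the next step examines.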
The vsphere-csi-controller pod is crash-looping: its vsphere-csi-controller container is in an error state.
k describe pod vsphere-csi-controller-75f8894c79-sdhck -n vmware-system-csi
--- Output truncated ---
vsphere-csi-controller:
Container ID: cri-o://89a90cba83dd8aa271edda13c9367fcf4ffe8402889c7377db127d90f31caf21
Image: sspi.example.org/registry/1.6.3/cloud-provider-vsphere/csi/release/driver:v3.1.0
Image ID: sspi.example.org/registry/1.6.3/cloud-provider-vsphere/csi/release/driver@sha256:af8887fde54bb0b8c44e821597cbb3b8087e4451b09bb2861d7ac67c66808775
Ports: 9808/TCP, 2112/TCP
Host Ports: 0/TCP, 0/TCP
Args:
--fss-name=internal-feature-states.csi.vsphere.vmware.com
--fss-namespace=$(CSI_NAMESPACE)
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 27 May 2025 20:15:21 +0000
Finished: Tue, 27 May 2025 20:15:21 +0000
Ready: False
Restart Count: 14106
Liveness: http-get http://:healthz/healthz delay=30s timeout=10s period=180s #success=1 #failure=3
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 55m (x14097 over 54d) kubelet Container image "sspi-prod.example.org/registry/1.6.3/cloud-provider-vsphere/csi/release/driver:v3.1.0" already present on machine
Warning BackOff 37s (x339265 over 50d) kubelet Back-off restarting failed container vsphere-csi-controller in pod vsphere-csi-controller-75f8894c79-sdhck_vmware-system-csi(e369c968-57e9-4651-83fb-f4dbc8fc0e43)
The vsphere-csi-controller container logs show this error:
k logs vsphere-csi-controller-75f8894c79-sdhck -n vmware-system-csi -c vsphere-csi-controller
{"level":"error","time":"2025-05-27T20:10:18.974743694Z","caller":"vsphere/virtualcenter.go:672","msg":"failed to connect to VirtualCenter host: \"vc01.example.org\". Err: Post \"https://vc01.example.org:443/sdk\": host \"vc01.example.org:443\" thumbprint does not match \"02:0E:FF:10:2F:0F:9A:A8:AA:77:D9:D0:27:2F:FA:EE:0A:67:24:6D\"","TraceId":"f2027a3e-d083-406e-ae50-49faaccbb6dc","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v3/pkg/common/cns-lib/vsphere.GetVirtualCenterInstanceForVCenterConfig\n\t/build/pkg/common/cns-lib/vsphere/virtualcenter.go:672\nsigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service/vanilla.(*controller).Init\n\t/build/pkg/csi/service/vanilla/controller.go:236\nsigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service.(*vsphereCSIDriver).BeforeServe\n\t/build/pkg/csi/service/driver.go:188\nsigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service.(*vsphereCSIDriver).Run\n\t/build/pkg/csi/service/driver.go:202\nmain.main\n\t/build/cmd/vsphere-csi/main.go:96\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
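The thumbprint error suggests that the certificate vCenter presents no longer matches the one the CSI driver trusts. The following is a minimal sketch for checking this from a machine that can reach vCenter. It assumes `openssl` is installed and that the value quoted in the log is the SHA-1 thumbprint the driver expects (the value is 20 bytes long, which fits SHA-1). The helper names `normalize_fp`, `vcenter_fp` and `fp_matches` are made up for this illustration.

```shell
# normalize_fp: strip the "... Fingerprint=" prefix that openssl prints
# (if present) and uppercase the hex digits so values compare cleanly.
normalize_fp() {
  printf '%s' "$1" | sed 's/^.*=//' | tr 'a-f' 'A-F'
}

# vcenter_fp: print the SHA-1 fingerprint of the certificate presented
# by the given host:port.
vcenter_fp() {
  openssl s_client -connect "$1" </dev/null 2>/dev/null |
    openssl x509 -noout -fingerprint -sha1
}

# fp_matches: succeed only if the two fingerprints are equal after
# normalization.
fp_matches() {
  [ "$(normalize_fp "$1")" = "$(normalize_fp "$2")" ]
}

# Usage (needs network access to vCenter, so it is left commented out):
#   logged="02:0E:FF:10:2F:0F:9A:A8:AA:77:D9:D0:27:2F:FA:EE:0A:67:24:6D"
#   if fp_matches "$logged" "$(vcenter_fp vc01.example.org:443)"; then
#     echo "thumbprints match"
#   else
#     echo "thumbprints differ"
#   fi
```

If the two values differ, the vCenter certificate has most likely changed since the thumbprint was recorded, which would be consistent with the crash loop seen here.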
Failed to establish a connection to vCenter vc01.example.org. Unable to connect to the vCenter. Please check the network or click EDIT CONNECTION to update the vCenter settings.
SSP 5.0.0