Symptom - 1: Policy Recommendations stuck in Queued for discovery
Symptom - 2: Backup and Restore job fails with error: "failed to start operation, please contact administrator"
infraclassifier or other jobs are stuck in the Pending state and vsphere-csi-controller is in CrashLoopBackOff. Log in as root (SSP 5.0) or sysadmin (SSP 5.1) and run the command below:

k get pods -A | egrep -vi "runn|compl"

NAMESPACE          NAME                                                  READY  STATUS            RESTARTS            AGE
nsxi-platform      anomalydetectionstreamingjob-3bf6e796f3c9518d-exec-1  0/1    Pending           0                   6d3h
nsxi-platform      infraclassifier-8646eb9646a0565e-exec-1               0/1    Pending           0                   39d
nsxi-platform      infraclassifier-8646eb9646a0565e-exec-2               0/1    Pending           0                   39d
nsxi-platform      infraclassifier-8646eb9646a0565e-exec-3               0/1    Pending           0                   39d
nsxi-platform      infraclassifier-8646eb9646a0565e-exec-4               0/1    Pending           0                   39d
vmware-system-csi  vsphere-csi-controller-75f8894c79-sdhck               6/7    CrashLoopBackOff  14105 (4m33s ago)   54d
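The filtering step above can also be sketched in Python for cases where the pod listing has been saved to a file or pasted from a support bundle. This is a minimal sketch, not part of the product tooling: the function name `unhealthy_pods` is hypothetical, and it checks the STATUS column exactly rather than matching "runn|compl" anywhere on the line as the egrep filter does.

```python
from typing import Dict, List


def unhealthy_pods(kubectl_output: str) -> List[Dict[str, str]]:
    """Return pods whose STATUS is neither Running nor Completed.

    Expects the plain-text output of `kubectl get pods -A`
    (columns: NAMESPACE NAME READY STATUS RESTARTS AGE).
    """
    pods = []
    for line in kubectl_output.strip().splitlines():
        # Only the first four columns are needed; RESTARTS may contain
        # spaces (for example "14105 (4m33s ago)"), so it is left unsplit.
        fields = line.split(None, 4)
        if len(fields) < 4 or fields[0] == "NAMESPACE":
            continue  # skip the header row and malformed lines
        namespace, name, ready, status = fields[:4]
        if status in ("Running", "Completed"):
            continue
        pods.append(
            {"namespace": namespace, "name": name, "ready": ready, "status": status}
        )
    return pods
```

Calling `unhealthy_pods(text)` on the listing shown above would return one entry per Pending or CrashLoopBackOff pod, which makes it easy to group the results by namespace.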
The infraclassifier pod shows the error below:

k describe pod infraclassifier-8646eb9646a0565e-exec-1 -n nsxi-platform

Events:
  Type     Reason            Age                     From               Message
  ----     ------            ----                    ----               -------
  Warning  FailedScheduling  105s (x11490 over 39d)  default-scheduler  0/8 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/8 nodes are available: 8 Preemption is not helpful for scheduling.
vsphere-csi-controller is crash-looping. The vsphere-csi-controller container in the vsphere-csi-controller pod is in an error state.

k describe pod vsphere-csi-controller-75f8894c79-sdhck -n vmware-system-csi

--- Output truncated ---
  vsphere-csi-controller:
    Container ID:  cri-o://89a90cba83dd8aa271edda13c9367fcf4ffe8402889c7377db127d90f31caf21
    Image:         sspi.example.org/registry/1.6.3/cloud-provider-vsphere/csi/release/driver:v3.1.0
    Image ID:      sspi.example.org/registry/1.6.3/cloud-provider-vsphere/csi/release/driver@sha256:af8887fde54bb0b8c44e821597cbb3b8087e4451b09bb2861d7ac67c66808775
    Ports:         9808/TCP, 2112/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      --fss-name=internal-feature-states.csi.vsphere.vmware.com
      --fss-namespace=$(CSI_NAMESPACE)
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 27 May 2025 20:15:21 +0000
      Finished:     Tue, 27 May 2025 20:15:21 +0000
    Ready:          False
    Restart Count:  14106
    Liveness:       http-get http://:healthz/healthz delay=30s timeout=10s period=180s #success=1 #failure=3
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Normal   Pulled   55m (x14097 over 54d)   kubelet  Container image "sspi-prod.example.org/registry/1.6.3/cloud-provider-vsphere/csi/release/driver:v3.1.0" already present on machine
  Warning  BackOff  37s (x339265 over 50d)  kubelet  Back-off restarting failed container vsphere-csi-controller in pod vsphere-csi-controller-75f8894c79-sdhck_vmware-system-csi(e369c968-57e9-4651-83fb-f4dbc8fc0e43)

The vsphere-csi-controller container logs show this error:

k logs vsphere-csi-controller-75f8894c79-sdhck -n vmware-system-csi -c vsphere-csi-controller
{"level":"error","time":"2025-05-27T20:10:18.974743694Z","caller":"vsphere/virtualcenter.go:672","msg":"failed to connect to VirtualCenter host: \"vc01.example.org\". Err: Post \"https://vc01.example.org:443/sdk\": host \"vc01.example.org:443\" thumbprint does not match \"02:0E:FF:10:2F:0F:9A:A8:AA:77:D9:D0:27:2F:FA:EE:0A:67:24:6D\"","TraceId":"f2027a3e-d083-406e-ae50-49faaccbb6dc","stacktrace":"sigs.k8s.io/vsphere-csi-driver/v3/pkg/common/cns-lib/vsphere.GetVirtualCenterInstanceForVCenterConfig\n\t/build/pkg/common/cns-lib/vsphere/virtualcenter.go:672\nsigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service/vanilla.(*controller).Init\n\t/build/pkg/csi/service/vanilla/controller.go:236\nsigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service.(*vsphereCSIDriver).BeforeServe\n\t/build/pkg/csi/service/driver.go:188\nsigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service.(*vsphereCSIDriver).Run\n\t/build/pkg/csi/service/driver.go:202\nmain.main\n\t/build/cmd/vsphere-csi/main.go:96\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
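The log entry above reports a thumbprint mismatch for the vCenter host. To see which certificate vCenter currently presents, a thumbprint can be computed and compared with the value quoted in the log. The following is a minimal sketch, assuming the thumbprint is a SHA-1 digest of the DER-encoded certificate (the 20-byte value in the log has the length of a SHA-1 digest). The function names `thumbprint_from_pem` and `fetch_thumbprint` are hypothetical helpers, not part of SSP or the CSI driver.

```python
import hashlib
import ssl


def thumbprint_from_pem(pem_cert: str) -> str:
    """Return the SHA-1 thumbprint of one PEM certificate as
    colon-separated uppercase hex, e.g. "02:0E:FF:...".
    """
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    digest = hashlib.sha1(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fetch_thumbprint(host: str, port: int = 443) -> str:
    """Fetch the certificate presented by host:port and return its thumbprint.

    The certificate is retrieved without validation, which is intended
    here because the goal is only to inspect what the server presents.
    """
    pem = ssl.get_server_certificate((host, port))
    return thumbprint_from_pem(pem)


# Example (requires network access to the vCenter):
#   print(fetch_thumbprint("vc01.example.org"))
```

If the computed value differs from the one in the log, that is consistent with the vCenter certificate having been replaced after the connection was originally registered.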
Failed to establish a connection to vCenter vc01.example.org. Unable to connect to the vCenter. Please check the network or click EDIT CONNECTION to update the vCenter settings.
SSP 5.0, SSP 5.1
How you add the vCenter certificate depends on the certificate authority (CA) structure used to sign it. Identify which case applies to your environment and follow the corresponding steps.
Use this case when the vCenter certificate is signed directly by a root CA (no intermediate certificates in the chain).
Use this case when the vCenter certificate is signed by an intermediate CA that chains up to a root CA. You must upload the full certificate chain in the correct order.
Step 1 — Export the Full Certificate Chain from the Browser
Step 2 — Verify Certificate Chain Order
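Before uploading, confirm the exported file contains the certificates in the correct order. Open the file in a text editor and verify the sequence is: the vCenter server certificate first, then the intermediate CA certificate(s), then the root CA certificate last.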
Note: If the order is incorrect, rearrange the PEM blocks manually so that the chain reads from server to root (top to bottom). SSP requires this order to correctly validate the full trust chain.
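Rearranging the PEM blocks by hand is error-prone when the chain has several certificates, so the step can be sketched in Python. This is a minimal sketch under stated assumptions: the helper names `split_pem_bundle` and `reorder_bundle` are hypothetical, and the code does not inspect the certificates. You still need to identify which block is the server, intermediate, or root certificate (for example by viewing each block's subject and issuer) before choosing the order.

```python
import re
from typing import List, Sequence

# Matches one complete PEM certificate block, including its BEGIN/END lines.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_bundle(bundle_text: str) -> List[str]:
    """Return the PEM certificate blocks in the order they appear."""
    return PEM_BLOCK.findall(bundle_text)


def reorder_bundle(bundle_text: str, order: Sequence[int]) -> str:
    """Return a new bundle with the blocks arranged according to `order`.

    `order` lists the zero-based indexes of the existing blocks in the
    sequence they should appear in the output (server first, root last).
    """
    blocks = split_pem_bundle(bundle_text)
    if sorted(order) != list(range(len(blocks))):
        raise ValueError("order must list every block index exactly once")
    return "\n".join(blocks[i] for i in order) + "\n"
```

For example, if the exported file lists root, intermediate, server, then `reorder_bundle(text, [2, 1, 0])` would produce a bundle that reads from server to root.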
Step 3 — Upload to SSP
Ex:
k get nodes -o wide
k get pods -A