The vsphere-csi-controller pod will repeatedly enter a CrashLoopBackOff state.
NAMESPACE          POD NAME                             READY  STATUS            RESTARTS          AGE
vmware-system-csi  vsphere-csi-controller-xxxxxxx-xxxx  6/7    CrashLoopBackOff  2689 (60s ago)    47d
vmware-system-csi  vsphere-csi-controller-xxxxxxx-xxxx  7/7    Running           2743 (6m10s ago)  83d
vmware-system-csi  vsphere-csi-controller-xxxxxxx-xxxx  7/7    Running           2731 (14m ago)    84d
Describing the pod shows the following warning event:
Warning BackOff 39s kubelet Back-off restarting failed container vsphere-syncer in pod vsphere-csi-controller-xxxxxx-xxxxx
syncer.log indicates a segmentation fault due to an invalid memory access or nil pointer dereference:
2025-01-29T02:56:27.780896339Z stderr F {"level":"info","time":"2025-01-29T02:56:27.780808672Z","caller":"wcp/controller.go:910","msg":"CreateVolume: called with args {Name:pvc-36863e7a-bc0a-43d3-xxxx-xxxxxxxx CapacityRange:required_bytes:5368709120 ..."}
2025-01-29T02:56:27.780921537Z stderr F {"level":"error","time":"2025-01-29T02:56:27.78085394Z","caller":"wcp/controller.go:940","msg":"file volume provisioning is not supported on a stretched supervisor cluster"}
Error messages in the driver logs related to file volume provisioning:
2025-01-29T02:56:27.780921537Z stderr F {"level":"error","time":"2025-01-29T02:56:27.78085394Z","caller":"wcp/controller.go:940","msg":"File volume provisioning is not supported on a stretched supervisor cluster",
"TraceId": "fd23ed0f-59c2-423d-baba-49b26ed596ec",
"stacktrace": "sigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service/wcp."
}
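To pull just these entries out of a larger log capture, a small shell filter along the following lines may help. This is a minimal sketch: the function name find_file_volume_errors is hypothetical, and the kubectl invocation in the comment assumes the deployment and namespace names shown in this article.

```shell
# Hypothetical helper: read log lines on stdin and print only those that
# contain the unsupported-file-volume error. The match is case-insensitive
# because the message appears with both "file" and "File" in the logs above.
find_file_volume_errors() {
  grep -i 'file volume provisioning is not supported on a stretched supervisor cluster'
}

# Example usage (assumes kubectl access to the Supervisor cluster):
#   kubectl logs deployment/vsphere-csi-controller -n vmware-system-csi --all-containers \
#     | find_file_volume_errors
```

If the filter prints nothing, the crash loop may have a different cause than the one described here.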
Validate that the customer is using a stretched (three-zone) Supervisor cluster by checking the vCenter logs.
commands/wcp-db-dump.py.txt:
Cluster: domain-cxxxx:4f57dd38-821a-401e-ad26-2231aeb540b3
Instance ID: b027b027-663b-440c-9b0a-e9f8bf75a99c
State: APPLY
Upgrade State: READY
Desired Version: v1.27.5+vmware.1-fips.1-vsc0.1.10-24224934
Create Timestamp: 1668733274
Last Update Timestamp: 1729560194
Upgrade Start Timestamp: 1729555817
Desired Config:
  Name: multi-zone-supervisor
  Deployment Target:
    Fault Domain Zones:
    - Zone ID: zone2
      Cluster Compute Resources:
      - Type: ClusterComputeResource
        Value: domain-c1370
    - Zone ID: zone1
      Cluster Compute Resources:
      - Type: ClusterComputeResource
        Value: domain-c1367
    - Zone ID: zone3
      Cluster Compute Resources:
      - Type: ClusterComputeResource
        Value: domain-c1374
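The zone check can be scripted against the dump file. The following is a minimal sketch, assuming each fault domain appears on its own "Zone ID:" line as in the excerpt above. The function name count_zones is hypothetical.

```shell
# Hypothetical helper: count the distinct "Zone ID:" values in a
# wcp-db-dump style text file. Leading whitespace and the list dash
# are optional, so flat and indented dumps are both handled.
count_zones() {
  grep -E '^[[:space:]]*-?[[:space:]]*Zone ID:' "$1" \
    | awk -F': *' '{print $2}' \
    | sort -u \
    | wc -l \
    | tr -d ' '
}

# Example usage:
#   count_zones commands/wcp-db-dump.py.txt
```

A result greater than 1 suggests a multi-zone (stretched) Supervisor. For the excerpt above, the three zones zone1, zone2 and zone3 would be counted.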
This issue applies to VMware vSphere with Tanzu.
File volume provisioning is not supported on a stretched Supervisor cluster, so a CreateVolume request for a file volume fails with the error shown in the driver logs.
Delete the affected PV, then scale the vsphere-csi-controller deployment down to 0 and back up to 3 replicas:
kubectl delete pv <pv-name>
kubectl scale deployment vsphere-csi-controller --replicas=0 -n vmware-system-csi
kubectl scale deployment vsphere-csi-controller --replicas=3 -n vmware-system-csi
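After scaling back up, it may be worth confirming that the controller pods stay healthy. The sketch below is one possible check and not part of the original procedure. The function name not_ready_pods is hypothetical, and it assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...) with headers suppressed.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and print the name of every pod that is not fully ready or not Running.
not_ready_pods() {
  awk '{
    split($2, ready, "/")   # READY column, e.g. "6/7"
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Example usage (assumes kubectl access to the Supervisor cluster):
#   kubectl rollout status deployment/vsphere-csi-controller -n vmware-system-csi
#   kubectl get pods -n vmware-system-csi --no-headers | not_ready_pods
```

Empty output means every listed pod reports all containers ready and a Running status. Any pod name printed should be described again to check for new BackOff events.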