vsphere-syncer in CrashLoopBackOff with "panic: runtime error: invalid memory address or nil pointer dereference"


Article ID: 387809


Products

VMware vSphere with Tanzu

Issue/Introduction

A vsphere-csi-controller pod repeatedly enters a CrashLoopBackOff state because its vsphere-syncer container keeps restarting.

NAMESPACE           POD NAME                              READY   STATUS             RESTARTS           AGE
vmware-system-csi   vsphere-csi-controller-xxxxxxx-xxxx   6/7     CrashLoopBackOff   2689 (60s ago)     47d
vmware-system-csi   vsphere-csi-controller-xxxxxxx-xxxx   7/7     Running            2743 (6m10s ago)   83d
vmware-system-csi   vsphere-csi-controller-xxxxxxx-xxxx   7/7     Running            2731 (14m ago)     84d
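A listing like the one above can be filtered for crash-looping pods. The following is a minimal sketch, assuming the column order shown here (namespace, pod name, READY, STATUS); the function name crashlooping_pods is hypothetical and not part of any VMware tooling.

```shell
# Hypothetical helper: reads "kubectl get pods -A" style output on stdin and
# prints namespace/pod for every row whose STATUS column is CrashLoopBackOff.
# Assumes STATUS is the 4th whitespace-separated field, as in the listing above.
crashlooping_pods() {
  awk '$4 == "CrashLoopBackOff" { print $1 "/" $2 }'
}
```

For example, the output of kubectl get pods -A could be piped into crashlooping_pods.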

Describing the pod shows the following warning event:

Warning  BackOff  39s  kubelet  Back-off restarting failed container vsphere-syncer in pod vsphere-csi-controller-xxxxxx-xxxxx

syncer.log indicates a segmentation fault due to an invalid memory access or nil pointer dereference:

2025-01-29T02:56:27.780896339Z stderr F {"level":"info","time":"2025-01-29T02:56:27.780808672Z","caller":"wcp/controller.go:910","msg":"CreateVolume: called with args {Name:pvc-36863e7a-bc0a-43d3-xxxx-xxxxxxxx CapacityRange:required_bytes:5368709120 ..."}
2025-01-29T02:56:27.780921537Z stderr F {"level":"error","time":"2025-01-29T02:56:27.78085394Z","caller":"wcp/controller.go:940","msg":"file volume provisioning is not supported on a stretched supervisor cluster"}

The driver logs show the same file volume provisioning error, together with a trace ID and stack trace:

2025-01-29T02:56:27.780921537Z stderr F {"level":"error","time":"2025-01-29T02:56:27.78085394Z","caller":"wcp/controller.go:940","msg":"File volume provisioning is not supported on a stretched supervisor cluster",
    "TraceId": "fd23ed0f-59c2-423d-baba-49b26ed596ec",
    "stacktrace": "sigs.k8s.io/vsphere-csi-driver/v3/pkg/csi/service/wcp."
}
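To find which volume request triggered the error, the log excerpts above can be scanned for a CreateVolume entry followed by the "not supported" message. This is a rough sketch under the assumption that the error line follows the CreateVolume line of the same request, which may not hold if requests interleave; failed_file_volumes is a hypothetical name.

```shell
# Hypothetical helper: reads CSI log lines on stdin and prints the volume name
# from the most recent CreateVolume entry seen before each
# "file volume provisioning is not supported" error (case-insensitive match).
# Heuristic only: assumes the error line follows its own CreateVolume line.
failed_file_volumes() {
  awk '
    /CreateVolume: called with args/ {
      # Capture the token after "Name:" up to the next space.
      if (match($0, /Name:[^ ]+/)) last = substr($0, RSTART + 5, RLENGTH - 5)
    }
    tolower($0) ~ /file volume provisioning is not supported on a stretched supervisor cluster/ {
      if (last != "") print last
    }
  ' | sort -u
}
```

The input can be a saved log file, for example failed_file_volumes < syncer.log.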

Validate from the vCenter logs that the environment uses a stretched (three-zone) Supervisor cluster.

commands/wcp-db-dump.py.txt:

Cluster: domain-cxxxx:4f57dd38-821a-401e-ad26-2231aeb540b3
Instance ID: b027b027-663b-440c-9b0a-e9f8bf75a99c
State: APPLY
Upgrade State: READY
Desired Version: v1.27.5+vmware.1-fips.1-vsc0.1.10-24224934
Create Timestamp: 1668733274
Last Update Timestamp: 1729560194
Upgrade Start Timestamp: 1729555817

Desired Config:
  Name: multi-zone-supervisor
  Deployment Target:
    Fault Domain Zones:
      - Zone ID: zone2
        Cluster Compute Resources:
          - Type: ClusterComputeResource
            Value: domain-c1370
      - Zone ID: zone1
        Cluster Compute Resources:
          - Type: ClusterComputeResource
            Value: domain-c1367
      - Zone ID: zone3
        Cluster Compute Resources:
          - Type: ClusterComputeResource
            Value: domain-c1374
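The zone layout can be pulled out of the dump mechanically. The sketch below assumes the text layout shown above ("- Zone ID:" lines followed by "Value:" lines); list_zones is a hypothetical name, and the real dump format may differ between releases.

```shell
# Hypothetical helper: reads wcp-db-dump text on stdin and prints one
# "<zone> <cluster>" pair per line. More than one zone suggests a
# stretched (multi-zone) Supervisor.
# Assumes the "- Zone ID: <id>" / "Value: <domain-id>" layout shown above.
list_zones() {
  awk '
    $1 == "-" && $2 == "Zone" && $3 == "ID:" { zone = $4 }
    $1 == "Value:" && zone != ""             { print zone, $2 }
  '
}
```

For example, list_zones < commands/wcp-db-dump.py.txt | wc -l gives the number of zone-to-cluster mappings.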

Environment

VMware vSphere with Tanzu

Cause

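File volume provisioning is not supported on a stretched (three-zone) Supervisor cluster. A request to provision a file volume on such a cluster fails with the error shown above.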

Supervisor Storage

Resolution

Delete the affected PV, then scale the CSI controller deployment down to zero replicas and back up:

kubectl delete pv <pv-name>

kubectl scale deployment vsphere-csi-controller --replicas=0 -n vmware-system-csi
kubectl scale deployment vsphere-csi-controller --replicas=3 -n vmware-system-csi
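The scale-down and scale-up can be wrapped in one function that also waits for the pods to come back. This is only a sketch: restart_csi_controller and the KUBECTL override are hypothetical conveniences, the 5-minute timeout is an arbitrary assumption, and the default replica count of 3 is taken from the commands above.

```shell
# Hypothetical helper: scales the CSI controller to 0, back to the requested
# replica count (default 3, as in the commands above), then waits for rollout.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to print the commands instead
# of running them.
restart_csi_controller() {
  kubectl_bin=${KUBECTL:-kubectl}
  replicas=${1:-3}
  ns=vmware-system-csi
  "$kubectl_bin" scale deployment vsphere-csi-controller --replicas=0 -n "$ns" &&
  "$kubectl_bin" scale deployment vsphere-csi-controller --replicas="$replicas" -n "$ns" &&
  "$kubectl_bin" rollout status deployment/vsphere-csi-controller -n "$ns" --timeout=5m
}
```

Running KUBECTL=echo restart_csi_controller 3 first shows the exact commands without touching the cluster.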