CSI provisioner logs show a failure to list CnsVolumeOperationRequest (CVOR) resources because of an API server timeout. For example:
[YYYY-MM-DDTHH:MM:SS]","caller":"cnsvolumeoperationrequest/cnsvolumeoperationrequest.go:328","msg":"failed to list CnsVolumeOperationRequests with error the server was unable to return a response in the time allotted, but may still be processing the request (get cnsvolumeoperationrequests.cns.vmware.com).
kubectl get cnsvolumeoperationrequests.cns.vmware.com -n vmware-system-csi
vSphere CSI Driver
The root cause of this issue lies in the CSI controller’s behavior during restarts. When the CSI controller restarts, it inadvertently resets the internal cleanup timer for stale CnsVolumeOperationRequest instances. The default cleanup interval is 1440 minutes (24 hours), so each restart defers resource purging by another full day, even for requests whose volume operations completed successfully. If the controller restarts frequently, the cleanup may rarely or never run, and the resources accumulate.
Broadcom engineering is aware of the issue and is working on a permanent fix to be included in future CSI driver releases. In the meantime, the following step-by-step cleanup instructions can be used as a workaround to safely delete successfully completed CVOR resources and reduce the load on the API server.
Step 1: Scale Down the vSphere CSI Driver.
To ensure that no new CnsVolumeOperationRequests are generated during the cleanup, scale down the CSI driver deployment:
kubectl scale deployment vsphere-csi-controller -n vmware-system-csi --replicas=0
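Step 4 scales the deployment back to a fixed replica count. As a minimal sketch, the current replica count can be recorded before scaling down so that the same value can be restored later. The function name csi_replicas and the variable ORIGINAL_REPLICAS are hypothetical; the deployment name and namespace are the ones used in this article.

```shell
# Hypothetical helper: print the replica count currently configured on the
# CSI controller deployment (names taken from the commands in this article).
csi_replicas() {
  kubectl get deployment vsphere-csi-controller -n vmware-system-csi \
    -o jsonpath='{.spec.replicas}'
}

# Usage, run BEFORE scaling down to 0:
#   ORIGINAL_REPLICAS="$(csi_replicas)"
#   echo "Original replica count: ${ORIGINAL_REPLICAS}"
```

The saved value can then be passed to --replicas when scaling back up in Step 4.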
Step 2: Identify Completed CNSVolumeOperationRequests.
kubectl get cnsvolumeoperationrequests.cns.vmware.com -n vmware-system-csi -o jsonpath='{range .items[?(@.status.latestOperationDetails[0].taskStatus=="Success")]}{.metadata.name}{"\n"}{end}'
Example Output:
pvc-xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx
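Before deleting anything, it can help to see how many requests are in each state. The following is a rough sketch that reuses the same status field as the filter in Step 2 and counts requests per taskStatus value. The function name cvor_status_summary is hypothetical, and requests without a status yet are reported as <none>.

```shell
# Hypothetical helper: print "<taskStatus> <count>" for every status value
# found on the CnsVolumeOperationRequests in the vmware-system-csi namespace.
cvor_status_summary() {
  kubectl get cnsvolumeoperationrequests.cns.vmware.com -n vmware-system-csi \
    -o jsonpath='{range .items[*]}{.status.latestOperationDetails[0].taskStatus}{"\n"}{end}' \
    | sort | uniq -c \
    | awk '{ s = ($2 == "" ? "<none>" : $2); print s, $1 }'
}

# Usage:
#   cvor_status_summary
```

A large count for Success compared with the other states suggests that most of the backlog can be removed with the procedure in Step 3.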
Step 3: Delete Completed Requests.
You can delete these requests manually using:
kubectl delete cnsvolumeoperationrequests.cns.vmware.com <request-name> -n vmware-system-csi
OR
Use the following command to automate the deletion of all requests with status "Success":
kubectl get cnsvolumeoperationrequests.cns.vmware.com -n vmware-system-csi -o jsonpath='{range .items[?(@.status.latestOperationDetails[0].taskStatus=="Success")]}{.metadata.name}{"\n"}{end}' | xargs kubectl delete cnsvolumeoperationrequests.cns.vmware.com -n vmware-system-csi
Example Output:
cnsvolumeoperationrequest.cns.vmware.com "pvc-xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx" deleted
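When the backlog is very large, it may be preferable to delete the requests in smaller batches, so that each API call stays small and nothing is sent when the list is empty. The sketch below is one possible way to do that and is not part of the official procedure. The function name delete_cvors_in_batches and the BATCH_SIZE variable (default 100) are hypothetical.

```shell
# Hypothetical helper: read request names from stdin (one per line) and
# delete them in batches of BATCH_SIZE names per kubectl call.
delete_cvors_in_batches() {
  batch_size="${BATCH_SIZE:-100}"
  set --                                  # start with an empty batch
  while IFS= read -r name; do
    [ -n "$name" ] || continue            # skip blank lines
    set -- "$@" "$name"
    if [ "$#" -ge "$batch_size" ]; then
      kubectl delete cnsvolumeoperationrequests.cns.vmware.com "$@" \
        -n vmware-system-csi </dev/null
      set --
    fi
  done
  # Delete whatever is left in the final, partially filled batch.
  if [ "$#" -gt 0 ]; then
    kubectl delete cnsvolumeoperationrequests.cns.vmware.com "$@" \
      -n vmware-system-csi </dev/null
  fi
}

# Usage: pipe the name list from Step 2 into the helper, for example
#   <Step 2 command> | BATCH_SIZE=50 delete_cvors_in_batches
```

Smaller batches take longer overall but give visible progress and can be interrupted between calls.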
Step 4: Scale the CSI Driver Back Up.
After the cleanup is completed, restore the CSI driver to its normal state. Adjust the replica count if the deployment originally ran with a different number of replicas.
kubectl scale deployment vsphere-csi-controller -n vmware-system-csi --replicas=3
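After scaling up, it is worth confirming that the controller pods actually become ready. A minimal sketch using kubectl rollout status is shown below. The function name wait_for_csi_controller and the default timeout of 300s are illustrative assumptions.

```shell
# Hypothetical helper: block until the CSI controller deployment has rolled
# out, or fail after the given timeout (default: 300s).
wait_for_csi_controller() {
  kubectl rollout status deployment/vsphere-csi-controller \
    -n vmware-system-csi --timeout="${1:-300s}"
}

# Usage:
#   wait_for_csi_controller 120s
```

A non-zero exit code indicates that the rollout did not complete within the timeout and the deployment should be inspected.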
Note:
This procedure is safe only for removing requests whose taskStatus is Success. Requests with an InProgress or Error status should be handled with caution, as they may still be processing or may require additional investigation.
The buildup of CnsVolumeOperationRequest resources is a known scalability issue in the current CSI driver implementation. While the official fix is awaited, this manual cleanup approach can significantly reduce pressure on the Kubernetes API server and etcd, preventing crashes and performance degradation. Regular monitoring and scheduled cleanups (where feasible) are recommended in high-volume environments.
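For the regular monitoring mentioned above, a simple check could count the CnsVolumeOperationRequest resources and warn when the count exceeds a limit. The sketch below is an assumption-laden example: the function names cvor_count and check_cvor_backlog and the default threshold of 1000 are hypothetical and should be tuned to the environment.

```shell
# Hypothetical helper: count CnsVolumeOperationRequests in the namespace.
cvor_count() {
  kubectl get cnsvolumeoperationrequests.cns.vmware.com -n vmware-system-csi \
    -o name | wc -l | tr -d ' '
}

# Hypothetical helper: return non-zero when the count exceeds the threshold
# passed as the first argument (illustrative default: 1000).
check_cvor_backlog() {
  threshold="${1:-1000}"
  count="$(cvor_count)"
  if [ "$count" -gt "$threshold" ]; then
    echo "WARNING: $count CnsVolumeOperationRequests exceed threshold $threshold"
    return 1
  fi
  echo "OK: $count CnsVolumeOperationRequests"
}

# Usage (for example from a scheduled job):
#   check_cvor_backlog 5000 || echo "Consider running the cleanup procedure"
```

If the list call itself times out, as described in the symptoms, the count is not reliable and the backlog should be treated as already too large.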