- A VKS cluster deployment in a Tanzu Supervisor environment is stuck in provisioning
- Control plane nodes for the cluster are provisioned and powered on, but worker nodes fail to power on
- Errors similar to the following are seen when describing the cluster or reviewing cluster events:
Warning ProvisioningFailed ##m##s (x## over ##m) csi.vsphere.vmware.com_################################_########-####-####-####-############ failed to provision volume with StorageClass "<storage class>": rpc error: code = Internal desc = failed to create volume. Error: POST "/vsanHealth": 503 Service Unavailable
Normal ExternalProvisioning ##m##s (x## over ##m) persistentvolume-controller Waiting for a volume to be created either by the external provisioner 'csi.vsphere.vmware.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
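To confirm that a given event stream contains this specific failure, the events can be filtered for the 503 response from /vsanHealth. The following is a minimal sketch, not a supported tool. The function names and the namespace argument are hypothetical, and it assumes kubectl is already logged in to the context where these events are visible.

```shell
#!/bin/bash
# Sketch only: function names are illustrative and not part of any VMware tooling.

# Succeeds (exit 0) when the text on stdin contains the vSAN Health 503 error
# quoted in the ProvisioningFailed event above.
has_vsan_health_503() {
    grep -q 'POST "/vsanHealth": 503 Service Unavailable'
}

# Assumption: kubectl is authenticated against the context where the events appear.
# $1 is the namespace to inspect (supplied by the caller).
scan_events_for_vsan_health_503() {
    local namespace="$1"
    if kubectl get events -n "$namespace" | has_vsan_health_503; then
        echo "vSAN Health 503 error found in namespace $namespace"
    else
        echo "No vSAN Health 503 error found in namespace $namespace"
    fi
}
```

For example, calling scan_events_for_vsan_health_503 with the namespace that holds the stuck cluster prints whether the error is present.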
The vSAN health service is unreachable or unavailable and the guest cluster cannot provision storage for the worker nodes.
Validate that the vSAN Health Service on the vCenter is started and healthy using the following steps:
- Open an SSH session to the vCenter appliance
- Run the following command to get the status of the vsan-health service:
  service-control --status vmware-vsan-health
- If the status returned from the above command is "stopped", run the following command to start the service:
  service-control --start vmware-vsan-health
- If this command fails, please open a case with Broadcom support
If the status returned is "started", use the following KB for additional troubleshooting guidance:
Troubleshooting vSphere Supervisor Workload Cluster VIP Connection Issues
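The status check and conditional start described above can be combined into a short script. This is a sketch under stated assumptions rather than a supported procedure. The function names are hypothetical. It assumes the script runs in the vCenter appliance shell as root, and that the output of service-control --status contains the word "Stopped" or "Running" (or "Started") for the service.

```shell
#!/bin/bash
# Sketch only: intended for the vCenter appliance shell, run as root.

# Classify the text printed by `service-control --status`.
# Assumption: the output contains "Stopped", "Running", or "Started".
classify_vsan_health_status() {
    local output="$1"
    if printf '%s\n' "$output" | grep -qi 'stopped'; then
        echo "stopped"
    elif printf '%s\n' "$output" | grep -qiE 'running|started'; then
        echo "started"
    else
        echo "unknown"
    fi
}

# Check the service and start it only when it is reported as stopped.
check_and_start_vsan_health() {
    local status
    status=$(classify_vsan_health_status "$(service-control --status vmware-vsan-health 2>&1)")
    case "$status" in
        stopped)
            echo "vmware-vsan-health is stopped; attempting to start it"
            service-control --start vmware-vsan-health
            ;;
        started)
            echo "vmware-vsan-health is already started"
            ;;
        *)
            echo "Could not determine the state of vmware-vsan-health" >&2
            return 1
            ;;
    esac
}
```

Calling check_and_start_vsan_health returns a non-zero exit code if the state cannot be determined or the start command fails. A failed start is the point at which to open a case with Broadcom support.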
See the following documentation for additional guidance:
Troubleshoot VKS Cluster Provisioning Errors
Check vCenter Server essential service status and dependencies