Pods stuck in ContainerCreating status with Volume "does not appear staged" error

Article ID: 378751


Products

Tanzu Kubernetes Runtime, Tanzu Kubernetes Grid, VMware Tanzu Kubernetes Grid, VMware Tanzu Kubernetes Grid 1.x, VMware Tanzu Kubernetes Grid Management, VMware Tanzu Kubernetes Grid Plus, VMware Tanzu Kubernetes Grid Plus 1.x

Issue/Introduction

Pods get stuck in ContainerCreating status with the following events:

Warning  FailedMount  1s    kubelet            MountVolume.SetUp failed for volume "pvc-<>" : rpc error: code = FailedPrecondition desc = volume ID: "<volume-id>" does not appear staged to "/var/lib/kubelet/plugins/kubernetes.io/csi/csi.vsphere.vmware.com/<>/globalmount"
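The events above can be seen by describing the affected Pod, for example (the Pod name and namespace are placeholders):

# kubectl describe pod <pod-name> -n <namespace>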

Environment

The issue was observed with PVs managed by a csi.vsphere.vmware.com StorageClass with reclaimPolicy set to Retain and volumeBindingMode set to Immediate:

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: retain-sc
provisioner: csi.vsphere.vmware.com
reclaimPolicy: Retain
volumeBindingMode: Immediate
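To confirm the reclaimPolicy and volumeBindingMode of the StorageClass in use, a command along these lines can be used (retain-sc is the example name from above):

# kubectl get storageclass retain-sc -o jsonpath='{.reclaimPolicy}{"\t"}{.volumeBindingMode}{"\n"}'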

Cause

An issue may have occurred in the CSI components while attaching the volume to the node(s) where the Pods are scheduled.

If a VolumeAttachment object exists but the corresponding volume mount is missing on the node, kubelet fails to mount the volume into the Pods. Because the VolumeAttachment object still exists, CSI assumes the volume is already attached and does not attempt to mount it on the node again.
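As a quick way to see which volumes the control plane believes are attached to a given node (so they can be compared against the mounts actually present on that node), something like the following can be used; <node-name> is a placeholder:

# kubectl get volumeattachment | grep <node-name>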

Resolution

General troubleshooting

  • Check PV, PVC and VolumeAttachment objects. The VolumeAttachment object is created automatically by CSI once a Pod mounting the PVC is created. When all Pods mounting the PVC are deleted, the VolumeAttachment object is also deleted automatically.

    # kubectl get pv,pvc,volumeattachment -n <namespace>

    For example:

    $ kubectl get pv,pvc,volumeattachment
    NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS   REASON   AGE
    persistentvolume/pvc-f900d7c2-a318-4bbd-80c8-3798ecab1b70   1Gi        RWO            Retain           Bound    default/vsphere-csi-pvc   retain-sc               2d3h

    NAME                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    persistentvolumeclaim/vsphere-csi-pvc   Bound    pvc-f900d7c2-a318-4bbd-80c8-3798ecab1b70   1Gi        RWO            retain-sc      2d

    NAME                                                                                                   ATTACHER                 PV                                         NODE                                             ATTACHED   AGE
    volumeattachment.storage.k8s.io/csi-735edc96e352528f89c7344b45a422fdb4b568c6d4a43368320b84351bae4e1a   csi.vsphere.vmware.com   pvc-f900d7c2-a318-4bbd-80c8-3798ecab1b70   workload-<>   true       3m53s


  • Check that the CSI pods are up and running:

    # kubectl get po -n vmware-system-csi

  • Log into the nodes where the pods are scheduled and check volume mounts there:

    # df -h | grep <pv-name>

    Check that the PV is correctly mounted under "/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/".
    The <pod-uid> can be obtained by describing the Pod (kubectl describe pod) and checking its UID (see the example after this list).

    If the VolumeAttachment object exists but there's no volume mount in the node, that may indicate an issue.
    Additionally, if "/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/" is empty, that may also indicate an issue.

    For example, grepping the above PV's name doesn't return anything despite the existence of an associated VolumeAttachment object, and the directory is empty:

    # df -h | grep pvc-f900d7c2-a318-4bbd-80c8-3798ecab1b70

    # ls -lrt /var/lib/kubelet/pods/b1e6c69f-745d-46ab-8b2c-50d55982032f/volumes/kubernetes.io~csi/
    total 0

  • Check the kubelet logs on the node. There may be errors about mounting the volume into the Pod, similar to the ones in the Pod's events.

    # journalctl -u kubelet | grep <pod-name>

  • Check whether other Pods are mounting the PVC:

    # kubectl get po -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.spec.volumes[*].persistentVolumeClaim.claimName}{"\n"}{end}' | grep <pvc-name> | awk '{print $1 " " $2}'
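As referenced above, the Pod UID used in the /var/lib/kubelet path can also be retrieved directly with jsonpath; <pod-name> and <namespace> are placeholders:

# kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.metadata.uid}{"\n"}'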


Resolution steps

Possible resolution steps include:

  • Scale down to 0 replicas all Deployments/StatefulSets mounting the problematic PVC, as found in the previous step (consider recording the original replica counts first; see the example after these steps):

    # kubectl scale deploy <deployment-name> -n <namespace> --replicas=0
    # kubectl scale sts <statefulset-name> -n <namespace> --replicas=0

  • Check the VolumeAttachment objects again. The one associated with the PVC should be deleted automatically within a few seconds, while the PV/PVC objects should remain in Bound status.

    # kubectl get pv,pvc,volumeattachment -n <namespace>


    For example:

    # kubectl get pv,pvc,volumeattachment
    NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS   REASON   AGE
    persistentvolume/pvc-f900d7c2-a318-4bbd-80c8-3798ecab1b70   1Gi        RWO            Retain           Bound    default/vsphere-csi-pvc   retain-sc               2d3h

    NAME                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    persistentvolumeclaim/vsphere-csi-pvc   Bound    pvc-f900d7c2-a318-4bbd-80c8-3798ecab1b70   1Gi        RWO            retain-sc      2d


  • If the VolumeAttachment object doesn't get deleted automatically after a couple of minutes, delete it manually:

    # kubectl delete volumeattachment <volumeattachment-name>

    Make sure it doesn't get recreated. If it does, most likely some Pod is still mounting the PVC; find it and scale down the associated Deployment/StatefulSet.

  • Scale the Deployments/StatefulSets back up to their original replica counts:

    # kubectl scale deploy <deployment-name> -n <namespace> --replicas=<original-number-of-replicas>
    # kubectl scale sts <statefulset-name> -n <namespace> --replicas=<original-number-of-replicas>

    The Pods should now go into Running status and a new VolumeAttachment object should be created.


    For example:

    # kubectl get po,pv,pvc,volumeattachment
    NAME                                      READY   STATUS    RESTARTS   AGE
    pod/pvc-test-deployment-6865f74d8-kg65b   1/1     Running   0          11m

    NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                     STORAGECLASS   REASON   AGE
    persistentvolume/pvc-f900d7c2-a318-4bbd-80c8-3798ecab1b70   1Gi        RWO            Retain           Bound    default/vsphere-csi-pvc   retain-sc               2d4h

    NAME                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    persistentvolumeclaim/vsphere-csi-pvc   Bound    pvc-f900d7c2-a318-4bbd-80c8-3798ecab1b70   1Gi        RWO            retain-sc      2d

    NAME                                                                                                   ATTACHER                 PV                                         NODE                                             ATTACHED   AGE
    volumeattachment.storage.k8s.io/csi-735edc96e352528f89c7344b45a422fdb4b568c6d4a43368320b84351bae4e1a   csi.vsphere.vmware.com   pvc-f900d7c2-a318-4bbd-80c8-3798ecab1b70   workload-<>   true       11m
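
As mentioned in the first step, before scaling a workload down it can be useful to record its current replica count so it can be restored afterwards. A possible way to do this (the Deployment/StatefulSet and namespace names are placeholders):

# kubectl get deploy <deployment-name> -n <namespace> -o jsonpath='{.spec.replicas}{"\n"}'
# kubectl get sts <statefulset-name> -n <namespace> -o jsonpath='{.spec.replicas}{"\n"}'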