After a Kubernetes worker node is recreated (for example, by the BOSH resurrector), pods scheduled on the new node fail to start and remain in the ContainerCreating or CreateContainerError state.
Pod events and node logs show repeated mount-related errors similar to:
MountVolume.SetUp failed for volume "<pvc-id>": rpc error: code = Internal desc = could not check if the target path (...) is a directory: permission denied
From the worker node logs:
containerd:
failed to stat "/var/vcap/data/kubelet/pods/<podUID>/volumes/kubernetes.io~csi/<pvc-id>/mount": permission denied
kubelet:
CreateContainerError: failed to generate container spec: failed to stat ".../kubernetes.io~csi/<pvc-id>/mount": permission denied
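To see which pods and volumes are affected, the log lines above can be scanned for the CSI mount path. The following is a minimal Python sketch, not part of the original procedure. The function name find_denied_mounts is hypothetical, and the pattern assumes the kubelet root is /var/vcap/data/kubelet, as in the containerd message. Lines where the path is truncated (as in the kubelet message) will not match.

```python
import re

# Assumption: kubelet root directory is /var/vcap/data/kubelet,
# taken from the containerd log line shown above.
CSI_MOUNT_RE = re.compile(
    r"/var/vcap/data/kubelet/pods/(?P<pod_uid>[^/]+)"
    r"/volumes/kubernetes\.io~csi/(?P<volume>[^/]+)/mount"
)


def find_denied_mounts(lines):
    """Return sorted, unique (pod_uid, volume) pairs taken from lines that
    report 'permission denied' on a CSI mount path."""
    hits = set()
    for line in lines:
        if "permission denied" not in line:
            continue  # skip unrelated log lines
        match = CSI_MOUNT_RE.search(line)
        if match:
            hits.add((match.group("pod_uid"), match.group("volume")))
    return sorted(hits)
```

The function takes any iterable of strings, so it can be fed an open log file or the output of kubectl describe pod split into lines.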
The issue only affects pods scheduled on the newly recreated worker node. Pods running on existing workers continue to operate normally.
Kubernetes cluster using Trident CSI with NFS-backed PersistentVolumes
The issue is caused by a node-local filesystem permission problem affecting the CSI volume target path under the kubelet directory on the newly recreated worker node.
Specifically:
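The permission denied error is returned when the target path is checked, as shown in the "could not check if the target path (...) is a directory" and "failed to stat" messages. This indicates the failure occurs on the worker node filesystem, before the volume is mounted, rather than during disk attachment or NFS export access.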
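One way to check where on the node filesystem access is denied is to stat each directory leading to the target path and note the first one that fails. The following is a minimal Python sketch under that assumption. The function name inspect_path_chain is hypothetical, and the result depends on the user the script runs as on the affected worker.

```python
import os
import stat


def inspect_path_chain(target):
    """Stat every ancestor of `target`, from the filesystem root down.

    Returns a list of (path, info) tuples. `info` is the mode string plus
    owner IDs, or an error message. The walk stops at the first failure.
    """
    target = os.path.abspath(target)

    # Collect the target and all of its ancestors.
    chain = []
    current = target
    while True:
        chain.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent

    report = []
    for path in reversed(chain):  # root first, target last
        try:
            st = os.lstat(path)
        except OSError as exc:
            report.append((path, f"ERROR: {exc.strerror}"))
            break
        mode = stat.filemode(st.st_mode)
        report.append((path, f"{mode} uid={st.st_uid} gid={st.st_gid}"))
    return report
```

Calling it with the mount path from the containerd message (with the real pod UID and volume name substituted) lists the mode and ownership of each directory, which can then be compared with the same path on a healthy worker.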
Recreating the affected PersistentVolumeClaim (PVC) resolves the issue.
Recreating the PVC causes Kubernetes to:
After PVC recreation, pods start successfully without further intervention.