Pod Stuck in ContainerCreating/Pending State with MountVolume.SetUp "DeadlineExceeded" Due to DNS Failure



Article ID: 425093


Updated On:

Products

VMware Tanzu Kubernetes Grid Management

Issue/Introduction

Following a Pod restart or migration, the volumes backing the Pod's Persistent Volume Claims (PVCs) fail to mount. The Pod remains in a non-Running state (typically ContainerCreating or Pending), and the storage remains inaccessible.

Describing the affected Pod (kubectl describe pod <pod-name>) shows events similar to the following:

Warning FailedMount 36m (x2 over 48m) kubelet MountVolume.SetUp failed for volume "pvc-12345-6789" : rpc error: code = DeadlineExceeded desc = context deadline exceeded

Warning FailedMount 16m (x3 over 57m) kubelet MountVolume.SetUp failed for volume "pvc-12345-6789" : rpc error: code = DeadlineExceeded desc = context deadline exceeded
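
The events above can be isolated from longer output with a small filter. This is a minimal sketch, not part of the original article: the function name filter_mount_timeouts is hypothetical, and the Pod name and namespace in the usage comments are placeholders.

```shell
# Keep only FailedMount events that carry a DeadlineExceeded error.
# Reads `kubectl describe pod` or `kubectl get events` output on stdin.
filter_mount_timeouts() {
  grep 'FailedMount' | grep 'DeadlineExceeded'
}

# Usage (placeholders in angle brackets):
#   kubectl describe pod <pod-name> -n <namespace> | filter_mount_timeouts
#   kubectl get events -n <namespace> --field-selector reason=FailedMount | filter_mount_timeouts
```

If the filter prints nothing, the mount failure probably has a different cause than the timeout described in this article.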

Environment

TKGM 2.5.1

Cause

  • The issue is caused by a failure in DNS resolution or network routing between the worker node and the storage backend.
  • The kubelet and the CSI driver attempt to locate and communicate with the storage provider/backend using a hostname or FQDN.
  • Due to a recent DNS update or an incorrect DNS server IP configuration, the storage communication path is broken.
  • The DeadlineExceeded error occurs when the CSI RPC call times out while waiting for a network response that never arrives because the endpoint cannot be resolved or reached.
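
To see which DNS servers a worker node is actually using, and whether each one can resolve the storage endpoint, something like the following may help. It is a sketch under assumptions: the function names are hypothetical, the hostname is a placeholder, and it relies on nslookup returning a non-zero exit status when a lookup fails. On nodes that use systemd-resolved, /etc/resolv.conf may list only a local stub address rather than the upstream servers.

```shell
# Print the nameserver IPs from a resolv.conf-style file (default: /etc/resolv.conf).
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Query each configured nameserver directly for the given hostname.
# $1: hostname to resolve (placeholder for the storage endpoint FQDN)
# $2: optional path to a resolv.conf-style file
check_each_nameserver() {
  host=$1
  for ns in $(list_nameservers "${2:-/etc/resolv.conf}"); do
    if nslookup "$host" "$ns" >/dev/null 2>&1; then
      echo "$ns: resolved $host"
    else
      echo "$ns: FAILED to resolve $host"
    fi
  done
}

# Usage on a worker node:
#   check_each_nameserver <storage-provider-hostname>
```

A server that fails here while a previously used DNS server succeeds points toward the DNS change as the cause.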

Resolution

  • Check whether any DNS server IPs or routing tables were changed shortly before the mount failures began.
  • Verify DNS Connectivity: From a worker node, attempt to resolve the storage endpoint FQDN:
    nslookup <storage-provider-hostname>
  • Revert DNS Configuration: Revert the DNS settings to the previously known-working DNS server IPs.
  • Apply the corrected settings in /etc/resolv.conf on the affected nodes, or in the node customization scripts/TCA templates if applicable.
  • Restart Kubelet (Optional): If the mount does not automatically retry successfully after the DNS fix, restart the Kubelet on the affected worker node:
    systemctl restart kubelet
  • Verify Mount: Monitor the Pod events to ensure the volume setup succeeds:
    kubectl get events --watch
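
As an alternative to watching events by hand, the final verification can be scripted by polling the Pod phase. This is a minimal sketch: the function name wait_for_running and its default attempt count and delay are assumptions, not values from the original article.

```shell
# Poll the Pod phase until it reports Running or the attempts run out.
# $1: Pod name, $2: namespace,
# $3: max attempts (default 30), $4: delay between attempts in seconds (default 10)
wait_for_running() {
  pod=$1
  namespace=$2
  attempts=${3:-30}
  delay=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # A Pod stuck in ContainerCreating still reports the phase "Pending".
    phase=$(kubectl get pod "$pod" -n "$namespace" -o jsonpath='{.status.phase}' 2>/dev/null)
    if [ "$phase" = "Running" ]; then
      echo "Pod $pod is Running"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Pod $pod did not reach Running (last phase: ${phase:-unknown})" >&2
  return 1
}

# Usage (placeholders in angle brackets):
#   wait_for_running <pod-name> <namespace>
```

If the function returns non-zero, describe the Pod again and check whether the FailedMount events are still being generated.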