vSphere Supervisor Workload Cluster Pod Stuck Init or ContainerCreating due to volume mount fails with 'mount.nfs4: mounting <fs>:/vsanfs/<uuid> failed, reason given by server: No such file or directory'


Article ID: 399782


Updated On:

Products

VMware vSphere Kubernetes Service Tanzu Kubernetes Runtime

Issue/Introduction

In a vSphere Supervisor Workload Cluster, a pod is stuck in the ContainerCreating or Init state.

Note: This KB article is written from the perspective of using vSAN File Shares in a vSphere Supervisor environment.

 

While connected to the affected Workload Cluster context, one or more of the following symptoms are observed:

  • One or more pods are stuck in the Init or ContainerCreating state:
    • kubectl get pods -n <pod namespace> -o wide  
    • NAMESPACE          NAME             READY            STATUS
      <pod namespace> <pod name-a> <container/count> Init:0/1
      <pod namespace> <pod name-b> <container/count> ContainerCreating

  • Describing the problematic pod returns output similar to the following, with exit status 32:
    • kubectl describe pod -n <pod namespace> <pod name>
    • Events:
      Type Reason Age From Message
      ---- ------ ---- ------- ------------
      Warning FailedMount ##s kubelet MountVolume.SetUp failed for volume "<pv name>" : rpc error: code = Internal desc = error publish volume to target path: mount failed: exit status 32
      mounting arguments: -t nfs4 -o hard,sec=sys,vers=#,minorversion=# <fileshare path> /var/lib/kubelet/pods/<id>/volumes/kubernetes.io~csi/<pv name>/mount
      output: mount.nfs4: mounting <fileshare path> failed, reason given by server: No such file or directory

  • Manual mount commands to the same fileshare path fail.

  • There is no improvement from restarting the problematic pod.

  • The vsphere-csi system pods are not failing in the affected workload cluster:
    • kubectl get pods -n vmware-system-csi

 

While connected to the Supervisor cluster context, the following symptoms are present:

  • The vsphere-csi system pods in the Supervisor cluster are not failing.

 

When viewing the affected vSAN File Share(s) in the vSphere web UI, Net Access Control shows the following message:

  • In the vSphere web UI, navigate to the affected Cluster in the Inventory -> Configure -> vSAN -> File shares
    • Change the Share type filter dropdown to Container File Volume and use the Name column's filter icon to search for the File share by its ID from the problematic pod's error message.
  • Net access control shows: "No one can access the file share object."

Environment

vSphere with Tanzu 7.0

vSphere with Tanzu 8.0

Cause

This issue is caused by stale CNS resources for volumes and fileshares.

Stale CNS resources can be caused by manual deletion of nodes without allowing the system to properly drain pods or detach volumes from the deleting node and update the CNS resources accordingly.

In some cases, the volume may remain attached to another node due to missing ACLs in Net Access Control (“No one can access the file share object”), preventing the system from detaching it properly.

IMPORTANT: It is not an appropriate troubleshooting practice to manually delete nodes.

System-initiated deletions, such as those performed during a rolling redeployment or upgrade, drain the node being deleted and detach its volumes before removal.

However, if the node is manually deleted before the system finishes draining it and detaching its volumes, stale resources such as the ones addressed in this KB are left behind.

Rolling redeployments and upgrades create a new node with the desired changes/version first and wait for health checks to pass before moving on to deletion of an older node.

A common cause of nodes stuck deleting is Pod Disruption Budgets (PDBs); existing PDBs can be reviewed as shown below.
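
As a quick check, the PDBs in the affected workload cluster can be listed and any that currently allow zero disruptions reviewed; the namespace and PDB name below are placeholders:

  • kubectl get pdb -A
  • kubectl describe pdb -n <pdb namespace> <pdb name>
    • A PDB whose ALLOWED DISRUPTIONS value is 0 can prevent a node from draining, which in turn blocks node deletion.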

 

A common cause of nodes failing to create is third-party admission webhooks or unhealthy control plane nodes in the affected workload cluster; both can be reviewed as shown below.
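
While connected to the affected workload cluster context, the registered admission webhooks and node health can be reviewed with the following commands; which of the webhooks are third party will vary by environment:

  • kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations
  • kubectl get nodes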

Resolution

The corresponding stale CNS resources will need to be manually corrected to reflect the current state of the environment.

Warning: Deleting PVCs will delete the corresponding vSAN fileshare and its data.

 

Initial Checks

  1. While connected to the workload cluster context, locate the problematic pod and note down the mount failure error (a consolidated example of Steps 1 through 6 is provided after this list):
    • kubectl get pod -o wide -n <pod namespace>
      
      kubectl describe pod -n <pod namespace> <pod name>

       

    • mounting arguments: -t nfs4 -o hard,sec=sys,vers=#,minorversion=# <fileshare path> /var/lib/kubelet/pods/<id>/volumes/kubernetes.io~csi/<pv name>/mount
    •  Note down the following information:
      • The node that this pod is attempting to start on
      • The fileshare path
      • The pv name

  2. Note down the names of all PersistentVolumeClaims (PVCs) associated with the problematic pod; the number of PVCs and the values enclosed in angle brackets <> will vary by environment:
    • kubectl describe pod -n <pod namespace> <pod name>
    • Volumes:
      <volume-type-a>:
      Type: PersistentVolumeClaim
      ClaimName: <name of pvc-a>
      <volume-type-b>:
      Type: PersistentVolumeClaim
      ClaimName: <name of pvc-b>

  3. While connected directly to the workload cluster node where the problematic pod is attempting to run, check that the noted vSAN File Share is reachable from this node:
    • You can perform either a ping or a curl -vk to the noted File Share instance in the problematic pod's error message.

    • If the noted vSAN File Share cannot be reached from this node, that is a separate issue from the one covered in this KB article.
      • It will need to be investigated and resolved before following the rest of this KB.

  4. Confirm whether any other pods are using the PVCs noted in Step 2 or any of the other volumes listed when describing the pod:
    • kubectl get pod -A  -o yaml | grep <name of pvc from step 2>
    •  If there are other pods in the workload cluster that are using the noted volume(s), please consult with the corresponding application owner on how to proceed.

  5. Find the persistent volume (PV) associated with the PVC name from Step 2:
    • kubectl get pv | grep <pvc name from step 2>
    • NAME       CAPACITY   ACCESS MODES  RECLAIM POLICY            STATUS              CLAIM                     STORAGECLASS
      <pv name> # <RWO/RWX> <Retain/Recycle/Delete> <Bound/Pending> <pvc name from Step 2> <storageclass>
  6. Locate the associated volumeattachment and note down the node that it is associated with:
    • kubectl get volumeattachment -o wide | grep <pv name from previous step>
    • NAME                     ATTACHER                  PV          NODE          ATTACHED        AGE
      <volume attachment name> csi.vsphere.vmware.com <pv name> <node name> <True/False> <age>

      • The volumeattachment can be described to find its associated AccessPoint as well.


  7. Check if the corresponding node is present or not in the workload cluster:
    • kubectl get nodes
       
    • If either of the following is true, see VolumeAttachment on an Existing Node below:
      • There are two volumeattachments associated with two existing nodes for the same PV.
      • The node that the volumeattachment is associated with exists in the workload cluster.
    • Otherwise, if the node does not exist in the workload cluster, see CNSFile Checks further below.
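
As a consolidated illustration of the lookup chain in Steps 1 through 6, the commands below use hypothetical names (namespace demo, pod app-0, PVC data-app-0); substitute the pod, namespace, PVC, and PV names from your own environment:

  • Find the node the pod is scheduled on and the PVCs it uses:
    • kubectl describe pod -n demo app-0 | grep -E 'Node:|ClaimName:'
  • Find the PV bound to the PVC:
    • kubectl get pv | grep data-app-0
  • Find the volumeattachment for that PV and the node it is attached to:
    • kubectl get volumeattachment -o wide | grep <pv name from the previous command>
  • Confirm whether that node still exists in the workload cluster:
    • kubectl get nodes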

 

VolumeAttachment on an Existing Node

If there are two volumeattachments associated with two different existing nodes for one PV:

  1. If there are no other pods in the workload cluster using the problematic pod's intended volumes, consult with the application owner about whether the problematic pod can be temporarily scaled down.

  2. Locate and temporarily scale down the problematic pod's corresponding managing kubernetes object:
    • Pods can be managed by different Kubernetes objects depending on how they were deployed; for example, by Deployments, ReplicaSets, DaemonSets, StatefulSets, or Jobs (a StatefulSet example is shown after this procedure).

    • kubectl get <kubernetes object> -n <pod namespace>
      • For a deployment, the following command can be used:
        • kubectl scale deployment -n <deployment namespace> <deployment name> --replicas=0
  3. Delete the volumeattachment associated with the previous node:
    • kubectl get volumeattachment -o wide | grep <pv name>
      kubectl delete volumeattachment <name of the volumeattachment associated with the previous node>
      • Note: volumeattachments are cluster-scoped, so no namespace flag is needed.
  4. While connected to the Supervisor cluster context, confirm that there is not a corresponding cnsfileaccessconfig or cnsfilevolumeclient for the previous node:
    • kubectl get cnsfileaccessconfig,cnsfilevolumeclient -n <workload cluster namespace> | grep <pv name>
  5. If there is no associated cnsfileaccessconfig or cnsfilevolumeclient, the problematic pod can be scaled back up in the workload cluster context.
    • If there is an associated cnsfileaccessconfig or cnsfilevolumeclient, please see the CNSFile Checks section below.

  6. After scaling the problematic pod back up, check that an associated volumeattachment was created and shows ATTACHED as true on the same node as the problematic pod:
    • kubectl get volumeattachment -o wide | grep <pv name>
  7. Check whether the problematic pod reaches the Running state or encounters the same error message after the restart:
    • kubectl get pod -n <pod namespace> -o wide
      
      kubectl describe pod -n <pod namespace> <pod name>
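
If the problematic pod is managed by a StatefulSet rather than a Deployment, the equivalent scale-down and scale-up commands are shown below; the StatefulSet name, namespace, and replica count are placeholders, and the original replica count should be noted before scaling down:

  • kubectl get statefulset -n <pod namespace>
  • kubectl scale statefulset -n <pod namespace> <statefulset name> --replicas=0
  • kubectl scale statefulset -n <pod namespace> <statefulset name> --replicas=<original replica count>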

 

Otherwise, if there is a single volumeattachment associated with a node that exists in the workload cluster:

  1. If there are no other pods in the workload cluster using the problematic pod's intended volumes, consult with the application owner about whether the problematic pod can be temporarily scaled down.

  2. Locate and temporarily scale down the problematic pod's corresponding managing kubernetes object:
    • Pods can be managed by different Kubernetes objects depending on how they were deployed; for example, by Deployments, ReplicaSets, DaemonSets, StatefulSets, or Jobs.

    • kubectl get <kubernetes object> -n <pod namespace>
      • For a deployment, the following command can be used:
        • kubectl scale deployment -n <deployment namespace> <deployment name> --replicas=0
  3. Confirm that the problematic pod was brought down successfully:
    • kubectl get pod -n <pod namespace>
  4. Check that the corresponding volumeattachment was cleaned up:
    • kubectl get volumeattachment -A | grep <pv name>
  5. Connect to the Supervisor cluster context

  6. Confirm that there is not a corresponding cnsfileaccessconfig or cnsfilevolumeclient present in the Supervisor cluster:
    • kubectl get cnsfileaccessconfig,cnsfilevolumeclient -n <workload cluster namespace> | grep <pv name>
    • The cnsfileaccessconfig and cnsfilevolumeclient will be recreated when the associated problematic pod is scaled back up.


  7. If there is no associated cnsfileaccessconfig or cnsfilevolumeclient, the problematic pod can be scaled back up in the workload cluster context.
    • If there is an associated cnsfileaccessconfig or cnsfilevolumeclient, please see the CNSFile Checks section below.

  8. After scaling the problematic pod back up, check that an associated volumeattachment was created and shows ATTACHED as true on the same node as the problematic pod:
    • kubectl get volumeattachment -o wide | grep <pv name>
  9. Check whether the problematic pod reaches the Running state or encounters the same error message after the restart (one way to watch for this is shown after this procedure):
    • kubectl get pod -n <pod namespace> -o wide
      
      kubectl describe pod -n <pod namespace> <pod name>
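
One way to watch the recovery after scaling the pod back up is with kubectl's --watch (-w) flag; the names below are placeholders, and the watch can be stopped with Ctrl+C once the ATTACHED column shows true and the pod is Running:

  • kubectl get pod -n <pod namespace> -o wide -w
  • kubectl get volumeattachment -o wide -w | grep <pv name>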

 

CNSFile Checks

  1. While in the Supervisor cluster context, check the cnsfileaccessconfig and cnsfilevolumeclient objects to confirm whether there are any stale entries.
    • Each cnsfileaccessconfig is named for the workload cluster node and the persistent volume that it is associated with. Describing this object provides details on:

      • The associated VirtualMachine name and UID
      • The PV name as viewed from the Supervisor Cluster
      • All Access Points (such as the Container File Volume name)
      • Events related to this object
    • Each cnsfilevolumeclient is named for the file and persistent volume that it is associated with. Describing this object shows details on:

      • The externalIPtoClientVMs mapping (external IP to client VMs) for one or more VMs
      • List of VMs this object is associated with

  2. Perform a describe on the cnsfileaccessconfig corresponding to the pv name from the problematic pod's error message:
    • kubectl get cnsfileaccessconfig -n <workload cluster namespace> | grep <pv name>
      
      kubectl describe cnsfileaccessconfig -n <workload cluster namespace> <cnsfileaccessconfig name>
  3. If the cnsfileaccessconfig has the following error message, confirm that the noted node is no longer present in the environment:
    • error: 'Failed to get virtualmachine instance for VM with name: "<missing workload cluster node name>".
      Error: virtualmachines.vmoperator.vmware.com "<missing workload cluster node name>" not found'
       
    • kubectl get vm,machine,vspheremachine -A | grep <missing workload cluster node name>

      • When a VM is deleted, the corresponding cnsfileaccessconfig is expected to be deleted as well, so that the PV and its AccessPoints can be associated with another VM.


    • If the noted node does not exist in the environment, check if the cnsfileaccessconfig has a deletion timestamp:
      • kubectl describe cnsfileaccessconfig -n <workload cluster namespace> <cnsfileaccessconfig name> | head

         

      • Metadata:
        DeletionTimestamp: YYYY-MM-DDTHH:MM:SSZ
      • If there is a DeletionTimestamp and it has been confirmed that the associated VM does not exist in the environment as per the previous step, the cnsfileaccessconfig is stale.
        • You may edit the cnsfileaccessconfig to remove its Finalizers section, which allows the pending deletion to complete (a non-interactive kubectl patch alternative is sketched at the end of this section):
          • CAUTION: Do not delete this object or remove its finalizer unless the corresponding VM is confirmed to no longer exist.
          • kubectl edit cnsfileaccessconfig -n <workload cluster namespace> <cnsfileaccessconfig name>
          • Finalizers:
            - cns.vmware.com

      • If there is not a DeletionTimestamp but the associated VM does not exist in the environment, the cnsfileaccessconfig can be considered stale and deleted manually:
        • kubectl get cnsfileaccessconfig -n <workload cluster namespace> <cnsfileaccessconfig name>
          
          kubectl delete cnsfileaccessconfig -n <workload cluster namespace> <cnsfileaccessconfig name>
    • Confirm that the stale cnsfileaccessconfig was cleaned up:
      • kubectl get cnsfileaccessconfig -n <workload cluster namespace> <cnsfileaccessconfig name>

         

  4. Check if there are any cnsfilevolumeclients associated with the persistent volume (pv) that have entries for a missing node:
    • kubectl get cnsfilevolumeclient -n <workload cluster namespace> -o yaml | grep <affected pv name>
    • Confirm if the listed nodes are present in the environment or missing:
      • kubectl get vm,machine,vspheremachine -A | grep <node name>
    • As necessary, edit the entries to remove the missing node entry:
      • Note: If there are only missing node entries in this cnsfilevolumeclient object, this cnsfilevolumeclient is stale and can be safely deleted:
      • kubectl edit cnsfilevolumeclient -n <workload cluster namespace> <cnsfilevolumeclient name>
      • spec:
          externalIPtoClientVMs:
            <IP address>:
            - <missing workload cluster node name>
            - <existing workload cluster node name>

  5. Connect to the workload cluster context

  6. Check if the corresponding volumeattachment(s) are attached to a missing node:
    • kubectl get volumeattachments -o wide | grep <pv name>
    • If a volumeattachment is attached to a missing node, it is considered stale and can be safely deleted:
      • kubectl delete volumeattachment <volumeattachment name>
  7. Recreate the problematic pod and confirm that a new volumeattachment is created and shows ATTACHED as true on the same node as the problematic pod:
    • kubectl get volumeattachments -o wide | grep <pv name>

       

  8. Confirm that the problematic pod is now in Running state:
    • kubectl get pod -n <pod namespace>
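
As a non-interactive alternative to the kubectl edit step described above for removing finalizers, kubectl patch can clear the finalizers on a stale cnsfileaccessconfig whose DeletionTimestamp is already set; the object and namespace names are placeholders, and the same caution applies (only do this when the associated VM is confirmed to no longer exist):

  • kubectl patch cnsfileaccessconfig -n <workload cluster namespace> <cnsfileaccessconfig name> --type=merge -p '{"metadata":{"finalizers":[]}}'
  • Confirm that the stale object was removed:
    • kubectl get cnsfileaccessconfig -n <workload cluster namespace> <cnsfileaccessconfig name>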