PV fails to attach to Pod with error: AttachVolume.Attach failed for volume "pvc-xxxxxxx" : Failed to add disk 'scsix:y'


Article ID: 298640


Updated On:

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

A Pod fails to start because it cannot attach its volume. The following events are observed for the Pod:
kubectl describe pod <POD>
Warning FailedMount 2m44s kubelet, 05649f56-2e0a-4f5f-8c48-44cf0624d5fa Unable to attach or mount volumes: unmounted volumes=[XXXXXX], unattached volumes=[YYYYYY]: timed out waiting for the condition 
Warning FailedAttachVolume 2m43s (x3 over 9m45s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-xxxxxxx" : Failed to add disk 'scsi1:1'.
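
If it is not clear which PVC and PV the Pod is waiting on, the claim name can be read from the Pod spec and traced to its bound volume. The commands below are an illustrative sketch only; <POD>, <NAMESPACE> and <PVC-NAME> are placeholders, not values from this article.
# List the PVC name(s) referenced by the failing Pod (placeholders, adjust to your cluster)
kubectl get pod <POD> -n <NAMESPACE> -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}'
# Show the PVC and the PV it is bound to
kubectl get pvc <PVC-NAME> -n <NAMESPACE>
kubectl get pv | grep <PVC-NAME>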

The persistent volume is attached to another Worker node and cannot be attached to a Pod on a different Worker.

This can occur in a number of different scenarios:
  • Manual poweroff of the Worker VM
  • Worker VM goes offline
  • Connectivity to PV Storage interrupted
  • Resource pressure on Worker VM
This is a known Kubernetes issue and is discussed in the following GitHub issues:
  • https://github.com/kubernetes/kubernetes/issues/80040
  • https://github.com/kubernetes/kubernetes/issues/75738
When a PV is attached to a Pod, it is also attached to the Worker VM, which holds a lock on the volume. In the scenarios outlined above, this lock is not always released, which prevents the volume from being attached to other Worker nodes.
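
On clusters where the volume is managed by a CSI driver, the stale attachment can also be inspected through the VolumeAttachment API; the NODE column shows which Worker still holds the volume. This is an optional, illustrative check only; clusters using the in-tree vSphere volume plugin may not create these objects.
# Optional check: VolumeAttachment objects are only created for CSI-managed volumes
kubectl get volumeattachments | grep pvc-xxxxxxx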

Environment

Product Version: 1.8

Resolution

To resolve the lock and detach the volume from the old Worker VM, recreate the old Worker VM.

Identify the old Worker VM. This can be done from ESXi or from the kube-controller-manager logs.

For the ESXi method, refer to the following article to identify the Worker VM to which the volume is attached: https://kb.vmware.com/s/article/10051.

Alternatively, review the kube-controller-manager logs on the Master VMs to retrieve the Kubernetes Node ID and the old Worker VM's IP:
grep <PVC-ID> /var/vcap/sys/log/kube-controller-manager/kube-controller-manager.stderr* | grep "Volume is already exclusively attached to node"
kubectl get nodes -o wide | grep <NODE ID>
 
  • Recreate the old VM; this releases the attachment:
bosh vms | grep <IP of old VM>
bosh -d <DEPLOYMENT> recreate worker/<ID>
  • The failing Pods should now start and attach the PVs on the new Worker VM; a quick check is sketched below.
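
Once the Worker has been recreated, a minimal verification (using the same placeholder names as above) is to confirm the Pod is Running and that the FailedAttachVolume events have stopped recurring:
# Confirm the Pod is Running on the new Worker and the volume attached
kubectl get pod <POD> -n <NAMESPACE> -o wide
kubectl describe pod <POD> -n <NAMESPACE>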