An existing pod that is restarting on a new worker node fails to start because it can't attach its persistent volume.
The attach fails with "The resource 'volume' is in use.", as shown in the event below.
Warning FailedAttachVolume 10s (x174 over 3m39s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-########-d570-####-####-############" : rpc error: code = Internal desc = failed to attach disk: "########-7ba5-####-####-###########" with node: "########-895f-####-####-########" err failed to attach cns volume: "########-7ba5-####-####-############" to node vm: "VirtualMachine:vm-1 [VirtualCenterHost: host-1, UUID: ########-98ea-####-####-############, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-1, VirtualCenterHost: host-1]]". fault: "(*types.LocalizedMethodFault)(0xc000e863e0)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (*types.ResourceInUse)(0xc000e91040)({\n VimFault: (types.VimFault) {\n MethodFault: (types.MethodFault) {\n FaultCause: (*types.LocalizedMethodFault)(<nil>),\n FaultMessage: ([]types.LocalizableMessage) <nil>\n }\n },\n Type: (string) \"\",\n Name: (string) (len=6) \"volume\"\n }),\n LocalizedMessage: (string) (len=32) \"The resource 'volume' is in use.\"\n})\n"
TKGi with CSI volumes
There are two VolumeAttachment objects for the PV. One shows the PV still attached to another node (ATTACHED true); the other, for the new node, is not yet attached (ATTACHED false).
# kubectl get volumeattachment | grep pvc-########-d570-####-####-##########
NAME ATTACHER PV NODE ATTACHED AGE
csi-3695######################### csi.vsphere.vmware.com pvc-########-d570-####-####-############ ########-6da8-####-####-############ true 100d
csi-136c######################### csi.vsphere.vmware.com pvc-########-d570-####-####-############ ########-895f-####-####-############ false 10m
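The check above can also be done programmatically when many PVs are involved. The following is a minimal sketch, not part of the original procedure: it assumes the plain-text output of `kubectl get volumeattachment` has been captured into a string, and the function name `find_conflicting_attachments` and the sample rows are hypothetical. It groups rows by PV and reports any PV that has more than one VolumeAttachment.

```python
from collections import defaultdict


def find_conflicting_attachments(table_text):
    """Return {pv: [(name, node, attached), ...]} for PVs with more than one VolumeAttachment.

    table_text is assumed to be the default column output of
    `kubectl get volumeattachment` (NAME ATTACHER PV NODE ATTACHED AGE).
    """
    by_pv = defaultdict(list)
    for line in table_text.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a six-column data row.
        if len(fields) != 6 or fields[0] == "NAME":
            continue
        name, _attacher, pv, node, attached, _age = fields
        by_pv[pv].append((name, node, attached == "true"))
    return {pv: rows for pv, rows in by_pv.items() if len(rows) > 1}


# Hypothetical sample data shaped like the output shown above.
SAMPLE = """\
NAME ATTACHER PV NODE ATTACHED AGE
csi-aaa csi.vsphere.vmware.com pvc-1 node-old true 100d
csi-bbb csi.vsphere.vmware.com pvc-1 node-new false 10m
csi-ccc csi.vsphere.vmware.com pvc-2 node-new true 5d
"""

if __name__ == "__main__":
    for pv, rows in find_conflicting_attachments(SAMPLE).items():
        for name, node, attached in rows:
            print(pv, name, node, "attached" if attached else "pending")
```

A PV reported here with one attached and one pending entry matches the pattern in this article: the old attachment has not been released, so the new one cannot complete.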
The csi-attacher log shows that it is failing to detach the volume from the old node because the VM is disconnected:
Error processing "csi-3695#########################": failed to detach: rpc error: code = Internal desc = failed to detach disk: "########-7ba5-####-####-############" from node: "########-6daf-####-####-############" err failed to detach cns volume: "########-7ba5-####-####-############" from node vm: VirtualMachine:vm-2 [VirtualCenterHost: host-2, UUID: #########-6da8-####-####-############, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-1, VirtualCenterHost: host-2]]. fault: (*types.LocalizedMethodFault)(0xc000c8c760)({
DynamicData: (types.DynamicData) {
},
Fault: (*types.HostNotConnected)(0xc000c8c7a0)({
HostCommunication: (types.HostCommunication) {
RuntimeFault: (types.RuntimeFault) {
MethodFault: (types.MethodFault) {
FaultCause: (*types.LocalizedMethodFault)(<nil>),
FaultMessage: ([]types.LocalizableMessage) <nil>
}
}
}
}),
LocalizedMessage: (string) (len=69) "Unable to communicate with the remote host, since it is disconnected."
})
, opId: "08e6ac14"
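The useful parts of these long fault dumps are the vSphere fault type and the localized message. As a rough sketch (the helper name `summarize_fault` and the abbreviated excerpts are illustrative assumptions, not a tool shipped with the CSI driver), the two fields can be pulled out with regular expressions that handle both the multi-line form in the csi-attacher log and the escaped single-line form in the pod event:

```python
import re

# Matches the inner fault type, e.g. "Fault: (*types.HostNotConnected)".
FAULT_RE = re.compile(r"Fault: \(\*types\.(\w+)\)")
# Matches the message whether its quotes are plain (") or escaped (\").
MESSAGE_RE = re.compile(r'LocalizedMessage: \(string\) \(len=\d+\) \\?"(.*?)\\?"')


def summarize_fault(log_text):
    """Return (fault_type, localized_message); either may be None if not found."""
    fault = FAULT_RE.search(log_text)
    message = MESSAGE_RE.search(log_text)
    return (
        fault.group(1) if fault else None,
        message.group(1) if message else None,
    )


# Abbreviated excerpts of the two log entries in this article.
DETACH_EXCERPT = '''Fault: (*types.HostNotConnected)(0xc000c8c7a0)({
LocalizedMessage: (string) (len=69) "Unable to communicate with the remote host, since it is disconnected."
'''
ATTACH_EXCERPT = r'''Fault: (*types.ResourceInUse)(0xc000e91040)({\n LocalizedMessage: (string) (len=32) \"The resource 'volume' is in use.\"\n})'''

if __name__ == "__main__":
    print(summarize_fault(DETACH_EXCERPT))
    print(summarize_fault(ATTACH_EXCERPT))
```

Seen side by side, `ResourceInUse` on the attach and `HostNotConnected` on the detach point to the same underlying problem: the volume cannot be released from the old VM.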
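In the vSphere UI, the VM that the volume is still attached to is in a disconnected state. All other VMs on the same ESX host are also disconnected, and the ESX host itself is not in a healthy state. Because the volume cannot be detached from the disconnected VM, the attach to the new node fails with "The resource 'volume' is in use."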
Engage the ESX team to identify why the ESX host is in a faulty state and why its VMs are disconnected.