Storage vMotion extremely slow or timing out on specific ESXi hosts

Article ID: 393291

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • Storage vMotion of virtual machines is extremely slow and may eventually time out.

         

  • If this is a cold migration (the VMs are powered off during the process), the affected virtual machines cannot be powered on until the Storage vMotion operation either completes or times out.

Environment

  • VMware vSphere ESXi 7.x
  • VMware vSphere ESXi 8.x

Cause

Log analysis and the observed error patterns point to a fault in the Fibre Channel (FC) fabric between the host and the storage array, most likely faulty cabling, a faulty SFP, or an interconnect issue.

Cause Validation

  • The /var/run/log/vmkernel.log file on the affected ESXi hosts shows “state in doubt” warnings across all storage devices. The SCSI host status codes observed are H:0x2, H:0x5, and H:0x8, which indicate communication problems with the storage backend. Text following “>>” in the excerpts below is an explanatory note, not part of the log output.

2025-03-20T11:34:51.093Z cpu45:2097646)brcmfcoe: lpfc_handle_status:5079: 1:(0):3271: FCP cmd x2a failed <0/1> sid x011801, did x012f01, oxid x9c5 iotag x4fe Time Out Returning Host Busy
2025-03-20T11:34:51.093Z cpu39:2098225)NMP: nmp_ThrottleLogForDevice:3867: Cmd 0x2a (0x45d92728de88, 16479068) to dev "naa.600a09803830454f572b4c57494a4269" on path "vmhba2:C0:T0:L1" Failed:
2025-03-20T11:34:51.093Z cpu39:2098225)NMP: nmp_ThrottleLogForDevice:3875: H:0x2 D:0x0 P:0x0 . Act:EVAL. cmdId.initiator=0x430794f34180 CmdSN 0x1ef8b87 >>This status is returned when the HBA driver is unable to issue a command to the device. This status can occur due to dropped FCP frames in the environment.
2025-03-20T11:34:51.093Z cpu39:2098225)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.600a09803830454f572b4c57494a4269" state in doubt; requested fast path state update...
2025-03-20T11:34:51.093Z cpu39:2098225)ScsiDeviceIO: 4124: Cmd(0x45d92728de88) 0x2a, CmdSN 0x1ef8b87 from world 16479068 to dev "naa.600a09803830454f572b4c57494a4269" failed H:0x2 D:0x0 P:0x0
2025-03-20T11:34:51.665Z cpu45:2097646)brcmfcoe: lpfc_handle_status:5079: 1:(0):3271: FCP cmd x2a failed <0/1> sid x011801, did x012f01, oxid x8f4 iotag x42d Abort Requested Host Abort Req
2025-03-20T11:34:51.665Z cpu39:2098225)NMP: nmp_ThrottleLogForDevice:3867: Cmd 0x2a (0x45d92bdfae88, 2097233) to dev "naa.600a09803830454f572b4c57494a4269" on path "vmhba2:C0:T0:L1" Failed:
2025-03-20T11:34:51.665Z cpu39:2098225)NMP: nmp_ThrottleLogForDevice:3875: H:0x5 D:0x0 P:0x0 . Act:EVAL. cmdId.initiator=0x430794f34180 CmdSN 0x1ef8b8e
2025-03-20T11:34:51.665Z cpu39:2098225)ScsiDeviceIO: 4163: Cmd(0x45d92bdfae88) 0x2a, cmdId.initiator=0x430794f34180 CmdSN 0x1ef8b8e from world 2097233 to dev "naa.600a09803830454f572b4c57494a4269" failed H:0x5 D:0x0 P:0x0 Cancelled from driver layer >>This status is returned if the driver has to abort commands in-flight to the target. This can occur due to a command timeout or parity error in the frame.

  • Similar error events were triggered during standard file copy operations.

2025-03-20T09:49:30.574Z cpu45:2097279)NMP: nmp_ThrottleLogForDevice:3815: last error status from device naa.60050768108102277000000000000007 repeated 1 times
2025-03-20T09:49:30.574Z cpu45:2097279)NMP: nmp_ThrottleLogForDevice:3867: Cmd 0x2a (0x45d927208c88, 2099942) to dev "naa.60050768108102277000000000000007" on path "vmhba2:C0:T2:L0" Failed:
2025-03-20T09:49:30.574Z cpu45:2097279)NMP: nmp_ThrottleLogForDevice:3875: H:0x8 D:0x0 P:0x0 .Act:EVAL. cmdId.initiator=0x4322ea201220 CmdSN 0x11 >>This status is returned when the HBA driver has aborted the I/O. It can also occur if the HBA does a reset of the target.
2025-03-20T09:49:30.574Z cpu45:2097279)ScsiDeviceIO: 4096: Cmd(0x45d927208c88) 0x2a, cmdId.initiator=0x4322ea201220 CmdSN 0x11 from world 2099942 to dev "naa.60050768108102277000000000000007" failed H:0x8 D:0x0 P:0x0
2025-03-20T09:49:30.574Z cpu38:16192939)HBX: 5760: Reclaiming HB at 4050944 on vol 'Non-Prod_Cluster1_IBM1' replayHostHB: 0 replayHostHBgen: 0 replayHostUUID: (00000000-00000000-0000-000000000000).
2025-03-20T09:49:30.576Z cpu49:4277801 opID=f30af658)HBX: 3058: 'Non-Prod_Cluster1_IBM1': HB at offset 4050944 - Waiting for timed out HB:

  • The /var/run/log/hostd.log file records NFC (Network File Copy) errors, including “The system cannot find the file specified” and “No such device or address”, which further support an underlying storage connectivity issue.

2025-03-20T09:49:45.561Z info hostd[4277801] [Originator@6876 sub=DiskLib opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] DISKLIB-DSCPTR: /vmfs/volumes/5f3a7daf-e1078b9a-ecce-0090fada2b68/<vm_name>/<vm_name>.vmdk: Couldn't open descriptor file for writing: The system cannot find the file specified (25).
2025-03-20T09:49:45.563Z info hostd[4277801] [Originator@6876 sub=DiskLib opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] DISKLIB-LIB   : Failed to close handle "5015AD90E0".
2025-03-20T09:49:45.564Z info hostd[4277801] [Originator@6876 sub=Libs opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] OBJLIB-LIB:Failed to get VCFS root path for '/vmfs/volumes/5f3a7daf-e1078b9a-ecce-0090fada2b68/<vm_name>/<vm_name>.vmdk': No such file or directory (131076).
2025-03-20T09:49:45.565Z info hostd[4277801] [Originator@6876 sub=Libs opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] OBJLIB-FILEBE : FileBEOpen: can't open '/vmfs/volumes/5f3a7daf-e1078b9a-ecce-0090fada2b68/<vm_name>/<vm_name>.vmdk' : Could not find the file (393218).
2025-03-20T09:49:45.565Z info hostd[4277801] [Originator@6876 sub=DiskLib opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] DISKLIB-LIB_CLONE   : Failed to clone : No such device or address (393225).
2025-03-20T09:49:45.565Z warning hostd[4277801] [Originator@6876 sub=Libs opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] [NFC ERROR]Nfc_DiskLib_Clone: Failed to create VMFSEx2 disk /vmfs/volumes/5f3a7daf-e1078b9a-ecce-0090fada2b68/<vm_name>/<vm_name>.vmdk : No such device or address
2025-03-20T09:49:45.565Z warning hostd[4277801] [Originator@6876 sub=Libs opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] [NFC ERROR]NfcFileDskClone: Failed to clone disk at destination /vmfs/volumes/5f3a7daf-e1078b9a-ecce-0090fada2b68/<vm_name>/<vm_name>.vmdk: No such device or address (393225)
2025-03-20T09:49:45.566Z info hostd[4277801] [Originator@6876 sub=NfcManager opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] NfcErrorCode communicated as part of fault
2025-03-20T09:49:45.567Z error hostd[4277801] [Originator@6876 sub=NfcManager opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] Error encountered while processing copy spec for file [ds:///vmfs/volumes/65d6cff2-0c1aeee6-9746-0090fada2b68/<vm_name>/<vm_name>.vmdk -> ds:///vmfs/volumes/5f3a7daf-e1078b9a-ecce-0090fada2b68/<vm_name>/<vm_name>.vmdk]:
--> N3Vim5Fault16NetworkCopyFault9ExceptionE(Fault cause: vim.fault.NetworkCopyFault
--> )
--> [context]zKq7AVICAgAAAKEvNgETaG9zdGQAACJDF2xpYnZtYWNvcmUuc28AATp37WxpYnZpbS10eXBlcy5zbwCBF8EHAYFRFwgBgZ7OAwEC5NFdaG9zdGQAApPdXQIfgF0CFYRdAp+qXQIQd10Cbi1dA1HTAWxpYm5mYy10eXBlcy5zbwACipJSAKzHLQA0Ay4A4hA/BDt9AGxpYnB0aHJlYWQuc28uMAAFbdEObGliYy5zby42AA==[/context]
2025-03-20T09:49:45.579Z error hostd[4277801] [Originator@6876 sub=NfcManager opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] Copy operation failed with error: N3Vim5Fault16NetworkCopyFault9ExceptionE(Fault cause: vim.fault.NetworkCopyFault
--> )
--> [context]zKq7AVICAgAAAKEvNgETaG9zdGQAACJDF2xpYnZtYWNvcmUuc28AATp37WxpYnZpbS10eXBlcy5zbwCBF8EHAYFRFwgBgZ7OAwEC5NFdaG9zdGQAApPdXQIfgF0CFYRdAp+qXQIQd10Cbi1dA1HTAWxpYm5mYy10eXBlcy5zbwACipJSAKzHLQA0Ay4A4hA/BDt9AGxpYnB0aHJlYWQuc28uMAAFbdEObGliYy5zby42AA==[/context]
2025-03-20T09:49:45.583Z info hostd[4277801] [Originator@6876 sub=Vimsvc.TaskManager opID=m708f45j-1873922-auto-145xf-h5:70265891-df-01-d0e2 user=vpxuser:VSPHERE.LOCAL\Administrator] Task Completed : haTask--nfc.NfcManager.copy-30490951 Status error
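
The vmkernel.log review described above can be partly automated. The following is a minimal Python sketch, not a VMware-provided tool. It assumes the log format shown in the excerpts above, where an NMP "Failed:" line naming the path is followed by an NMP line carrying the H:/D:/P: status from the same CPU and world. The function name tally_host_status, the HOST_STATUS table, and the SAMPLE list are hypothetical names introduced for illustration. The descriptions in HOST_STATUS paraphrase the notes in the excerpts.

```python
import re
from collections import Counter

# Host status descriptions, paraphrased from the notes in the log excerpts above.
HOST_STATUS = {
    "0x2": "HBA driver was unable to issue the command to the device",
    "0x5": "driver aborted commands in flight to the target",
    "0x8": "HBA driver aborted the I/O or reset the target",
}

# NMP line that names the failing path, e.g. on path "vmhba2:C0:T0:L1" Failed:
PATH_RE = re.compile(
    r'(cpu\d+:\d+)\)NMP: nmp_ThrottleLogForDevice:\d+: Cmd \S+ .* on path "([^"]+)" Failed:'
)
# NMP line that carries the host/device/plugin status for that failure.
STATUS_RE = re.compile(
    r'(cpu\d+:\d+)\)NMP: nmp_ThrottleLogForDevice:\d+: '
    r'H:(0x[0-9a-fA-F]+) D:0x[0-9a-fA-F]+ P:0x[0-9a-fA-F]+'
)


def tally_host_status(lines):
    """Count failures per (adapter, host status) from vmkernel.log lines."""
    pending = {}  # "cpuN:world" prefix -> path from the latest "Failed:" line
    counts = Counter()
    for line in lines:
        match = PATH_RE.search(line)
        if match:
            pending[match.group(1)] = match.group(2)
            continue
        match = STATUS_RE.search(line)
        if match:
            path = pending.pop(match.group(1), None)
            if path is not None:
                adapter = path.split(":", 1)[0]  # "vmhba2:C0:T0:L1" -> "vmhba2"
                counts[(adapter, match.group(2).lower())] += 1
    return counts


# Sample input: NMP lines taken from the excerpts above (explanatory notes removed).
SAMPLE = [
    '2025-03-20T11:34:51.093Z cpu39:2098225)NMP: nmp_ThrottleLogForDevice:3867: Cmd 0x2a (0x45d92728de88, 16479068) to dev "naa.600a09803830454f572b4c57494a4269" on path "vmhba2:C0:T0:L1" Failed:',
    '2025-03-20T11:34:51.093Z cpu39:2098225)NMP: nmp_ThrottleLogForDevice:3875: H:0x2 D:0x0 P:0x0 . Act:EVAL. cmdId.initiator=0x430794f34180 CmdSN 0x1ef8b87',
    '2025-03-20T11:34:51.665Z cpu39:2098225)NMP: nmp_ThrottleLogForDevice:3867: Cmd 0x2a (0x45d92bdfae88, 2097233) to dev "naa.600a09803830454f572b4c57494a4269" on path "vmhba2:C0:T0:L1" Failed:',
    '2025-03-20T11:34:51.665Z cpu39:2098225)NMP: nmp_ThrottleLogForDevice:3875: H:0x5 D:0x0 P:0x0 . Act:EVAL. cmdId.initiator=0x430794f34180 CmdSN 0x1ef8b8e',
    '2025-03-20T09:49:30.574Z cpu45:2097279)NMP: nmp_ThrottleLogForDevice:3867: Cmd 0x2a (0x45d927208c88, 2099942) to dev "naa.60050768108102277000000000000007" on path "vmhba2:C0:T2:L0" Failed:',
    '2025-03-20T09:49:30.574Z cpu45:2097279)NMP: nmp_ThrottleLogForDevice:3875: H:0x8 D:0x0 P:0x0 .Act:EVAL. cmdId.initiator=0x4322ea201220 CmdSN 0x11',
]

if __name__ == "__main__":
    for (adapter, status), count in sorted(tally_host_status(SAMPLE).items()):
        note = HOST_STATUS.get(status, "not described in this article")
        print(f"{adapter} H:{status} x{count}: {note}")
```

To run the tally against a real log, pass an open file object instead of SAMPLE, for example the result of open("/var/run/log/vmkernel.log"). If nearly all failures fall on a single adapter, as with vmhba2 in the excerpts, that is consistent with a fault on that adapter's physical path, but it does not confirm one by itself.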

Resolution

Engage the fabric vendor to investigate and resolve faults in the Fibre Channel infrastructure, including cabling, SFPs, and interconnects.