ESXi 8.0U3 & Later -- Storage vMotion stays at 33% and then times out with errors -- "Connection closed by remote host, possibly due to timeout." / Migration failed after VM memory precopy
book
Article ID: 395119
calendar_today
Updated On:
Feedback
Subscribe
Products
VMware vSphere ESXi
VMware vSphere ESXi 8.0
VMware vSphere ESX 8.x
Show More
Show Less
Issue/Introduction
Storage vMotion fails with one or both of the following errors:
Failed to read stream keepalive: Connection closed by remote host, possibly due to timeout. Failed to establish transport connection.
Migration failed after VM memory precopy. Please check vmkernel log for true error. Timed out waiting for migration data.
Failed vMotion is having as Target Datastore one of the following:
NVME Over TCP Pure Storage FlashArray
NVMe Over FC Pure FlashArray
Symptoms - One or more do apply:
Storage vMotion not failed on other Datastore with the same backend storage.
Storage vMotion succeeded on the same Datastore on previous installed Build ESXi 8.0U2
In the vmware.log of the affected VM, you observe similar messages as the following:
YYYY-MM-DDTHH:MM:SS.XXXZ In(05) worker-3046319 - Disk/File copy started for /vmfs/volumes/<uuid-of-datastore>/<VM name>/<VM name>.vmdk.
YYYY-MM-DDTHH:MM:SS.XXXZ Wa(03) vmx - SVMotion: scsi0:0: Disk transfer rate slow: 0 kB/s over the last 10.00 seconds, copied total 0 MB at 0 kB/s.
YYYY-MM-DDTHH:MM:SS.XXXZ In(05) worker-3046545 - Migrate: Remote Log: Destination waited for 130.23 seconds.
YYYY-MM-DDTHH:MM:SS.XXXZ In(05) vmx - [msg.migrate.fail.source.afterPrecopy] Migration failed after VM memory precopy. Please check vmkernel log for true error.
In vmkernel.log, you can observe similar message as the following:
YYYY-MM-DDTHH:MM:SS.XXXZ Wa(180) vmkwarning: cpu39:2104613)WARNING: NVMEIO:2644 command 0x45ba31874cc0 failed: ctlr 262, queue 6, psaCmd 0x45ba38ae5840,status 0x6 , opc 0x19 , cid 45, nsid 36
opc 0x19: XCOPY command
status 0x6: Storage controller returned to XCOPY command, which means GC internal error and is retriable.
Cause
The issue is related to Storage Level Replication on PURE Storage.
The Storage controller (= vmhba) will return an invalid opcode if the Backend Storage (= PURE Storage) does not support XCOPY.
This will inform ESXi to retry I/O without XCOPY.
Additional Information
"XCOPY is one of the VAAI primitives that is used for offloading tasks to the storage array.
For example, you can use XCOPY to offload such operations as migration or cloning of virtual machines to the array
Disabling XCOPY will result in losing these benefits.
Feedback
thumb_up
Yes
thumb_down
No