vMotion fails at 19% with error 'Operation Timed out' due to fail to lock vmx file
search cancel

vMotion fails at 19% with error 'Operation Timed out' due to fail to lock vmx file

book

Article ID: 419305

calendar_today

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • Virtual machine vMotion fails at 19% and the task error is 'Operation Timed out'
  • The source host /var/run/log/hostd.log:

info hostd[xxxx] [Originator@xxxx sub=Vmsvc.vm:/vmfs/volumes/<datastore-uuid>/<vm-name>/<vm-name>.vmx opID=lro-<opid>] VigorMigrateNotifyCb:: hostlog state changed from emigrating to failure
info hostd[xxxx] [Originator@xxxx sub=Vcsvc.VMotionSrc.xxxxxxxxx opID=lro-<opid>] ResolveCb: VMX reports needsUnregister = false for migrateType MIGRATE_TYPE_VMOTION
info hostd[xxxx] [Originator@xxxx sub=Vcsvc.VMotionSrc.xxxxxxxxx opID=lro-<opid>] ResolveCb: Failed with fault: (vim.fault.GenericVmConfigFault) {
-->    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          key = "msg.migrate.expired",
-->          message = "Timed out waiting for migration start request.
--> ",
-->       }
-->    ],
-->    reason = "Timed out waiting for migration start request.
--> ",
-->    msg = "Timed out waiting for migration start request.
--> "
--> }

  • The destination host /var/run/log/vmkernel.log:

cpu9:xxxx opID=<opid>)DLX: 2651: vol '<Datastore-name>', lock at xxxx: Lock type: 10C00001. owner(s) MAC: xx:xx:xx:xx:xx:xx:
cpu9:xxxx opID=<opid>)[type 10c00001 offset 424550400 v 240, hb offset 3571712
gen 51, mode 1, owner xxxxxxxx-xxxxxxxx-xxxx-xxxxxxxxxxxx mtime 4230711
num 0 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
cpu9:xxxx opID=<opid>)Fil3: 5037: Lock failed on file: <vm-name>.vmx on vol '<Datastore-name>' with FD: <FD c38 r65>

  • The destination host /var/run/log/hostd.log:

warning hostd[xxxx] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/<datastore-uuid>/<vm-name>/<vm-name>.vmx opID=<opid> user=vpxuser:<user-name>] GetWid: cancel failed: N3Vim5Fault12InvalidState9ExceptionE(Fault cause: vim.fault.InvalidState
warning hostd[xxxx] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/<datastore-uuid>/<vm-name>/<vm-name>.vmx opID=<opid> user=vpxuser:<user-name>] GetWid: Launch failed with fault: (vim.fault.Timedout) {
-->    msg = "",
--> }
error hostd[xxxx] [Originator@6876 sub=Vcsvc.VMotionDst.xxxxxxxxxx opID=<opid> user=vpxuser:<user-name>] Initiate: Failed to get the WID

  • The virtual machine /vmfs/volumes/datastore-uuid/vm-name/vmware.log:

In(05) vmx - MigrateWriteHostLog: Writing to log file took 672230 us.
In(05) vmx - MigrateSetStateFinished: type=1 new state=6
In(05) vmx - MigrateSetState: Transitioning from state 1 to 6.
No(00) vmx - ConfigDB: Setting config.readOnly = "FALSE"
In(05) vmx - Migrate_SetFailureMsgList: switching to new log file.
In(05) vmx - Migrate_SetFailureMsgList: Now in new log file.
In(05) vmx - Migrate: Caching migration error message list:
In(05) vmx - [msg.migrate.expired] Timed out waiting for migration start request.
In(05) vmx - Migrate: cleaning up migration state.
In(05) vmx - MigrateSetState: Transitioning from state 6 to 0.

  • VMs in another datastore do not have such issue.
  • Power off also fails not allowing the possibility to use cold migration

Environment

7.x

Cause

A storage-level lock (ATS or SCSI-2) on the .vmx file is held or contested, exceeding the 90-second timeout for migration start requests.

Resolution

 

  1. Perform a Storage vMotion to migrate the virtual machine to a different datastore while powered on.

  2. Log in to the ESXi host CLI as root

  3. Obtain the VM ID:
    vim-cmd vmsvc/getallvms

    Note: Take node of the VM's ID

  4. Execute the power off: 
    vim-cmd vmsvc/power.off <ID>
  5. Reattempt now the vmotion
  6. Power on the VM

 

 

Additional Information

If Storage vMotion also fails, please contact your storage vendor to investigate potential VAAI/ATS issues on the underlying array.