vMotion fails at 19% with error 'Operation Timed out' due to fail to lock vmx file
search cancel

vMotion fails at 19% with error 'Operation Timed out' due to fail to lock vmx file

book

Article ID: 419305

calendar_today

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • Virtual machine vMotion fails at 19% and the task error is 'Operation Timed out'
  • The source host /var/run/log/hostd.log:

info hostd[xxxx] [Originator@xxxx sub=Vmsvc.vm:/vmfs/volumes/<datastore-uuid>/<vm-name>/<vm-name>.vmx opID=lro-<opid>] VigorMigrateNotifyCb:: hostlog state changed from emigrating to failure
info hostd[xxxx] [Originator@xxxx sub=Vcsvc.VMotionSrc.xxxxxxxxx opID=lro-<opid>] ResolveCb: VMX reports needsUnregister = false for migrateType MIGRATE_TYPE_VMOTION
info hostd[xxxx] [Originator@xxxx sub=Vcsvc.VMotionSrc.xxxxxxxxx opID=lro-<opid>] ResolveCb: Failed with fault: (vim.fault.GenericVmConfigFault) {
-->    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          key = "msg.migrate.expired",
-->          message = "Timed out waiting for migration start request.
--> ",
-->       }
-->    ],
-->    reason = "Timed out waiting for migration start request.
--> ",
-->    msg = "Timed out waiting for migration start request.
--> "
--> }

  • The destination host /var/run/log/vmkernel.log:

cpu9:xxxx opID=<opid>)DLX: 2651: vol '<Datastore-name>', lock at xxxx: Lock type: 10C00001. owner(s) MAC: xx:xx:xx:xx:xx:xx:
cpu9:xxxx opID=<opid>)[type 10c00001 offset 424550400 v 240, hb offset 3571712
gen 51, mode 1, owner xxxxxxxx-xxxxxxxx-xxxx-xxxxxxxxxxxx mtime 4230711
num 0 gblnum 0 gblgen 0 gblbrk 0] alloc owner 0
cpu9:xxxx opID=<opid>)Fil3: 5037: Lock failed on file: <vm-name>.vmx on vol '<Datastore-name>' with FD: <FD c38 r65>

  • The destination host /var/run/log/hostd.log:

warning hostd[xxxx] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/<datastore-uuid>/<vm-name>/<vm-name>.vmx opID=<opid> user=vpxuser:<user-name>] GetWid: cancel failed: N3Vim5Fault12InvalidState9ExceptionE(Fault cause: vim.fault.InvalidState
warning hostd[xxxx] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/<datastore-uuid>/<vm-name>/<vm-name>.vmx opID=<opid> user=vpxuser:<user-name>] GetWid: Launch failed with fault: (vim.fault.Timedout) {
-->    msg = "",
--> }
error hostd[xxxx] [Originator@6876 sub=Vcsvc.VMotionDst.xxxxxxxxxx opID=<opid> user=vpxuser:<user-name>] Initiate: Failed to get the WID

  • The virtual machine /vmfs/volumes/datastore-uuid/vm-name/vmware.log:

In(05) vmx - MigrateWriteHostLog: Writing to log file took 672230 us.
In(05) vmx - MigrateSetStateFinished: type=1 new state=6
In(05) vmx - MigrateSetState: Transitioning from state 1 to 6.
No(00) vmx - ConfigDB: Setting config.readOnly = "FALSE"
In(05) vmx - Migrate_SetFailureMsgList: switching to new log file.
In(05) vmx - Migrate_SetFailureMsgList: Now in new log file.
In(05) vmx - Migrate: Caching migration error message list:
In(05) vmx - [msg.migrate.expired] Timed out waiting for migration start request.
In(05) vmx - Migrate: cleaning up migration state.
In(05) vmx - MigrateSetState: Transitioning from state 6 to 0.

  • VMs in another datastore do not have such issue.

Environment

VMware vSphere ESXi

Cause

Lock failure caused the IO commands of vMotion using more than 90s. The time 90s is that vMotion can wait for the destination virtual machine starting up.  

Resolution

Engage storage vendor to check the storage.