A Virtual Machine doesn't resume after a failed vMotion with "Failed to reopen NVRAM: Failed to lock the file"

Article ID: 323590

Updated On:

Products

VMware

Issue/Introduction

Symptoms:
A virtual machine is powered off during a failed vMotion and reports the error message: "Failed to reopen NVRAM: Failed to lock the file"

ESXi log files might contain messages similar to the following:

hostd.log
2021-01-13T11:08:10.412Z verbose hostd[1001395013] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] Handling vmx message 646221: An operation required the virtual machine to quiesce and the virtual machine was unable to continue running.
-->
2021-01-13T11:08:10.412Z warning hostd[1001395013] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] Failed to find activation record, event user unknown.
2021-01-13T11:08:10.413Z verbose hostd[1001395013] [Originator@6876 sub=PropertyProvider] RecordOp REMOVE: latestPage[2750], session[528fca44-41f4-d6eb-15ab-9d76eeeba473]52572d99-d641-9e03-c2c5-c7920f13af24. Applied change to temp map.
2021-01-13T11:08:10.413Z verbose hostd[1001395013] [Originator@6876 sub=PropertyProvider] RecordOp ADD: latestPage[2760], session[528fca44-41f4-d6eb-15ab-9d76eeeba473]52572d99-d641-9e03-c2c5-c7920f13af24. Applied change to temp map.
2021-01-13T11:08:10.413Z verbose hostd[1001395013] [Originator@6876 sub=PropertyProvider] RecordOp ASSIGN: latestEvent, ha-eventmgr. Applied change to temp map.
2021-01-13T11:08:10.413Z info hostd[1001395013] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 2760 : Error message on virtual_machine0001 on esxi001.lab.vmware.local in ha-datacenter: An operation required the virtual machine to quiesce and the virtual machine was unable to continue running.
2021-01-13T11:08:10.420Z verbose hostd[1001395004] [Originator@6876 sub=Vigor.Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] VMotionInitiateSrc: Start message: The source detected that the destination failed to resume.
-->
2021-01-13T11:08:10.421Z info hostd[1001393931] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] Answered question 646221
2021-01-13T11:08:10.421Z verbose hostd[1001393931] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] VMotionStatusCb [7030278288346012248]: Failed with error [N3Vim5Fault20GenericVmConfigFaultE:0x000000e6a4232870]
2021-01-13T11:08:10.421Z verbose hostd[1001393931] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] VMotionStatusCb: Firing ResolveCb
2021-01-13T11:08:10.421Z info hostd[1001393931] [Originator@6876 sub=Vcsvc.VMotionSrc.7030278288346012248] ResolveCb: VMX reports needsUnregister = false for migrateType MIGRATE_TYPE_VMOTION
2021-01-13T11:08:10.421Z info hostd[1001393931] [Originator@6876 sub=Vcsvc.VMotionSrc.7030278288346012248] ResolveCb: Failed with fault: (vim.fault.GenericVmConfigFault) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          key = "msg.migrate.fail.dst",
-->          arg = <unset>,
-->          message = "The source detected that the destination failed to resume."
-->       }
-->    ],
-->    reason = "The source detected that the destination failed to resume."
-->    msg = "The source detected that the destination failed to resume.
--> "
--> }
2021-01-13T11:08:10.422Z verbose hostd[1001393931] [Originator@6876 sub=Vcsvc.VMotionSrc.7030278288346012248] Migration changed state from MIGRATING to DONE
2021-01-13T11:08:10.422Z verbose hostd[1001393931] [Originator@6876 sub=Vcsvc.VMotionSrc.7030278288346012248] Finish called

vmware.log
2020-10-16T00:41:14.384Z| vmx| I125: [msg.migrate.fail.dst] The source detected that the destination failed to resume.
2020-10-16T00:41:14.384Z| vmx| I125: Migrate: Attempting to continue running on the source.
2020-10-16T00:41:14.384Z| vcpu-0| I125: DISK: OPEN scsi0:0 '/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmdk' persistent R[]
2020-10-16T00:41:14.384Z| vcpu-0| I125: DISK: failed to create nomad vob context.

[...]

2020-10-16T00:50:06.972Z| vcpu-0| I125: NVRAMMGR: NvmanReopen: Failed to reopen NVRAM
2020-10-16T00:50:06.972Z| vcpu-0| I125: NVRAMMGR: Could not write to nvram file virtual_machine0001.nvram. Setting nvram to non-persistent.
2020-10-16T00:50:06.972Z| vcpu-0| I125: Msg_Post: Error
2020-10-16T00:50:06.972Z| vcpu-0| I125: [msg.fileio.lock] Failed to lock the file
2020-10-16T00:50:06.972Z| vcpu-0| I125: [msg.disk.noBackEnd] Cannot open the disk '/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmdk' or one of the snapshot disks it depends on.
2020-10-16T00:50:06.972Z| vcpu-0| I125: [msg.checkpoint.continuesync.error] An operation required the virtual machine to quiesce and the virtual machine was unable to continue running.
2020-10-16T00:50:06.972Z| vcpu-0| I125: ----------------------------------------
2020-10-16T00:50:07.482Z| vcpu-0| W115: Migrate: Trying to 'unstun' when not stunned!
2020-10-16T00:50:07.482Z| vcpu-0| I125: Migrate: cleaning up migration state.
2020-10-16T00:50:07.482Z| vcpu-0| I125: VigorTransport_ServerSendResponse opID=lro-1-57365a4c-32100-03-01-01-28-0831 seq=8715930: Completed Migrate request.
2020-10-16T00:50:07.482Z| vcpu-0| I125: Migrate: Final status reported through Vigor.
2020-10-16T00:50:07.482Z| vcpu-0| I125: MigrateSetState: Transitioning from state 6 to 0.
2020-10-16T00:50:07.482Z| vmx| I125: Stopping VCPU threads...

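The signature strings in the excerpts above can be located quickly by scanning copies of the affected log files. The following is an illustrative sketch only; the script name, file paths, and pattern list are examples and are not part of this article:

#!/usr/bin/env python3
# Illustrative sketch only: scan copied ESXi log files for the messages
# shown in the excerpts above. The paths and pattern list are examples.
import sys

# Signature strings taken from the hostd.log and vmware.log excerpts above.
PATTERNS = (
    "Failed to reopen NVRAM",
    "Failed to lock the file",
    "The source detected that the destination failed to resume",
    "An operation required the virtual machine to quiesce",
)

def scan(path):
    # Print every line that contains one of the signature strings.
    with open(path, errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if any(pattern in line for pattern in PATTERNS):
                print("%s:%d: %s" % (path, number, line.rstrip()))

if __name__ == "__main__":
    # Example usage: python3 scan_logs.py vmware.log hostd.log
    for log_file in sys.argv[1:]:
        scan(log_file)
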

Cause

This issue is caused by a race condition during the failed vMotion: the destination host still holds the lock on the virtual machine's files at the moment the source host tries to reopen them, when it should already have released it. The condition is exceedingly rare and, at worst, results in the virtual machine powering off.
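
The sequence can be illustrated in isolation. The sketch below is not the ESXi implementation; it only shows, using POSIX advisory locks, how a reopen attempt fails while another party still holds an exclusive lock on the same file. The file path is a placeholder.

# Simplified illustration only - NOT the ESXi implementation. It shows how a
# reopen attempt fails while another party still holds an exclusive lock,
# which is the race described above. The file path is a placeholder.
import fcntl

NVRAM = "/tmp/example.nvram"  # stand-in for the virtual machine's NVRAM file

# "Destination": still holds an exclusive lock it should already have released.
destination = open(NVRAM, "a+")
fcntl.flock(destination, fcntl.LOCK_EX)

# "Source": tries to continue running and reopen the same file.
source = open(NVRAM, "a+")
try:
    fcntl.flock(source, fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    # Corresponds to "Failed to reopen NVRAM: Failed to lock the file".
    print("reopen failed: the file is still locked by the destination")
finally:
    fcntl.flock(destination, fcntl.LOCK_UN)  # the late release that ends the race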

Resolution

This issue is resolved in:
VMware vSphere ESXi 6.7 P05 (ESXi670-202103001)
VMware vSphere ESXi 7.0 GA
To download, go to Customer Connect Downloads.
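
Once the patched builds are installed, the host versions can be confirmed programmatically. The sketch below is an example only, not part of the official resolution: it assumes pyVmomi is installed and uses placeholder vCenter credentials.

# Illustrative sketch only: list ESXi host versions and builds via pyVmomi
# to confirm the patched releases above are in place. The vCenter address
# and credentials are placeholders; install pyVmomi with "pip install pyvmomi".
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

context = ssl._create_unverified_context()  # lab use only; prefer valid certificates
si = SmartConnect(host="vcenter.lab.local",
                  user="administrator@vsphere.local",
                  pwd="password",
                  sslContext=context)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        product = host.summary.config.product  # vim.AboutInfo
        print("%s: %s (build %s)" % (host.name, product.fullName, product.build))
finally:
    Disconnect(si)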