A Virtual Machine doesn't resume after a failed vMotion with "Failed to reopen NVRAM: Failed to lock the file"

Article ID: 323590

Updated On: 11-26-2024

Products

VMware Aria Suite

Issue/Introduction


A virtual machine powers off after a failed vMotion instead of resuming on the source host, and reports the error message: "Failed to reopen NVRAM: Failed to lock the file"

ESXi log files might contain messages similar to the following:

hostd.log

[YYYY-MM-DDTHH:MM:SS] verbose hostd[1001395013] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] Handling vmx message 646221: An operation required the virtual machine to quiesce and the virtual machine was unable to continue running.
-->
[YYYY-MM-DDTHH:MM:SS] warning hostd[1001395013] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] Failed to find activation record, event user unknown.
[YYYY-MM-DDTHH:MM:SS] verbose hostd[1001395013] [Originator@6876 sub=PropertyProvider] RecordOp REMOVE: latestPage[2750], session[528fca44-41f4-d6eb-15ab-9d76eeeba473]52572d99-d641-9e03-c2c5-c7920f13af24. Applied change to temp map.
[YYYY-MM-DDTHH:MM:SS] verbose hostd[1001395013] [Originator@6876 sub=PropertyProvider] RecordOp ADD: latestPage[2760], session[528fca44-41f4-d6eb-15ab-9d76eeeba473]52572d99-d641-9e03-c2c5-c7920f13af24. Applied change to temp map.
[YYYY-MM-DDTHH:MM:SS] verbose hostd[1001395013] [Originator@6876 sub=PropertyProvider] RecordOp ASSIGN: latestEvent, ha-eventmgr. Applied change to temp map.
[YYYY-MM-DDTHH:MM:SS] info hostd[1001395013] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 2760 : Error message on virtual_machine0001 on esxi001.lab.vmware.local in ha-datacenter: An operation required the virtual machine to quiesce and the virtual machine was unable to continue running.
[YYYY-MM-DDTHH:MM:SS] verbose hostd[1001395004] [Originator@6876 sub=Vigor.Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] VMotionInitiateSrc: Start message: The source detected that the destination failed to resume.
-->
[YYYY-MM-DDTHH:MM:SS] info hostd[1001393931] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] Answered question 646221
[YYYY-MM-DDTHH:MM:SS] verbose hostd[1001393931] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] VMotionStatusCb [7030278288346012248]: Failed with error [N3Vim5Fault20GenericVmConfigFaultE:0x000000e6a4232870]
[YYYY-MM-DDTHH:MM:SS] verbose hostd[1001393931] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmx] VMotionStatusCb: Firing ResolveCb
[YYYY-MM-DDTHH:MM:SS] info hostd[1001393931] [Originator@6876 sub=Vcsvc.VMotionSrc.7030278288346012248] ResolveCb: VMX reports needsUnregister = false for migrateType MIGRATE_TYPE_VMOTION
[YYYY-MM-DDTHH:MM:SS] info hostd[1001393931] [Originator@6876 sub=Vcsvc.VMotionSrc.7030278288346012248] ResolveCb: Failed with fault: (vim.fault.GenericVmConfigFault) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          key = "msg.migrate.fail.dst",
-->          arg = <unset>,
-->          message = "The source detected that the destination failed to resume."
-->       }
-->    ],
-->    reason = "The source detected that the destination failed to resume."
-->    msg = "The source detected that the destination failed to resume.
--> "
--> }
[YYYY-MM-DDTHH:MM:SS] verbose hostd[1001393931] [Originator@6876 sub=Vcsvc.VMotionSrc.7030278288346012248] Migration changed state from MIGRATING to DONE
[YYYY-MM-DDTHH:MM:SS] verbose hostd[1001393931] [Originator@6876 sub=Vcsvc.VMotionSrc.7030278288346012248] Finish called


vmware.log

[YYYY-MM-DDTHH:MM:SS]| vmx| I125: [msg.migrate.fail.dst] The source detected that the destination failed to resume.
[YYYY-MM-DDTHH:MM:SS]| vmx| I125: Migrate: Attempting to continue running on the source.
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: DISK: OPEN scsi0:0 '/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmdk' persistent R[]
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: DISK: failed to create nomad vob context.

[...]

[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: NVRAMMGR: NvmanReopen: Failed to reopen NVRAM
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: NVRAMMGR: Could not write to nvram file virtual_machine0001.nvram. Setting nvram to non-persistent.
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: Msg_Post: Error
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: [msg.fileio.lock] Failed to lock the file
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: [msg.disk.noBackEnd] Cannot open the disk '/vmfs/volumes/588756dg-976h8h54-2eg6-d076504019b0/virtual_machine0001/virtual_machine0001.vmdk' or one of the snapshot disks it depends on.
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: [msg.checkpoint.continuesync.error] An operation required the virtual machine to quiesce and the virtual machine was unable to continue running.
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: ----------------------------------------
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| W115: Migrate: Trying to 'unstun' when not stunned!
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: Migrate: cleaning up migration state.
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: VigorTransport_ServerSendResponse opID=lro-1-57365a4c-32100-03-01-01-28-0831 seq=8715930: Completed Migrate request.
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: Migrate: Final status reported through Vigor.
[YYYY-MM-DDTHH:MM:SS]| vcpu-0| I125: MigrateSetState: Transitioning from state 6 to 0.
[YYYY-MM-DDTHH:MM:SS]| vmx| I125: Stopping VCPU threads...
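
To check a vmware.log for this pattern quickly, the signature strings from the excerpt above can be searched for together. The following is a minimal sketch, not a supported tool: the function names and the choice of which messages count as the signature are assumptions made for illustration.

```python
from typing import Dict, List

# Signature strings copied from the vmware.log excerpt in this article.
SIGNATURES = {
    "migrate_failed": "[msg.migrate.fail.dst]",
    "nvram_reopen_failed": "NvmanReopen: Failed to reopen NVRAM",
    "file_lock_failed": "[msg.fileio.lock] Failed to lock the file",
    "disk_open_failed": "[msg.disk.noBackEnd]",
}


def find_signatures(log_text: str) -> Dict[str, List[int]]:
    """Return the 1-based line numbers on which each signature appears."""
    hits: Dict[str, List[int]] = {name: [] for name in SIGNATURES}
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        for name, needle in SIGNATURES.items():
            if needle in line:
                hits[name].append(lineno)
    return hits


def matches_issue(log_text: str) -> bool:
    """Heuristic (an assumption): the failed-migration, NVRAM reopen,
    and file lock messages must all be present in the same log."""
    hits = find_signatures(log_text)
    required = ("migrate_failed", "nvram_reopen_failed", "file_lock_failed")
    return all(hits[name] for name in required)
```

A typical use is to read the virtual machine's vmware.log into a string (for example with `open(path).read()`, where `path` is wherever the log was copied to) and pass it to `matches_issue`. A match only indicates that the log resembles the excerpt above; it does not by itself confirm the root cause.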



Cause

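This issue is caused by a race condition: when the vMotion fails and the source host tries to resume the virtual machine, the destination host still holds the file lock that it should already have released. The condition is expected to be exceedingly rare and, at worst, results in the virtual machine powering off.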
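
The timing problem can be pictured with a toy model. This sketch uses an in-process `threading.Lock` purely as a stand-in for the on-disk file lock; it is not how ESXi or VMFS implements locking, and every name in it is hypothetical.

```python
import threading

# Toy stand-in for the file lock on the VM's files; not an ESXi or VMFS API.
nvram_lock = threading.Lock()


def source_try_reopen() -> bool:
    """The source host tries to take the lock again without waiting."""
    acquired = nvram_lock.acquire(blocking=False)
    if acquired:
        nvram_lock.release()
    return acquired


# The destination took the lock during the migration and has not released it.
nvram_lock.acquire()
reopen_during_race = source_try_reopen()  # fails: the lock is still held

# Once the destination releases the lock, the same attempt succeeds.
nvram_lock.release()
reopen_after_release = source_try_reopen()
```

In this model the first attempt corresponds to the "Failed to lock the file" message: the source retries at a moment when the other side has not yet let go.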

Resolution

This issue is resolved in the following releases:
VMware vSphere ESXi 6.7 P05 (ESXi670-202103001)
VMware vSphere ESXi 7.0 GA
To download a fixed release, go to the Broadcom products and software download page.