VMs powered off with error: <vm name> contained the host physical page ### which was scheduled for immediate retirement.

Article ID: 404366

Products

VMware vSphere ESXi

Issue/Introduction

  •  VMs shut down unexpectedly.
  •  Events display messages like:

Description: <vm name> contained the host physical page ### which was scheduled for immediate retirement. To avoid system instability the virtual machine is forcefully powered off.

Event ID: esx.problem.vm.kill.unexpected.forcefulPageRetire.64.2

  • The host daemon log, /var/run/log/hostd.log, contains entries similar to:
    [timestamp] In(166) Hostd[2099322]: [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/########-########-####-############/##-###-#####/##-###-#####.vmx] Deferring power opcompletion until VM is at a stable state
    [timestamp] In(166) Hostd[2098975]: --> eventTypeId = "esx.problem.vm.kill.unexpected.forcefulPageRetire.64.2",
    [timestamp] In(166) Hostd[2098975]: --> arguments = (vmodl.KeyAnyValue) [
    [timestamp] In(166) Hostd[2098975]: --> (vmodl.KeyAnyValue) {
    [timestamp] In(166) Hostd[2099324]: [Originator@6876 sub=Vimsvc.##-########] Event 29630 : ###-###-####-####-##.3.1 contained the host physical page 4578239 which was scheduled for immediate retirement. To avoid system instability the virtual machine is forcefully powered off.

  • The VMkernel log, /var/run/log/vmkernel.log, contains entries similar to:
    [timestamp] Al(177) vmkalert: cpu35:19746176)ALERT: MCA: 191: UCNA Poll G0 B11 Sfc00cd00004000c2 A39ec78c80 M9000aa822088086 P39ec78c80/40 Memory Controller Scrubbing Error on Channel 2.
    [timestamp] In(182) vmkernel: cpu35:19746176)MCAIntel: 1362: Force retiring MPN 0x39ec78 to recover from MCA error detected by cpu35 in bank17.
    [timestamp] In(182) vmkernel: cpu32:2097368)NetPort: 1887: disabled port 0x4000024
    [timestamp] In(182) vmkernel: cpu63:2097235)NetPort: 708: Failed to acquire port non-exclusive lock 0x4000018[Failure].
    [timestamp] In(182) vmkernel: cpu32:2097368)Net: 3834: dissociate dvPort 1002 from port 0x4000024
    [timestamp] In(182) vmkernel: cpu32:2097368)Net: 3841: disconnected client from port 0x4000024

  • The VOB daemon log, /var/run/log/vobd.log, shows machine check exceptions similar to:
    [timestamp] In(14) vobd[2097814]: [cpuCorrelator] 5604332040761us: [vob.cpu.mce.log4] MCE bank 7: status:0x9c00004001010092 misc:0x200802c110801086 addr:0xffdf88c80 cpu:1 physAddr:0xffdf88c80 physSize:0x40 ceCount:0x1
    [timestamp] In(14) vobd[2097814]: [cpuCorrelator] 5604333223462us: [vob.cpu.mce.log4] MCE bank 7: status:0xdc00024001010092 misc:0x200805c2b0001086 addr:0x45e57fc80 cpu:23 physAddr:0x45e57fc80 physSize:0x40 ceCount:0x9
    [timestamp] In(14) vobd[2097814]: [VMCorrelator] 5604334936215us: [vob.vm.kill.unexpected.forcefulPageRetire.64] The virtual machine using the configuration file /vmfs/volumes/########-########-####-############/##-###-#####/##-###-#####.vmx contains the host physical page 0x39ec78 that was scheduled for immediate retirement. To avoid system instability, the virtual machine has been powered off.
    [timestamp] In(14) vobd[2097814]: [VMCorrelator] 5604914108597us: [esx.problem.vm.kill.unexpected.forcefulPageRetire.64.2] /vmfs/volumes/########-########-####-############/##-###-#####/##-###-#####.vmx contained the host physical page 0x39ec78 which was scheduled for immediate retirement. To avoid system instability the virtual machine is forcefully powered off.
    [timestamp] In(14) vobd[2097814]: [cpuCorrelator] 5604341340433us: [vob.cpu.mce.log4] MCE bank 17: status:0xcc005f00002000c2 misc:0x900222208088086 addr:0x10bfff1480 cpu:33 physAddr:0x10bfff1480 physSize:0x40 ceCount:0x17c
    [timestamp] In(14) vobd[2097814]: [pageretireCorrelator] 5604348367281us: [vob.pageretire.selectedmpnthreshold.host.exceeded] Number of MPNs selected for retirement is 4
    [timestamp] In(14) vobd[2097814]: [VMCorrelator] 5604348367305us: [vob.vm.kill.unexpected.forcefulPageRetire.64] The virtual machine using the configuration file /vmfs/volumes/########-########-####-############/###-###-####-####/###-###-####-####.vmx contains the host physical page 0x45dbbf that was scheduled for immediate retirement. To avoid system instability, the virtual machine has been powered off.
    [timestamp] In(14) vobd[2097814]: [VMCorrelator] 5604927462883us: [esx.problem.vm.kill.unexpected.forcefulPageRetire.64.2] /vmfs/volumes/########-########-####-############/###-###-####-####/###-###-####-####.vmx contained the host physical page 0x45dbbf which was scheduled for immediate retirement. To avoid system instability the virtual machine is forcefully powered off.
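
The same event appears in hostd.log with a decimal page number and in vobd.log with a hexadecimal one (4578239 is 0x45dbbf). As a minimal sketch, the affected VMs and pages could be pulled out of copies of either log with a few lines of Python. The regular expression is inferred only from the sample messages above, and the function name and sample paths are hypothetical; this is not a VMware-provided tool.

```python
import re

# Matches both message variants quoted above ("contains ..." and "contained ...").
# The page number is hexadecimal (0x...) in vobd.log and decimal in hostd.log.
PAGE_RETIRE = re.compile(
    r"(?P<vm>\S+)\s+contain(?:s|ed) the host physical page\s+"
    r"(?P<page>0x[0-9a-fA-F]+|\d+)"
)


def find_forced_poweroffs(lines):
    """Return sorted, de-duplicated (vm, page_number) pairs found in log lines."""
    hits = set()
    for line in lines:
        match = PAGE_RETIRE.search(line)
        if match is None:
            continue
        text = match.group("page")
        page = int(text, 16) if text.startswith("0x") else int(text)
        hits.add((match.group("vm"), page))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical, shortened sample lines in the same shape as the excerpts above.
    sample = [
        "[vob.vm.kill.unexpected.forcefulPageRetire.64] The virtual machine using "
        "the configuration file /vmfs/volumes/ds1/vm1/vm1.vmx contains the host "
        "physical page 0x39ec78 that was scheduled for immediate retirement.",
        "Event 29630 : vm2 contained the host physical page 4578239 which was "
        "scheduled for immediate retirement.",
    ]
    for vm, page in find_forced_poweroffs(sample):
        print(f"{vm}: page {page} ({page:#x})")
```

Normalizing every page number to an integer makes it straightforward to confirm that a hostd.log event and a vobd.log event refer to the same page.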

Environment

VMware vSphere ESXi 7.0.x
VMware vSphere ESXi 8.0.x

Cause

The host experienced a recoverable Machine Check Exception (MCE) that did not escalate to a full host crash. The error was isolated to a specific memory page, which ESXi scheduled for immediate retirement. Because a running VM was using that page, ESXi forcefully powered off the VM to prevent data corruption and system instability.
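
The status values in the log samples can be decoded to see why these errors count as recoverable. The sketch below assumes the values follow the IA32_MCi_STATUS register layout described in the Intel Software Developer's Manual, that the processor supports software error recovery, and that the value after "S" in the vmkernel.log ALERT is that status register. How ESXi classifies errors internally is not stated in this article, and the function names are illustrative only.

```python
# Flag bit positions per the IA32_MCi_STATUS layout in the Intel SDM.
FLAG_BITS = {"VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
             "MISCV": 59, "ADDRV": 58, "PCC": 57, "S": 56, "AR": 55}

# Request types (MMM) for memory controller error codes: 000F 0000 1MMM CCCC.
MEM_REQUEST = {0: "generic undefined request", 1: "memory read error",
               2: "memory write error", 3: "address/command error",
               4: "memory scrubbing error"}


def classify(flags):
    """Map the UC/PCC/S/AR flags to the architectural error class."""
    if not flags["VAL"]:
        return "invalid"
    if not flags["UC"]:
        return "corrected"
    if flags["PCC"]:
        return "uncorrected, processor context corrupt"
    if not flags["S"]:
        return "UCNA"  # uncorrected, no action required
    return "SRAR" if flags["AR"] else "SRAO"


def decode_mci_status(status):
    """Decode a 64-bit machine check status value into a small dict."""
    flags = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}
    code = status & 0xFFFF
    result = {
        "flags": flags,
        "class": classify(flags),
        "corrected_count": (status >> 38) & 0x7FFF,  # bits 52:38
        "mca_code": code,
    }
    # Ignore the filtering bit (bit 12) when testing for a memory controller code.
    if (code & 0xEF80) == 0x0080:
        result["memory_error"] = MEM_REQUEST.get((code >> 4) & 0x7, "reserved")
        result["channel"] = code & 0xF  # 0xF means the channel is not specified
    return result


if __name__ == "__main__":
    for value in (0xFC00CD00004000C2, 0x9C00004001010092):
        info = decode_mci_status(value)
        print(hex(value), info["class"], info.get("memory_error"), info.get("channel"))
```

Under these assumptions, the vmkernel.log value 0xfc00cd00004000c2 decodes as UCNA with a memory scrubbing error on channel 2, which agrees with the ALERT text. The vobd.log status values decode as corrected errors whose counts (0x1, 0x9, 0x17c) match the logged ceCount fields.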

Resolution

Engage the server hardware vendor to run hardware diagnostics on the affected host, focusing on the memory reported in the machine check entries.
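
When gathering information for the hardware vendor, it can help to summarize which MCE banks and physical addresses appear in vobd.log. The sketch below is one hypothetical way to do that from a copy of the log. The regular expression is based only on the sample lines in this article. The page shift of 12 assumes 4 KiB machine pages, which agrees with the vmkernel.log sample, where address 0x39ec78c80 corresponds to MPN 0x39ec78.

```python
import re
from collections import Counter

# Field names taken from the vob.cpu.mce.log4 sample lines in this article.
MCE = re.compile(
    r"MCE bank (?P<bank>\d+): status:(?P<status>0x[0-9a-fA-F]+)"
    r".*?physAddr:(?P<addr>0x[0-9a-fA-F]+)"
    r".*?ceCount:(?P<ce>0x[0-9a-fA-F]+)"
)
PAGE_SHIFT = 12  # assumption: 4 KiB machine pages


def summarize_mce(lines):
    """Return (events per bank, {machine page number: [(bank, ceCount), ...]})."""
    per_bank = Counter()
    pages = {}
    for line in lines:
        match = MCE.search(line)
        if match is None:
            continue
        bank = int(match.group("bank"))
        mpn = int(match.group("addr"), 16) >> PAGE_SHIFT
        per_bank[bank] += 1
        pages.setdefault(mpn, []).append((bank, int(match.group("ce"), 16)))
    return per_bank, pages


if __name__ == "__main__":
    # Shortened copies of the vobd.log sample lines above.
    sample = [
        "[vob.cpu.mce.log4] MCE bank 7: status:0x9c00004001010092 "
        "misc:0x200802c110801086 addr:0xffdf88c80 cpu:1 physAddr:0xffdf88c80 "
        "physSize:0x40 ceCount:0x1",
        "[vob.cpu.mce.log4] MCE bank 17: status:0xcc005f00002000c2 "
        "misc:0x900222208088086 addr:0x10bfff1480 cpu:33 physAddr:0x10bfff1480 "
        "physSize:0x40 ceCount:0x17c",
    ]
    per_bank, pages = summarize_mce(sample)
    for bank, count in sorted(per_bank.items()):
        print(f"bank {bank}: {count} event(s)")
    for mpn, events in sorted(pages.items()):
        print(f"MPN {mpn:#x}: {events}")
```

The summary only groups what the log already reports. Mapping a bank or address to a physical DIMM depends on the platform and is left to the vendor's diagnostics.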