HPE Gen10 servers fail with PSOD Machine Check Exception: Fatal (unrecoverable) MCE

Article ID: 345186

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:
  • HPE Gen10 servers running ESXi 6.0.x randomly fail with a PSOD: Machine Check Exception: Fatal (unrecoverable) MCE.
  • The PSOD backtrace looks similar to the following:
2017-10-01T02:40:01.688Z cpu50:55173)Backtrace for current CPU #50, worldID=55173, rbp=0x0
2017-10-01T02:40:01.688Z cpu50:55173)0x43947c29bcf8:[0x418039b0515a]Power_HaltPCPU@vmkernel#nover+0x1ee stack: 0x417ff9a83f20, 0x41804c9
2017-10-01T02:40:01.688Z cpu50:55173)0x43947c29bd48:[0x418039a12078]CpuSchedIdleLoopInt@vmkernel#nover+0x2f8 stack: 0x21aaa22dde9e0, 0x1
2017-10-01T02:40:01.688Z cpu50:55173)0x43947c29bdc8:[0x418039a157d3]CpuSchedDispatch@vmkernel#nover+0x16b3 stack: 0x43927eaa7100, 0x0, 0
2017-10-01T02:40:01.688Z cpu50:55173)0x43947c29bee8:[0x418039a16398]CpuSchedWait@vmkernel#nover+0x240 stack: 0x41003b4acde0, 0x0, 0xa000
2017-10-01T02:40:01.688Z cpu50:55173)0x43947c29bf68:[0x418039a164ea]CpuSched_VcpuHalt@vmkernel#nover+0x11e stack: 0xffffffff00002001, 0x
2017-10-01T02:40:01.688Z cpu50:55173)0x43947c29bfb8:[0x4180398ac529]VMMVMKCall_Call@vmkernel#nover+0x139 stack: 0x4180398ac074, 0x0, 0x4
2017-10-01T02:40:01.712Z cpu50:55173)ESC[45mESC[33;1mVMware ESXi 6.0.0 [Releasebuild-5572656 x86_64]ESC[0m
Machine Check Exception: Fatal (unrecoverable) MCE on PCPU50 in world 55173:vmm0:fvst-ca System has encountered a Hardware Error - Please contact the hardware vendor

 
  • An ESXi 6.0.x installation may also fail with an error, as shown in the screenshot.

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
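
To check a saved log or PSOD text dump for this symptom, the fatal MCE line shown above can be matched with a regular expression. The following is a minimal sketch, assuming the line has the same shape as the example excerpt. The function name `find_fatal_mce` and the field names are hypothetical and are not part of any VMware tool.

```python
import re

# Pattern modeled on the example PSOD line in this article (an assumption:
# other builds may word or wrap the message differently).
MCE_PATTERN = re.compile(
    r"Machine Check Exception: Fatal \(unrecoverable\) MCE "
    r"on PCPU(?P<pcpu>\d+) in world (?P<world_id>\d+):(?P<world_name>\S+)"
)


def find_fatal_mce(log_text):
    """Return one dict per fatal MCE line found in log_text."""
    return [
        {
            "pcpu": int(m.group("pcpu")),
            "world_id": int(m.group("world_id")),
            "world_name": m.group("world_name"),
        }
        for m in MCE_PATTERN.finditer(log_text)
    ]


# Sample input copied from the excerpt above.
sample = (
    "Machine Check Exception: Fatal (unrecoverable) MCE on PCPU50 in world "
    "55173:vmm0:fvst-ca System has encountered a Hardware Error - "
    "Please contact the hardware vendor"
)
print(find_fatal_mce(sample))
```

A match only confirms that the symptom text is present. It does not by itself confirm that the cause described in this article applies.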


Environment

VMware vSphere ESXi 6.0

Cause

ESXi was mapping 8 GB above the top of memory, which allowed the CPU to touch addresses above the top of memory and caused the failure.

Resolution

To resolve this issue, upgrade to VMware ESXi patch release ESXi-6.0.0-20171104001 or later.
For more information, refer to the HPE customer advisory.

Disclaimer: VMware is not responsible for the reliability of any data, opinions, advice, or statements made on third-party websites. Inclusion of such links does not imply that VMware endorses, recommends, or accepts any responsibility for the content of such sites.
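
To check whether a host is already at the fixed patch level, its image profile name (for example, as reported by `esxcli software profile get`) can be compared with the release named in the Resolution section. The following is a minimal sketch, assuming the numeric suffix of a 6.0.0 profile name sorts chronologically. The helper `has_fix` is hypothetical, and any profile name other than ESXi-6.0.0-20171104001 is used only as illustrative input.

```python
import re

# Fixed release taken from this article: ESXi-6.0.0-20171104001.
FIXED_VERSION = (6, 0, 0)
FIXED_SUFFIX = 20171104001

PROFILE_PATTERN = re.compile(r"ESXi-(\d+)\.(\d+)\.(\d+)-(\d+)")


def has_fix(profile_name):
    """Return True/False for 6.0.0 profiles, None for other versions.

    Assumption: within the 6.0.0 line, a larger numeric suffix means a
    later patch release. Other version lines are outside this article's
    scope, so the function does not guess for them.
    """
    m = PROFILE_PATTERN.search(profile_name)
    if m is None:
        raise ValueError("unrecognized image profile name: %r" % profile_name)
    version = tuple(int(part) for part in m.group(1, 2, 3))
    if version != FIXED_VERSION:
        return None
    return int(m.group(4)) >= FIXED_SUFFIX


print(has_fix("ESXi-6.0.0-20171104001-standard"))
```

Treat the result as a quick screen only, and confirm the installed patch level against the official release notes before relying on it.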


Workaround:
To work around this issue, go to the HPE RBSU (ROM-Based Setup Utility) and set the memory to be at 1 TB. Contact HPE Support for more information.