An ESXi host experienced a Machine check exception error and a PSOD, causing its virtual machines to shut down unexpectedly

Article ID: 389322

Products

VMware vSphere ESX 7.x

Issue/Introduction

  • The ESXi host crashes with a purple screen of death (PSOD) caused by a Machine Check Exception (MCE).

  • Memory controller errors are logged in /var/run/log/vmkernel.log on the affected ESXi host:
    YYYY-MM-DD THH:MM:SS cpu3:#######)MCA: 209: CE Intr G0 ################# ###### ###### ######### Memory Controller Read Error on Channel 0.
    MCA bank number: 0x7

  • Storage connectivity errors and virtual machine failures are logged in /var/run/log/vmkernel.log:
    YYYY-MM-DD THH:MM:SS Lost access to volume <volume UUID> due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
    YYYY-MM-DD THH:MM:SS [msg.hbacommon.locklost] The lock protecting '<vm-name>.vmdk' has been lost, possibly due to underlying storage issues. PANIC: Exiting because of failed disk operation.
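
To check a copy of vmkernel.log for these symptoms, a short script along the following lines may help. It is a minimal sketch, not a VMware tool: the function names (classify_line, summarize, summarize_file) are hypothetical, and the patterns are derived only from the sample messages shown above, so the exact wording in a real log may differ between ESXi builds.

```python
import re
from collections import Counter

# Patterns are based only on the sample messages in this article (assumption);
# adjust them if the log lines in your environment are worded differently.
PATTERNS = {
    "memory_controller_error": re.compile(r"MCA: \d+: .*Memory Controller .*Error"),
    "volume_access_lost": re.compile(r"Lost access to volume"),
    "vmdk_lock_lost": re.compile(r"msg\.hbacommon\.locklost"),
}


def classify_line(line):
    """Return the name of the first symptom pattern found in the line, or None."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines match each symptom pattern."""
    counts = Counter()
    for line in lines:
        name = classify_line(line)
        if name is not None:
            counts[name] += 1
    return counts


def summarize_file(path):
    """Summarize a log file, tolerating bytes that are not valid text."""
    with open(path, errors="replace") as handle:
        return summarize(handle)
```

Calling summarize_file("/var/run/log/vmkernel.log") on the host, or on a log extracted from a support bundle, returns a count per symptom. Memory controller errors appearing together with lost volume access or lost VMDK locks match the pattern described in this article.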

Environment

VMware vSphere ESXi 7.x

Cause

A memory controller failure can cause severe storage I/O errors on the host. These errors can make the ESXi host unstable or end in a PSOD, which shuts down the virtual machines running on it.

Resolution

Contact the hardware vendor to investigate the memory issue further.

For more information about MCE errors, refer to the following KB article:
ESXi Host Becomes Unresponsive Due to Memory Controller Errors Leading to Storage I/O Issues