Multiple Virtual Machine Crashes with Unrecoverable VMM Fault 12 or 13

Article ID: 427932

Products

VMware vSphere ESXi

Issue/Introduction

Virtual Machines (VMs) running on an ESXi host experience sudden crashes. The VMs enter a powered-off state or become unresponsive, reporting an unrecoverable Virtual Machine Monitor (VMM) fault.

You notice the following symptoms:

  • The <VM name>.log file contains entries similar to:

    YYYY-MM-DDT 23:08:16.064Z In(05) vcpu-5 - Msg_Post: Error
    YYYY-MM-DDT 23:08:16.064Z In(05) vcpu-5 - [msg.log.error.unrecoverable] VMware ESX unrecoverable error: (vcpu-5)
    YYYY-MM-DDT 23:08:16.064Z In(05)+ vcpu-5 - vcpu-1:VMM fault 12: src=MONITOR rip=0x################ regs=0x################ LBR stack=0x################

  • The vmkernel.log file on the ESXi host reports the following warning on the same physical CPU, but across different VMs (CPU 40 in the example below):

    YYYY-MM-DDT 23:08:05.001Z cpu40:########)WARNING: World: vm ########: ####: vmm#: <VM Name>:vcpu-#:VMM fault 12: src=MONITOR rip=0x################ regs=0x################ LBR stack=0x################

  • The host does not experience a PSOD (Purple Screen of Death).
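
The second symptom can be checked mechanically by grouping the vmkernel.log fault warnings by physical CPU and counting how many distinct VMs appear for each one. The following Python sketch is illustrative only and is not a VMware tool. The function names are hypothetical, and the pattern assumes that the masked fields in the sample line above (world ID, line number, vmm index) are plain decimal numbers.

```python
import re
from collections import defaultdict

# Pattern modelled on the vmkernel.log sample in this article.
# Assumption: the fields shown masked as "#" in the sample are decimal digits.
FAULT_RE = re.compile(
    r"cpu(?P<pcpu>\d+):\d+\)WARNING: World: vm \d+: \d+: "
    r"vmm\d+: (?P<vm>.+?):vcpu-\d+:VMM fault (?P<fault>\d+):"
)


def faults_by_pcpu(lines):
    """Return {physical_cpu_id: {vm_name: fault_count}} for matching lines."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[int(match.group("pcpu"))][match.group("vm")] += 1
    return {pcpu: dict(vms) for pcpu, vms in counts.items()}


def suspect_pcpus(lines, min_vms=2):
    """Return sorted physical CPU IDs whose faults span at least min_vms VMs."""
    grouped = faults_by_pcpu(lines)
    return sorted(pcpu for pcpu, vms in grouped.items() if len(vms) >= min_vms)


# Example use (the file path is an assumption; point it at a copy of the log):
#     with open("vmkernel.log") as log:
#         print(suspect_pcpus(log))
```

A physical CPU ID returned by suspect_pcpus matches the pattern described in this article. A CPU ID whose faults all come from one VM does not, and may instead point to a guest-level cause.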

Environment

VMware ESX

Cause

When VMM faults (such as fault 12 or 13) occur across multiple independent virtual machines but are consistently associated with the same physical CPU ID, it indicates that the physical processor is malfunctioning.

A VMM (Virtual Machine Monitor) fault indicates that a fault has occurred that caused a virtual CPU to enter the shutdown state. If the same fault had occurred outside of a virtual machine, it would have caused a PSOD on the host.

Guest OS drivers and applications can cause VMM faults as well.

Resolution

To resolve this issue, follow these steps:

  • Identify the physical processor: determine which physical CPU package corresponds to the CPU IDs reported in the logs.
  • Run the hardware vendor's comprehensive offline diagnostics suite on the affected ESXi host to confirm processor or cache errors.
  • Ensure the host's BIOS/UEFI and CPU microcode are at the latest levels recommended by the hardware vendor.
  • Engage the hardware vendor.
  • Replace the failing physical CPU package as identified by the vendor diagnostics.
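
For the first step, one possible approach is to capture the output of `esxcli hardware cpu list` on the host and look up the package for the CPU ID seen in the logs. The Python sketch below makes two assumptions that should be verified on the host: that the output lists each CPU as indented `Id:` and `Package Id:` key-value lines, and that the `cpu##` prefix in vmkernel.log corresponds to the `Id` value. The function name is hypothetical.

```python
def package_by_cpu_id(esxcli_output):
    """Map each CPU Id to its Package Id.

    Assumes text in which every CPU entry contains an "Id: N" line
    followed later by a "Package Id: M" line.
    """
    mapping = {}
    current_id = None
    for raw_line in esxcli_output.splitlines():
        key, sep, value = raw_line.strip().partition(":")
        if not sep:
            continue  # not a key-value line
        key, value = key.strip(), value.strip()
        if key == "Id":
            current_id = int(value)
        elif key == "Package Id" and current_id is not None:
            mapping[current_id] = int(value)
            current_id = None
    return mapping


# Example use (the file name is an assumption; it holds the saved command output):
#     with open("cpu_list.txt") as f:
#         packages = package_by_cpu_id(f.read())
#     print(packages.get(40))  # package for the CPU ID seen in the logs
```

The resulting package number can then be passed to the hardware vendor together with the diagnostic results.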

Additional Information

While VMM faults can occasionally be triggered by guest-level drivers or software, the concentration of faults on a specific physical CPU across different virtual machines points to a hardware-layer defect.