ESXi host log messages: ApeiPageRetire, Processing HEST GESB, or GEDE section type

Article ID: 413108

Products

VMware vSphere ESXi

Issue/Introduction

On an ESXi host, the log file vmkernel.log contains entries with the text "APEI", "ApeiPageRetire", "Processing HEST GESB", and/or "GEDE section type".

The ESXi host may have gone unresponsive or experienced other problems around the time these entries appear.

Example: 

vmkernel: cpu01:2345678) ApeiPageRetire: 730: Processing HEST GESB, severity 0x2, with 1 GEDE record(s)
vmkernel: cpu01:2345678) APEI: 346: GEDE section type GUID ########-####-####-####-############
vmkernel: cpu01:2345678) APEI: 347: errorSeverity=0x2, revision=0x300, validation=0x0, flags=0x1, dataLength=0xd0, timestamp=0x0
vmkernel: cpu01:2345678) APEI: 355: FRU GUID 00000000-0000-0000-0000-000000000000, Text ""
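To check a copy of vmkernel.log for these entries, a small script can pull the severity and GEDE record count out of each "Processing HEST GESB" line. The following is a minimal sketch, not a supported tool. The regular expression is modeled only on the example entry above, and the function name find_gesb_entries is hypothetical. Other ESXi builds may format the line differently.

```python
import re

# Pattern modeled on the example entry in this article (an assumption);
# the exact field layout may differ between ESXi builds.
GESB_RE = re.compile(
    r"ApeiPageRetire: \d+: Processing HEST GESB, "
    r"severity (0x[0-9a-fA-F]+), with (\d+) GEDE record\(s\)"
)


def find_gesb_entries(lines):
    """Return a (severity, gede_record_count) tuple for each matching line."""
    entries = []
    for line in lines:
        match = GESB_RE.search(line)
        if match:
            severity = int(match.group(1), 16)  # hex string such as "0x2"
            record_count = int(match.group(2))
            entries.append((severity, record_count))
    return entries


# Two of the example lines from this article: only the first one matches.
sample = [
    "vmkernel: cpu01:2345678) ApeiPageRetire: 730: Processing HEST GESB, "
    "severity 0x2, with 1 GEDE record(s)",
    "vmkernel: cpu01:2345678) APEI: 355: FRU GUID "
    '00000000-0000-0000-0000-000000000000, Text ""',
]
print(find_gesb_entries(sample))  # prints [(2, 1)]
```

To run it against a real log, pass the lines of an open file instead of the sample list, for example find_gesb_entries(open("vmkernel.log", errors="replace")).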

Under specific conditions, this behavior correlates with Physical CPU (PCPU) dispatch delays and storage I/O latency, and can include the following symptoms:

  • Guest VMs showing high CPU Co-stop or Ready time with low overall host utilization.
  • Unexpected vSphere HA reboots or "VM lock lost" errors.
  • Transient but severe I/O latency spikes across all local or vSAN storage devices.
  • ESXi host becoming intermittently unresponsive or disconnecting from vCenter.

Environment

ESXi (All) 

Hardware:
Dell PowerEdge R640/R660/MX740c
Cisco UCS M6/M7

Cause

If these messages appear in your host's log file, they indicate that hardware-level memory faults (Correctable Errors, CE) are triggering System Management Interrupts (SMIs). During hardware error correction and APEI page retirement, the CPU is momentarily stalled (micro-stun), causing the vmkernel to miss heartbeats and leading to cascading performance degradation.

Refer to KB 391088: ESXi host encountered PSOD with the error message "Platform firmware has reported a fatal error via APEI"

"APEI (ACPI Platform Error Interface) allows system firmware (BIOS/UEFI) to report hardware or platform errors to the operating system. When ESXi logs an APEI error, it generally indicates a hardware or firmware-related issue, typically involving memory, CPU, PCIe devices, or controllers. This is not an ESXi bug, but rather a notification of an underlying hardware/firmware condition."

Resolution

  1. Identify Faulty Hardware: Review iDRAC/iLO System Event Logs (SEL) or CIMC logs for the specific physical address (physAddr) reported in vmkernel.log.
  2. Verify Firmware: Update BIOS and Memory Controller firmware to the latest vendor-validated versions (e.g., Dell BIOS 2.24.0 or newer).
  3. Hardware Remediation: Contact the OEM (Dell/Cisco) for DIMM replacement if correctable errors persist after firmware updates.
  4. Workload Protection: Proactively vMotion critical workloads off the host if errors are frequent.
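To judge whether errors are "frequent" for step 4, it can help to count "Processing HEST GESB" entries per hour. The sketch below makes two assumptions that are not stated in this article: that each vmkernel.log line starts with an ISO 8601 timestamp (for example 2024-05-01T10:15:30.123Z), and that the threshold is a site-specific value you choose yourself. The function names are hypothetical. Lines without a recognizable timestamp are counted under "unknown".

```python
import re
from collections import Counter

# Assumption: log lines begin with an ISO 8601 timestamp, e.g.
# "2024-05-01T10:15:30.123Z cpu1:2345678)...". Group 1 is date plus hour.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}):\d{2}:\d{2}")
MARKER = "Processing HEST GESB"


def gesb_per_hour(lines):
    """Count marker lines per hour bucket such as '2024-05-01T10'."""
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        match = TS_RE.match(line)
        bucket = match.group(1) if match else "unknown"
        counts[bucket] += 1
    return counts


def hours_over_threshold(counts, threshold):
    """Return the sorted hour buckets with at least `threshold` entries.

    The threshold is a hypothetical, site-specific value; this article
    does not define what counts as frequent.
    """
    return sorted(
        hour for hour, n in counts.items()
        if hour != "unknown" and n >= threshold
    )
```

A possible use is to read the log with open("vmkernel.log", errors="replace"), pass the file object to gesb_per_hour, and review the hours returned by hours_over_threshold alongside the hardware event log timestamps from step 1.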