ESXi host is unresponsive due to non-recoverable hardware processor transition
search cancel

ESXi host is unresponsive due to non-recoverable hardware processor transition

book

Article ID: 427078

calendar_today

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

An ESXi host intermittently or permanently enters a "Not Responding" state in VMware Aria Operations or vCenter. Upon investigation of the hardware logs, the host reports fatal processor errors.

The following error messages are observed in the System Event Logs (SEL):

   Record Id: xx
   When: 2026-01-27T20:02:01
   Event Type: 7 (Fatal/NonRecoverable)
   SEL Type: 2 (System Event)
   Message: Assert + Processor Transition to Non-recoverable
   Sensor Number: 13
   Raw: 


   Record Id: xx
   When: 2026-01-27T20:02:02
   Event Type: 7 (Fatal/NonRecoverable)
   SEL Type: 2 (System Event)
   Message: Assert + Processor Transition to Non-recoverable
   Sensor Number: 13
   Raw: 

Environment

 VMware ESXi 7.x

 VMware ESXi 8.x

Cause

The issue is caused by a physical hardware failure of **Processor 2**. Specifically, the processor triggered a thermal or internal fault, resulting in a transition to a "Non-recoverable" state. This state prevents the CPU from processing instructions, leading to the host becoming unresponsive to the hypervisor and management layers.

Resolution

To resolve this issue, you must address the physical hardware failure:

1. **Identify the Faulty Component:** Review the System Event Logs (SEL) or the Hardware Status tab in vCenter to confirm which processor is reporting the fatal error (in this case, Processor 2).
2. **Contact Hardware Vendor:** Engage your server hardware manufacturer's support team to initiate a replacement for the failed CPU.
3. **Inspect Cooling:** You should have the hardware vendor verify the integrity of the heatsink and cooling fans associated with that processor socket to ensure no external factors caused the thermal event.
4. **Validate Repair:** Once the hardware is replaced, clear the hardware logs, reboot the host, and monitor for any new "Assert" events in the SEL.