ESXi Host Abruptly Reboots from Fatal Error in IPMI System Event Log

Article ID: 414230

Products

VMware vSphere ESXi

Issue/Introduction

An ESXi host reboots abruptly. The reboot triggers a vSphere HA failover: the virtual machines that were running on the host go down with it and are restarted on other hosts in the cluster.

Environment

VMware vSphere ESXi 7.X

VMware vSphere ESXi 8.X

Cause

The output of the command "esxcli hardware ipmi sel list" shows a Fatal/NonRecoverable event logged before the reboot. This command displays the IPMI System Event Log (SEL) on an ESXi host, which contains records of system events that are useful for troubleshooting from a hardware perspective.

**esxcli hardware ipmi sel list**
Record:##
   Record Id: ##
   When: YYYY-MM-DD
   Event Type: 7 (Fatal/NonRecoverable)
   SEL Type: 2 (System Event)
   Message: Assert + Chassis Transition to Non-recoverable from less severe
   Sensor Number: ##
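
On a host with a long SEL, it can help to filter the saved command output for Fatal/NonRecoverable entries. The following is a minimal Python sketch, assuming the output follows the field layout shown above (a "Record:" line starting each entry, followed by "Name: value" lines). The function names `parse_sel` and `fatal_records` are hypothetical, and the `SAMPLE` text contains invented placeholder values, not real host output.

```python
# Sketch only: assumes the "Record:" / "Name: value" layout shown above.
# SAMPLE holds invented placeholder values, not output from a real host.
SAMPLE = """\
Record:1
   Record Id: 1
   When: 2024-01-01T10:00:00
   Event Type: 1 (Other)
   SEL Type: 2 (System Event)
   Message: Placeholder non-fatal event
   Sensor Number: 12
Record:2
   Record Id: 2
   When: 2024-01-01T10:05:00
   Event Type: 7 (Fatal/NonRecoverable)
   SEL Type: 2 (System Event)
   Message: Assert + Chassis Transition to Non-recoverable from less severe
   Sensor Number: 34
"""


def parse_sel(text):
    """Split SEL listing text into one dict of fields per record."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue  # skip blank lines and anything that is not a field
        # Split on the first colon only, so timestamps keep their colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Record":
            # A "Record:##" line starts a new entry.
            current = {"Record": value}
            records.append(current)
        elif current is not None:
            current[key] = value
    return records


def fatal_records(records):
    """Keep records whose Event Type is labelled Fatal/NonRecoverable."""
    return [r for r in records
            if "Fatal/NonRecoverable" in r.get("Event Type", "")]


if __name__ == "__main__":
    for rec in fatal_records(parse_sel(SAMPLE)):
        print(rec["Record Id"], rec["When"], rec["Message"])
```

To use it against a real host, save the output of "esxcli hardware ipmi sel list" to a file and pass the file contents to `parse_sel` in place of `SAMPLE`. If the actual output uses different field names or labels, adjust the matching accordingly.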

Resolution

Engage the hardware vendor to check the physical server, based on the Fatal/NonRecoverable event that the IPMI controller reported to ESXi. Note that the same IPMI information also appears in the host's out-of-band management tool (iLO, iDRAC, etc.).

The Intelligent Platform Management Interface (IPMI) defines standards for monitoring and controlling a system's subsystems. These standards cover elements such as temperatures, voltages, fans, bus errors, and memory. IPMI also provides a variety of alarm mechanisms that trigger when a component exceeds its tolerance levels.

For example, a live sensor reading might show a processor error only while the error is active. The purpose of the event log is to show whether an error occurred in the past, which can indicate that the host is still experiencing fault conditions even though it is not currently reporting them.

This generally warrants more detailed investigation with the hardware vendor.
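
Because the log keeps past events, one way to use it is to look at what was recorded in the period leading up to the reboot. The sketch below assumes the SEL entries have already been read into Python dicts with a `when` field holding a `datetime`. The exact timestamp format of the "When:" field is not shown in this article, so parsing it is left out. The function name `events_before`, the dict keys, and the one-hour default window are all hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def events_before(events, reboot_time, window=timedelta(hours=1)):
    """Return events logged within `window` before `reboot_time`.

    `events` is a list of dicts with a "when" datetime (assumed to be
    parsed already); events after the reboot are excluded.
    """
    start = reboot_time - window
    return [e for e in events if start <= e["when"] <= reboot_time]


if __name__ == "__main__":
    # Invented example data, not taken from a real host.
    reboot = datetime(2024, 1, 1, 10, 10)
    events = [
        {"when": datetime(2024, 1, 1, 8, 0), "message": "older event"},
        {"when": datetime(2024, 1, 1, 10, 5), "message": "event shortly before reboot"},
        {"when": datetime(2024, 1, 1, 10, 30), "message": "event after reboot"},
    ]
    for event in events_before(events, reboot):
        print(event["when"].isoformat(), event["message"])
```

The reboot time itself has to come from another source, such as the host's logs or the vSphere HA event, and the SEL clock may not match the ESXi clock exactly, so a generous window is a reasonable starting point.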

Additional Information

Unexpected ESXi Reboot or Shutdown