An ESXi host reboots abruptly, which subsequently triggers a vSphere HA failover where virtual machines are shut down and restarted on other hosts in the cluster.
VMware vSphere ESXi 7.X
VMware vSphere ESXi 8.X
The output of the command "esxcli hardware ipmi sel list" shows Fatal/Non-Recoverable error before the reboot event. This command displays the IPMI System Event Log (SEL) on an ESXi host, which contains records of system events useful for troubleshooting from Hardware perspective.
**esxcli hardware ipmi sel list**Record:## Record Id: ## When: YYYY-MM-DD Event Type: 7 (Fatal/NonRecoverable) SEL Type: 2 (System Event) Message: Assert + Chassis Transition to Non-recoverable from less severe Sensor Number: ##
Engage the Hardware vendor for checking the physical server, based on the Fatal/NonRecoverable event sent to ESXi from the IPMI controller. Please note that the IPMI information will also appear in the out-of-band management tool (iLO, iDrac, etc..) for the host.
The Intelligent Platform Management Interface (IPMI) defines standards on how monitoring and control of system subsystems. These standards are also used for monitoring elements such as temperatures, voltages, fans, bus errors, memory, and so on. This system provides a variety of alarm mechanisms when a system exceeds its tolerance levels.
For example, an error for a processor might be displayed actively but only while the error is active. The point of the logging mechanism is to determine if an error occurred in the past which can indicate that the host is still experiencing fault conditions and might not be reporting these faults.
This generally warrants more detailed investigation with the hardware vendor.