vSAN host crashed when trying to put it in maintenance mode
search cancel

vSAN host crashed when trying to put it in maintenance mode

book

Article ID: 417330

calendar_today

Updated On:

Products

VMware vSAN

Issue/Introduction

Symptoms:

 

  • High vSAN backend latency was observed on the ESXi host by navigating to Host > Monitor > vSAN > Performance > Backend.

  • An alert indicating excessive CPU readiness was observed at the ESXi host level.
  • The host subsequently crashed when an attempt was made to place it into maintenance mode

     

Environment

VMware vSAN 8.x

Cause

A hardware-initiated reboot or crash can cause the host to reboot abruptly. This behavior occurs regardless of whether the host is placed in maintenance mode or not.

 

  • The host abrupt reboot event due to a hardware fault encountered:

2025-09-22T13:20:34.527Z In(14) vobd[2098147]:  [GenericCorrelator] 178993165us: [vob.user.host.poweroff.reason.unavailable] The host is being powered off. The poweroff was not the result of a kernel error, deliberate reboot, or shut down. This could indicate a hardware issue. Hardware may reboot abruptly due to power outages, faulty components, and heating issues. To investigate further, engage the hardware vendor.

 
  • iDRAC logs Indicates a CPU reset — likely triggered automatically due to a hardware fault.The fatal I/O error at bus 0 device 0 function 0 and CPU internal error indicate a hardware-level fault.

 2068 | 2025-09-22T13:12:15+00:00 | Lclog/OK(IDRAC.2.9.SYS1003) System CPU Resetting.
     2 | 2025-09-22T13:16:30+00:00 | Sel/OK(72a20200) A problem was detected related to the previous server boot.
     3 | 2025-09-22T13:16:30+00:00 | Sel/Critical(6fa90000) A fatal IO error detected on a component at bus 0 device 0 function 0.
     4 | 2025-09-22T13:16:30+00:00 | Sel/OK(7ec8f8b1) An OEM diagnostic event occurred.
  2069 | 2025-09-22T13:16:30+00:00 | Lclog/OK(IDRAC.2.9.LOG007) The previous log entry was repeated 1 times.
  2070 | 2025-09-22T13:16:30+00:00 | Lclog/OK(IDRAC.2.9.CPU0000) Internal error has occurred check for additional logs.
  2071 | 2025-09-22T13:16:30+00:00 | Lclog/OK(IDRAC.2.9.PST0090) A problem was detected related to the previous server boot.
     5 | 2025-09-22T13:16:31+00:00 | Sel/OK(7e010100) An OEM diagnostic event occurred.
     6 | 2025-09-22T13:16:31+00:00 | Sel/OK(7e000000) An OEM diagnostic event occurred.
  2072 | 2025-09-22T13:16:31+00:00 | Lclog/Critical(IDRAC.2.9.PCI2000) A fatal IO error detected on a component at bus 0 device 0 function 0.
  2073 | 2025-09-22T13:16:32+00:00 | Lclog/OK(IDRAC.2.9.CPU9000) An OEM diagnostic event occurred.
  2074 | 2025-09-22T13:16:33+00:00 | Lclog/OK(IDRAC.2.9.CPU9000) An OEM diagnostic event occurred.
  2075 | 2025-09-22T13:16:33+00:00 | Lclog/OK(IDRAC.2.9.CPU9000) An OEM diagnostic event occurred.

Resolution

Engage the hardware vendor to run detailed diagnostics via the server’s management interface (e.g., iDRAC/iLO) to check for CPU faults and proceed with repair or replacement if required.