When this issue occurs, you may see one or more of the following symptoms:
ESX 8.x
An ESX host that becomes unresponsive and stops logging is usually showing signs of an underlying hardware issue.
You may see entries in the IPMI logs similar to the following around the time of the stoppage:
Record:577:
   Record Id: 577
   When: ####-##-##T##:##:##
   Event Type: 126 (Unknown)
   SEL Type: 2 (System Event)
   Message:
   Sensor Number: 146
   Raw:
   Formatted-Raw: 41 02 02 83 f7 00 69 20 00 04 0c 92 7e 20 03 34
In the example above, sensor number 146 corresponds to a memory DIMM:
Node-Sensor   Description
0.146         Memory Module 27 DDR5_P2_F1_ECC
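The sensor number can also be read directly from the Formatted-Raw bytes, which is useful when a record's Message field is empty. The following is a minimal Python sketch, assuming the 16 bytes follow the standard IPMI System Event Log record layout (record ID, record type, timestamp, generator ID, event message revision, sensor type, sensor number, event direction/type, three event data bytes). The function name decode_sel_record and the dictionary keys are illustrative only and are not part of any ESX or vendor tool.

```python
import struct


def decode_sel_record(formatted_raw: str) -> dict:
    """Decode a 16-byte IPMI SEL system event record given as hex text.

    Hypothetical helper: assumes the standard IPMI SEL layout with
    little-endian multi-byte fields.
    """
    data = bytes.fromhex(formatted_raw)  # whitespace between bytes is ignored
    if len(data) != 16:
        raise ValueError(f"expected 16 bytes, got {len(data)}")

    # <  little-endian, no padding
    # H  record ID        B  record type      I  timestamp
    # H  generator ID     B  EvM revision     B  sensor type
    # B  sensor number    B  event direction/type
    (record_id, record_type, timestamp, generator_id,
     evm_rev, sensor_type, sensor_number, dir_type) = struct.unpack_from(
        "<HBIHBBBB", data
    )

    return {
        "record_id": record_id,
        "record_type": record_type,
        "timestamp": timestamp,               # seconds, as stored by the BMC
        "generator_id": generator_id,
        "evm_rev": evm_rev,
        "sensor_type": sensor_type,           # 0x0C is the IPMI "Memory" sensor type
        "sensor_number": sensor_number,
        "assertion": not (dir_type & 0x80),   # bit 7 clear means assertion event
        "event_type": dir_type & 0x7F,        # bits 6:0
        "event_data": data[13:16].hex(" "),
    }


rec = decode_sel_record("41 02 02 83 f7 00 69 20 00 04 0c 92 7e 20 03 34")
print(rec["record_id"], rec["sensor_number"], rec["event_type"])  # 577 146 126
```

For the record shown above, the decoded record ID (577), sensor number (146), and event type (126) match the formatted fields, and the sensor type byte 0x0C is consistent with the sensor mapping to a memory module. Treat the decoded values as a cross-check only; the hardware vendor's interpretation of the event data bytes is authoritative.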
Open a case with both Broadcom Support and the hardware vendor for a full review of the issue.