vCenter Server reports SystemBoard9 Memory or System Board 8 Memory messages under Health Status - Memory


Article ID: 343792


Updated On:

Products

VMware vCenter Server

Issue/Introduction

These memory messages indicate a hardware failure. This article provides more information on:
  • Uncorrectable memory failure
  • Correctable memory failures


Symptoms:
Under Health Status - Memory, vCenter Server reports messages similar to:

SystemBoard9 Memory: Uncorrectable ECC - Deassert

SystemBoard9 Memory: Correctable ECC logging limit reached - Deassert

System Board 8 Memory: Uncorrectable ECC, healthStatus 1, CurrentStatus Deassert, sensorType 12, key 60.0.32.1

System Board 8 Memory: Correctable ECC logging limit reached, healthStatus 1, CurrentStatus Deassert, sensorType 12, key 60.0.32.5
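
When many hosts report these messages, it can help to sort them into uncorrectable and correctable events before deciding what to do. The following is a minimal sketch, assuming the message text has already been exported or copied as plain strings in one of the two shapes shown above. The function name classify_memory_message and the returned dictionary keys are hypothetical and are not part of any VMware API.

```python
import re
from typing import Optional


def classify_memory_message(message: str) -> Optional[dict]:
    """Split a health-status message into sensor name, ECC class, and status.

    Returns None when the text is not a memory ECC message in one of the
    two shapes shown in this article (an assumption of this sketch).
    """
    # The sensor name precedes the first ": ", e.g. "System Board 8 Memory".
    sensor, sep, detail = message.partition(": ")
    if not sep or not sensor.endswith("Memory"):
        return None

    if detail.startswith("Uncorrectable ECC"):
        kind = "uncorrectable"
    elif detail.startswith("Correctable ECC"):
        kind = "correctable"
    else:
        return None

    # Short form ends with "- Deassert"; verbose form has "CurrentStatus Deassert".
    match = re.search(r"(?:CurrentStatus\s+|-\s+)(\w+)", detail)
    status = match.group(1) if match else None

    return {"sensor": sensor, "kind": kind, "status": status}


if __name__ == "__main__":
    samples = [
        "SystemBoard9 Memory: Uncorrectable ECC - Deassert",
        "System Board 8 Memory: Correctable ECC logging limit reached, "
        "healthStatus 1, CurrentStatus Deassert, sensorType 12, key 60.0.32.5",
    ]
    for text in samples:
        print(classify_memory_message(text))
```

The "kind" value indicates which part of the Resolution section applies to a given message.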


Environment

VMware vCenter Server 4.0.x
VMware vCenter Server 4.1.x
VMware vCenter Server 5.0.x
VMware vCenter Server 5.1.x
VMware vCenter Server 5.5.x
VMware vCenter Server 6.x

Resolution

Uncorrectable Memory Failure

Operational impact
The server restarts, with the affected DIMM disabled. The server can immediately return to production with the remaining memory. If the remaining memory is insufficient for production, replace the DIMM immediately or at the next maintenance opportunity.
Indications at the time of the failure
The system-error LED, MEM LED (in a server with a light path diagnostics panel), and the affected DIMM connector error LED are lit. An Uncorrectable ECC Error platform event is logged in the system-event log.
Possible root causes
Uncorrectable memory ECC error (data line), DIMM address parity error, damaged DIMM connector, damaged processor or socket.
Suggested corrective action

Replace the DIMM at the next maintenance opportunity. If the problem persists, follow the memory problem determination procedures to isolate a potentially failing part.

Correctable memory failures (Predictive Failure Analysis alert)

Operational impact

The server continues to operate, possibly with degraded performance, for example when a DIMM has a defective or open data line.
Indications at the time of the failure
The system-error LED, MEM LED (in a server with a light path diagnostics panel), and the affected DIMM connector error LED are lit. A Correctable ECC Error Rate Exceeded platform event is logged in the system-event log.
Possible root causes
The most likely root cause is a failing DIMM. A less likely cause is spurious noise from power rail regulation or another physical anomaly.
Suggested corrective action
Check your hardware vendor's website for firmware updates and RETAIN tips that pertain to memory Predictive Failure Analysis alerts. Replace the DIMM at the next maintenance opportunity, because the DIMM may be failing and could cause unscheduled downtime. Follow the memory problem determination procedures to isolate a potential failure.

For more information and a procedure to run a full hardware diagnostic to locate the issue, contact your hardware vendor.


Additional Information

vCenter Server が健全性ステータス - メモリの状態で SystemBoard9 Memory メッセージを報告する
vCenter Server 在健康状况 - 内存下报告 SystemBoard9 Memory 或 System Board 8 Memory 消息