Alert received stating Controller encountered a fatal error and was reset

Article ID: 171916

Products

Security Analytics

Issue/Introduction

Run the following command as root:

megacli -fwtermlog -dsply -a0 -nolog > fwterm.out

Then attach the resulting fwterm.out file to the case.


[root@localhost ~]# grep reset /var/log/messages | grep fatal

Jan  1 00:00:01 localhost disk_subsystem[19706]: snlog: sn="x.x.x.x" id="DS" m="23" c="6" event="DISK_STATUS" category="HARDWARE" ip="x.x.x.x" model="R620" msg="Adapter 0; seqNum: 0x0005be9b; Time: Sat Jan  1 00:00:02 2000; Event Description: Controller encountered a fatal error and was reset; "
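To see how often the controller has reset, the matching events can be counted rather than read one by one. The following is a minimal sketch, assuming the syslog path /var/log/messages used above; count_fatal_resets is a hypothetical helper name and not part of Security Analytics.

```shell
# Count "fatal error and was reset" controller events in a syslog file.
# count_fatal_resets is a hypothetical helper, not a Security Analytics command.
count_fatal_resets() {
    # Default to the syslog path used in this article if no file is given.
    log_file="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'Controller encountered a fatal error and was reset' "$log_file"
}

# Example usage (as root on the sensor):
#   count_fatal_resets /var/log/messages
```

A count that keeps growing over time suggests the resets are recurring rather than a one-off event.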

The RAID controller firmware log was also captured by running the following as root: megacli -fwtermlog -dsply -a0 -nolog

The fwterm log showed:

T1: EVT#376474-T1:   1=Firmware version 3.130.05-2086
01/01/00  0:00:02: EVT#376475-01/01/00  0:00:02: 345=Controller encountered a fatal error and was reset
01/01/00  0:00:02: Initializing the Temperature Monitor
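The firmware version and the reset events can be pulled out of a saved fwterm.out file instead of scanning it by hand. This is a rough sketch that assumes the log lines have the same shape as the excerpt above; the function names are hypothetical.

```shell
# Helpers for a saved fwterm.out file. Both function names are hypothetical.

# Print the most recent firmware version reported in the log.
fwterm_firmware_version() {
    # Capture the token that follows "Firmware version " on each matching line,
    # then keep only the last one.
    sed -n 's/.*Firmware version \([^ ]*\).*/\1/p' "$1" | tail -n 1
}

# Print every fatal-reset event line found in the log.
fwterm_fatal_resets() {
    grep 'Controller encountered a fatal error and was reset' "$1"
}

# Example usage:
#   fwterm_firmware_version fwterm.out
#   fwterm_fatal_resets fwterm.out
```

The reported version can then be compared against the latest release listed on the Dell support site for the controller.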

A review with Dell found that the RAID controller firmware version is significantly out of date.

Cause

The RAID controller firmware is out of date.

Resolution

Dell recommends upgrading the firmware for the internal H710 RAID controller. The update takes 5-10 minutes to install and is followed by a reboot that takes about 10 minutes.

Download the PERC RAID controller firmware for your specific system. Look for the Red Hat Linux Update Package (.BIN). For example:

H710: http://www.dell.com/support/home/us/en/04/drivers/driversdetails?driverid=kkr9j

H730: https://www.dell.com/support/home/us/en/04/drivers/driversdetails?driverid=f675y

Save the Red Hat Linux Update Package (.BIN) file to /home on the sensor.

To update the firmware, do the following:

  • Log in as root.
  • Shut down Security Analytics: scotus stop
  • Make the firmware package executable:
    • chmod 755 /home/SAS-RAID_Firmware_XXXXX_LN_VERSION_AXX.BIN
  • Run the package to update the firmware:
    • /home/SAS-RAID_Firmware_XXXXX_LN_VERSION_AXX.BIN

This will take a few minutes. The installer verifies that the firmware is the correct one for the hardware and displays extensive license details. Once it finishes, it reboots the system.

Security Analytics should come online after the reboot.
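The update steps above can be sketched as a small script. This is an illustrative outline only: update_raid_firmware, run_step, and the DRY_RUN switch are hypothetical names introduced here, and the package file name is the placeholder from this article, to be replaced with the file actually downloaded. With DRY_RUN left at its default of "yes", the script only prints the commands it would run.

```shell
# Sketch of the firmware update steps; intended to be run as root on the sensor.
# update_raid_firmware, run_step and DRY_RUN are hypothetical names for illustration.

# Safety switch: "yes" prints each command instead of running it.
DRY_RUN="${DRY_RUN:-yes}"

run_step() {
    if [ "$DRY_RUN" = "yes" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

update_raid_firmware() {
    firmware_bin="$1"

    # Refuse to continue if the downloaded package is missing.
    if [ ! -f "$firmware_bin" ]; then
        echo "Firmware package not found: $firmware_bin" >&2
        return 1
    fi

    run_step scotus stop || return 1               # shut down Security Analytics
    run_step chmod 755 "$firmware_bin" || return 1 # make the package executable
    run_step "$firmware_bin"                       # run the update; the system reboots afterwards
}

# Example usage (placeholder file name from this article):
#   update_raid_firmware /home/SAS-RAID_Firmware_XXXXX_LN_VERSION_AXX.BIN
#   DRY_RUN=no must be set before calling the function to perform the update for real.
```

Checking for the package before stopping Security Analytics avoids taking the sensor offline when the file was saved to a different path.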