An S500 may be found unexpectedly down and not responding. It may come up for a short time and then become unresponsive again within a few minutes. Connecting a USB serial cable to the console lets you view the console output, which may include messages like these:
2022-06-06T02:51:02-04:00 hostname_here kernel: : [16000412.476274] mce: [Hardware Error]: Machine check events logged
2022-06-06T02:51:04-04:00 hostname_here kernel: : [16000414.726654] EDAC MC1: 8 CE memory read error on CPU_SrcID#1_Ha#0_Chan#2_DIMM#1 (channel:2 slot:1 page:0x348efd9 offset:0xb80 grain:32 syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0093 socket:1 ha:0 channel_mask:4 rank:5)
The keywords in these lines are Hardware, memory, and DIMM.
This does not apply to Dell hardware.
Security Analytics running on the S500 hardware.
The memory DIMMs are failing. When the application accesses a failed bank, it crashes. After the system is powered up again, it will crash again at some point.
To determine whether this is the problem you are seeing, look for the keyword DIMM in /var/log/messages. For example, as root, run "grep DIMM /var/log/messages". For the best chance of collecting a CSR before the system shuts down again, first stop the Security Analytics application by running "scotus stop" as root. Then collect the CSR from the command line by running csr.sh. The CSR is stored in /home/csr. Copy the .bz2 file to your desktop and attach it to your support case.
Sample messages from /var/log/messages:
2022-06-06T02:51:02-04:00 hostname kernel: : [16000412.476274] mce: [Hardware Error]: Machine check events logged
2022-06-06T02:51:04-04:00 hostname kernel: : [16000414.726654] EDAC MC1: 8 CE memory read error on CPU_SrcID#1_Ha#0_Chan#2_DIMM#1 (channel:2 slot:1 page:0x348efd9 offset:0xb80 grain:32 syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0093 socket:1 ha:0 channel_mask:4 rank:5)
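When several DIMMs are logging errors, it can help to see which label appears most often. The following is a minimal sketch, not a vendor-supplied tool: the function name dimm_error_summary is hypothetical, and the pattern assumes the DIMM label always has the same shape as in the sample messages (CPU_SrcID#N_Ha#N_Chan#N_DIMM#N). It counts matching log lines per label, not the individual error count reported inside each line.

```shell
#!/bin/sh
# Hypothetical helper: count EDAC memory-error log lines per DIMM label.
# Usage: dimm_error_summary [logfile]   (defaults to /var/log/messages)
dimm_error_summary() {
    logfile="${1:-/var/log/messages}"
    # Keep only EDAC memory-controller lines, extract the DIMM label
    # (assumed format: CPU_SrcID#1_Ha#0_Chan#2_DIMM#1), then count how
    # many lines mention each label, most frequent first.
    grep 'EDAC MC' "$logfile" \
        | grep -o 'CPU_SrcID#[0-9]*_Ha#[0-9]*_Chan#[0-9]*_DIMM#[0-9]*' \
        | sort | uniq -c | sort -rn
}
```

After defining the function in a root shell, run "dimm_error_summary /var/log/messages". Each output line shows a count followed by a DIMM label; the labels identify the socket, channel, and slot to report in your support case.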
If any DIMMs have failed, they must be replaced. To replace the failed DIMMs, power down the system if it is not already down. Disconnect all data and power cables at the rear of the appliance, labeling them so that you know where to reconnect each one when the system is ready to power up. Slide the chassis forward, then follow the instructions below.
To access the DIMMs:
Reverse the process to replace the covers. There are two plastic tabs in the side rails which need to be pulled forward to release the rail locks.