The Malware Analysis (MA) appliance became completely inaccessible.
After the appliance was recovered with a power cycle, the collected logs showed the following:
1. No logs were captured from the time the appliance was reported inaccessible:
Sep 6 20:39:01 mag2 CRON[29349]: (root) CMD ( [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -depth -mindepth 1 -maxdepth 1 -type f -cmin +$(/usr/lib/php5/maxlifetime) ! -execdir fuser -s {} 2>/dev/null \; -delete)
Sep 6 20:40:01 mag2 CRON[29454]: (root) CMD (/opt/fpm/bin/i2c_fpd.sh)
---no logs--
2. After the power cycle, syslog resumed capturing logs, which show some of the processes restarting:
Sep 7 12:17:11 mag2 mdadm[2219]: DeviceDisappeared event detected on md device /dev/md/0
Sep 7 12:17:11 mag2 mdadm[2219]: NewArray event detected on md device /dev/md127
This issue occurred when the MA appliance completed its mdadm RAID array check, which the autocheck cron job runs on the first Sunday of every month at 00:57. The check reported the disk health state, but the appliance then froze. As a result, the customer had no console log output, no access to the WebUI over HTTPS, no terminal access via SSH, and could not ping the appliance. During the event, syslog did not capture any output, and the only way to recover from this state is a cold reboot (power cycle) of the appliance.
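The "first Sunday of the month" condition above can be checked for any date with a small helper. This is only a sketch: the function name `is_first_sunday` is hypothetical, the dates in the usage note are arbitrary examples, and it assumes GNU `date` (for the `-d` option). It is not taken from the appliance's own cron job.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the given date (YYYY-MM-DD)
# is the first Sunday of its month. Assumes GNU date for the -d option.
is_first_sunday() {
    dow=$(date -d "$1" +%u) || return 2   # day of week: 1 = Monday ... 7 = Sunday
    dom=$(date -d "$1" +%d) || return 2   # day of month: 01..31
    dom=${dom#0}                          # strip a leading zero before comparing
    # The first Sunday always falls on day 1..7 of the month.
    [ "$dow" -eq 7 ] && [ "$dom" -le 7 ]
}
```

For example, `is_first_sunday 2024-09-01 && echo "autocheck day"` prints the message, while the same call with `2024-09-08` (the second Sunday) prints nothing.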
According to engineering, this issue is caused by a bug in the RAID1 kernel module running on the MA appliance, which causes a disk I/O deadlock in certain situations.
Workaround (disable the mdadm autocheck):
[email protected]:~$ sudo sed -i 's/^AUTOCHECK=.*/AUTOCHECK=false/' /etc/default/mdadm
* This changes the AUTOCHECK setting to false.
[email protected](none):/etc/default# cat mdadm
# mdadm Debian configuration
#
# You can run 'dpkg-reconfigure mdadm' to modify the values in this file, if
# you want. You can also change the values here and changes will be preserved.
# Do note that only the values are preserved; the rest of the file is
# rewritten.
#
# AUTOCHECK:
# should mdadm run periodic redundancy checks over your arrays? See
# /etc/cron.d/mdadm.
AUTOCHECK=false    <---- value after the change
<truncated for brevity>
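The change and the verification step above can be combined into one helper that also keeps a backup of the file. This is a minimal sketch, not a vendor-supplied procedure: the function name `set_autocheck` and the `.bak` backup suffix are assumptions, and it relies on GNU `sed -i`, as the command above does.

```shell
#!/bin/sh
# Hypothetical helper: set AUTOCHECK in an mdadm defaults file, keeping a backup.
# Usage: set_autocheck FILE true|false
set_autocheck() {
    file=$1
    value=$2
    case "$value" in
        true|false) ;;
        *) echo "value must be true or false" >&2; return 2 ;;
    esac
    # Stop if there is no AUTOCHECK line; sed would otherwise change nothing silently.
    grep -q '^AUTOCHECK=' "$file" || { echo "no AUTOCHECK line in $file" >&2; return 1; }
    cp -p "$file" "$file.bak" || return 1    # backup copy (the .bak name is an assumption)
    sed -i "s/^AUTOCHECK=.*/AUTOCHECK=$value/" "$file" || return 1
    grep '^AUTOCHECK=' "$file"               # print the resulting line for verification
}
```

Run as root, `set_autocheck /etc/default/mdadm false` applies the workaround and prints the resulting `AUTOCHECK` line. Passing `true` instead reverts it.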
==============================
Rollback/Revert the changes:
[email protected]:~$ sudo sed -i 's/^AUTOCHECK=.*/AUTOCHECK=true/' /etc/default/mdadm
* This changes the AUTOCHECK setting back to true.
[email protected](none):/etc/default# cat mdadm
# mdadm Debian configuration
#
# You can run 'dpkg-reconfigure mdadm' to modify the values in this file, if
# you want. You can also change the values here and changes will be preserved.
# Do note that only the values are preserved; the rest of the file is
# rewritten.
#
# AUTOCHECK:
# should mdadm run periodic redundancy checks over your arrays? See
# /etc/cron.d/mdadm.
AUTOCHECK=true    <---- value after the rollback
<truncated for brevity>