"BUG: soft lockup - CPU#<id> stuck for 10s!" message seen in the logs, APM crashed and restarted

Article ID: 168763

Products

XOS APM

Issue/Introduction

A VAP member rebooted and produced crash output containing a line similar to the following:

kernel: BUG: soft lockup - CPU#5 stuck for 10s! [fwk10_0:17177] 

Cause

The reboot and the error message occur when a prolonged soft lockup is detected. The root cause of the soft lockup itself must be investigated separately, using the crash output.
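
As a starting point for that investigation, the following sketch pulls the CPU number, lockup duration, task name, and PID out of soft lockup lines in a log. It assumes the message format shown in the example above; the function name summarize_soft_lockups is hypothetical and is not an XOS command.

```shell
#!/bin/sh
# Sketch only: summarize "BUG: soft lockup" lines from log files or stdin.
# Assumes the line format shown in this article, for example:
#   kernel: BUG: soft lockup - CPU#5 stuck for 10s! [fwk10_0:17177]
# summarize_soft_lockups is a hypothetical helper name.
summarize_soft_lockups() {
    # -n suppresses default output; the trailing p prints only matching lines.
    sed -nE 's/.*BUG: soft lockup - CPU#([0-9]+) stuck for ([0-9]+)s! \[([^]:]+):([0-9]+)].*/cpu=\1 seconds=\2 task=\3 pid=\4/p' "$@"
}
```

Pass one or more log files as arguments, or pipe log text into the function. The task name and PID indicate which process was running on the stuck CPU, which helps narrow down where to look in the rest of the crash output.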

Resolution

In XOS 9.6.8+, XOS 9.7.3+, and XOS 10.0.0+, XOS can detect a prolonged CPU soft lockup condition and trigger an automatic restart of the APM.

From the Release Notes: 

ID 102504 - Enhanced XOS to detect a prolonged CPU soft lockup condition, and automatically cause the APM to restart, (activating the XOS redundancy features) when the lockup condition is recognized (Red Hat ID 445422).


To disable this functionality, do one of the following:
  1. Run "echo 0 > /proc/sys/kernel/softlockup_panic". This change does not survive a VAP member reboot.
  2. Set kernel.softlockup_panic = 0 in sysctl.conf on the VAP member. This change is permanent and survives a reboot.
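
The permanent option can be sketched as a small helper that updates an existing kernel.softlockup_panic line or appends one if it is missing. This is an illustration only: the function name set_softlockup_panic is hypothetical, and the /etc/sysctl.conf path in the usage comment is the usual Linux location, assumed here rather than confirmed for a VAP member.

```shell
#!/bin/sh
# Sketch only: set kernel.softlockup_panic in a sysctl-style config file.
# set_softlockup_panic is a hypothetical helper name, not an XOS command.
set_softlockup_panic() {
    conf_file="$1"   # path to the sysctl.conf file to edit
    value="$2"       # 0 = do not panic on soft lockup, 1 = panic
    touch "$conf_file" || return 1
    if grep -Eq '^[[:space:]]*kernel\.softlockup_panic[[:space:]]*=' "$conf_file"; then
        # The key already exists: rewrite that line and keep all others as-is.
        tmp_file="${conf_file}.tmp.$$"
        sed -E "s/^[[:space:]]*kernel\.softlockup_panic[[:space:]]*=.*/kernel.softlockup_panic = ${value}/" \
            "$conf_file" > "$tmp_file" && cat "$tmp_file" > "$conf_file"
        rm -f "$tmp_file"
    else
        # The key is missing: append it.
        printf 'kernel.softlockup_panic = %s\n' "$value" >> "$conf_file"
    fi
}

# Example usage, as root (file path is an assumption):
#   cat /proc/sys/kernel/softlockup_panic      # show the current runtime value
#   set_softlockup_panic /etc/sysctl.conf 0    # persist the setting
```

Editing the file alone does not change the running value until the settings are reloaded or the member reboots, so option 1 can be used alongside it to apply the change immediately.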

Workaround

N/A