Error: "watchdog: BUG: soft lockup - CPU## stuck for #####!"

Article ID: 397009

Updated On:

Products

VMware Telco Cloud Automation, VMware Telco Cloud Platform, VMware Tanzu Kubernetes Grid, VMware vSphere ESXi

Issue/Introduction

  • The VM is completely unresponsive. It cannot be pinged or reached via SSH.
  • The VM displays errors similar to the following on the console:
    watchdog: BUG: soft lockup - CPU#4 stuck for 143406s! [node_exporter:####]
    watchdog: BUG: soft lockup - CPU#1 stuck for 143405s! [kworker/1:1:####] 
    watchdog: BUG: soft lockup - CPU#4 stuck for 143432s! [node_exporter:####] 
    watchdog: BUG: soft lockup - CPU#0 stuck for 143086s! [runc:#######] 
    watchdog: BUG: soft lockup - CPU#1 stuck for 143431s! [containerd-shim:#######]
    watchdog: BUG: soft lockup - CPU#1 stuck for 143431s! [kthreadd:#######]

Environment

TCP 5.0
TCA 3.2
TKG 2.5.2
ESXi 8.x

Cause

  • This is a deadlock: a CPU acquires the sd->input_pkt_queue.lock spinlock and then, in interrupt context, tries to acquire the same lock again.
  • Ideally, interrupts should be disabled while the spinlock is held, which is not happening in this case.
  • Refer to the committed change for further information: GitHub - Fix deadlock while reading /proc/net/softnet_stat
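
The pattern described above can be illustrated with a small user-space analogy. This is only a sketch and not the kernel fix itself: it assumes a Unix system running CPython, uses a POSIX signal as a stand-in for an interrupt and a threading.Lock as a stand-in for the spinlock, and the names queue_lock, irq_handler and read_stats are hypothetical. The handler uses a non-blocking acquire so that the demo reports the would-be deadlock instead of hanging.

```python
import signal
import threading

# Hypothetical stand-in for the kernel's sd->input_pkt_queue.lock.
queue_lock = threading.Lock()
events = []  # what the simulated interrupt handler observed


def irq_handler(signum, frame):
    """Simulated interrupt-context code that needs the same lock."""
    # A real spinlock would spin forever here (the soft lockup). A
    # non-blocking attempt lets the demo report that instead of hanging.
    if queue_lock.acquire(blocking=False):
        events.append("acquired")
        queue_lock.release()
    else:
        events.append("would deadlock")


def read_stats(mask_interrupts):
    """Hold the lock while a simulated interrupt arrives on the same thread."""
    events.clear()
    signal.signal(signal.SIGUSR1, irq_handler)
    if mask_interrupts:
        # Rough analogue of spin_lock_irqsave(): defer the interrupt.
        signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    with queue_lock:
        # The "interrupt" fires in the middle of the critical section.
        signal.raise_signal(signal.SIGUSR1)
    if mask_interrupts:
        # Rough analogue of spin_unlock_irqrestore(): the deferred
        # interrupt is delivered now that the lock has been released.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
    return list(events)


if __name__ == "__main__":
    print(read_stats(mask_interrupts=False))
    print(read_stats(mask_interrupts=True))
```

With interrupts left enabled, the handler runs while the lock is still held and finds it unavailable, which is where a real spinlock would spin indefinitely. With the signal masked for the duration of the critical section, the handler is deferred until the lock has been released and acquires it normally.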

Resolution

 

Additional Information