NSX Manager node(s) become unresponsive and the VM console shows errors like "BUG: soft lockup - CPU#<##> stuck for <##>s!..."

Article ID: 399356

Products

VMware NSX

Issue/Introduction

  • Rebooting the affected NSX Manager VM does not resolve the issue.
  • In the NSX Manager UI on a working node, the cluster is shown in a degraded state under System/Appliances.
  • The VM console of the affected VM shows messages such as "BUG: soft lockup - CPU#<##> stuck for <##>s!...".
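To confirm the symptom from saved console or kernel log text, the message can be matched with a regular expression. The following is a minimal sketch, assuming the text has already been copied out of the VM console into a string; the function name and the sample lines are hypothetical and exist only for illustration.

```python
import re

# Matches the soft lockup message quoted in this article, e.g.
# "BUG: soft lockup - CPU#3 stuck for 22s! [java:1234]"
SOFT_LOCKUP_RE = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!")


def find_soft_lockups(console_text):
    """Return a list of (cpu_number, stuck_seconds) tuples found in the text."""
    return [(int(cpu), int(secs))
            for cpu, secs in SOFT_LOCKUP_RE.findall(console_text)]


if __name__ == "__main__":
    # Invented sample text, not captured from a real NSX Manager.
    sample = (
        "[ 1234.5] watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [java:1234]\n"
        "[ 1262.5] watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [java:1234]\n"
    )
    for cpu, secs in find_soft_lockups(sample):
        print(f"CPU {cpu} stuck for {secs}s")
```

Repeated matches across several CPUs are consistent with the condition described here, but the match alone does not identify the cause.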

Environment

  • VMware NSX 4.x
  • VMware NSX-T Data Center 3.x


Cause

High CPU usage on the ESXi host where the NSX Manager VM is running.

Resolution

To restore NSX Manager functionality, move the VM to a different, healthy host and reboot it. Before moving any NSX VM back to the original host, verify that the host is healthy and is not overprovisioned or over-utilized.
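The host selection can be sketched as a simple filter on CPU utilization. This is an illustration only: the 80% limit, the `HostCpu` type, and the host names are assumptions, since this article does not define a threshold for "over-utilized", and the usage and capacity figures are expected to be read from vCenter or the host beforehand.

```python
from dataclasses import dataclass

# Hypothetical limit; the article does not specify a threshold.
CPU_USAGE_LIMIT = 0.80


@dataclass
class HostCpu:
    name: str
    used_mhz: int      # current CPU usage of the host, in MHz
    capacity_mhz: int  # total CPU capacity of the host, in MHz

    @property
    def utilization(self):
        return self.used_mhz / self.capacity_mhz


def healthy_candidates(hosts, limit=CPU_USAGE_LIMIT):
    """Return hosts below the utilization limit, least utilized first."""
    # Hosts reporting zero capacity are skipped to avoid dividing by zero.
    ok = [h for h in hosts if h.capacity_mhz > 0 and h.utilization < limit]
    return sorted(ok, key=lambda h: h.utilization)


if __name__ == "__main__":
    # Invented figures for illustration.
    hosts = [
        HostCpu("esx-a", 9000, 10000),
        HostCpu("esx-b", 2000, 10000),
        HostCpu("esx-c", 5000, 10000),
    ]
    for h in healthy_candidates(hosts):
        print(f"{h.name}: {h.utilization:.0%} CPU used")
```

CPU utilization is only one indicator; a host under the limit should still be checked for other health problems before the NSX Manager VM is placed on it.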

Additional Information

Snapshots of NSX Manager appliance VMs should never be taken, as explained in Disable Snapshots on an NSX Appliance. Because the CPU lockup state can also be caused by snapshot quiesce operations, verify that no snapshots are present on the affected VM.

See also the vSphere ESXi article Error: "kernel: BUG: soft lockup - CPU#Y stuck for Xs" within VM, which covers the same condition for VMs in general and is not specific to NSX environments.