Aria Automation / Orchestrator node stuck when booting up with error message: "Soft lockup - CPU stuck for 30s"
Article ID: 398814
Products
VMware Aria Suite
Issue/Introduction
- The Aria Orchestrator node is marked as Not Ready in the output of kubectl get node.
- The Aria Orchestrator node is not accessible by SSH.
- The remote console of the node shows the message:
"Soft lockup - CPU stuck for 30s"
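
The first symptom can be checked quickly by filtering the node listing. The following is a minimal sketch, not part of the original procedure: the helper name not_ready_nodes is hypothetical, and it assumes the default kubectl get node column layout (NAME, STATUS, ROLES, AGE, VERSION).

```shell
# Hypothetical helper: print the names of nodes whose STATUS column is not
# exactly "Ready". Reads `kubectl get node` style output on stdin.
# Assumes the default column layout, where STATUS is the second column.
# A cordoned node ("Ready,SchedulingDisabled") is also reported.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage from a healthy node in the cluster (assumes kubectl is on PATH):
#   kubectl get node | not_ready_nodes
```

An empty result suggests that every node reports Ready; any name printed is a node to inspect from the remote console.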

Environment
VMware Aria Automation 8.x
VMware Aria Automation Orchestrator 8.x
Cause
- This can occur if the node was stunned due to high resource utilization on the host.
- Because the node was stunned while booting up, it is not accessible via SSH, and its pods and services are therefore impacted.
Resolution
- Validate in vCenter that the VM does not have high CPU utilization.
- Validate that enough resources are available on the host on which the VM resides.
- Validate KB395309
- Take non-memory snapshots of the Aria Automation / Orchestrator cluster nodes.
- Power off the Aria Automation / Orchestrator node from vCenter.
- Power on the Aria Automation / Orchestrator node from vCenter.
- Validate that the node now boots up successfully, as the CPU utilization will have been reset by the power cycle.
- Wait for the first boot to complete. Monitor the status with the below command from an SSH session to the node:
watch -d vracli status first-boot
- Run the below command to validate that all three nodes in the cluster are marked as Ready:
kubectl get node
- Initiate a rebuild of the pods:
- Wait for the script execution to complete.
- The execution can also be tracked using the command:
watch -d kubectl get pods -n prelude
- Once all the pods are running successfully, validate that the pod deployment has completed successfully using the below command:
watch -d vracli status deploy
- Now attempt to access the UI.
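
While tracking the pod rebuild, it can help to list only the pods that have not settled yet instead of scanning the full watch output. The following is a minimal sketch under stated assumptions: the helper name unsettled_pods is hypothetical, and it assumes the default kubectl get pods column layout (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: print pods that have not settled yet.
# Reads `kubectl get pods -n prelude` style output on stdin.
# A pod counts as settled if STATUS is "Completed", or if STATUS is
# "Running" and the READY column shows all containers ready (e.g. 3/3).
unsettled_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")                              # READY column: ready/total
    if ($3 == "Completed") next
    if ($3 == "Running" && ready[1] == ready[2]) next
    print $1, $2, $3                                   # name, readiness, status
  }'
}

# Usage from an SSH session to a node (assumes kubectl is on PATH):
#   kubectl get pods -n prelude | unsettled_pods
```

An empty result suggests that the pods have settled. It does not replace the vracli status deploy check above.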