Symptoms:
On rare occasions, an Edge with a single Power Supply Unit (5x0, 6x0, 7x0, and 840) may restart without a noticeable trigger, and the customer may request an RCA. An SD-WAN Edge reboot can be caused by either a software issue or a power fluctuation.
This applies to SD-WAN Edges running any supported VMware by Broadcom SD-WAN version.
The dmesg command is a Linux utility that displays kernel-related messages retrieved from the kernel ring buffer. The ring buffer stores information about hardware, device driver initialization, and messages that kernel modules log during system startup.
The dmesg command is invaluable for troubleshooting hardware-related errors and warnings and for diagnosing device failures.
Note: The customer can run the dmesg command under Secure Edge CLI > shell. Secure Edge Access and the Privileged access level are required.
The output of the dmesg command (which is also included in the diagnostic bundle) can indicate the likely reason for the most recent SD-WAN Edge restart. Run "dmesg | grep -i restart"; the possible outputs are shown below:
edge:VCE:~# dmesg | grep -i restart
[ 5.682925] System restarted via cold reset (2)
Analysis: When "System restarted via cold reset (2)" is observed, the reason for the most recent system restart could be:
Note: An edged crash or the "vc_procmon restart" command does not reboot the Linux OS, and the dmesg command can only tell why the Linux OS restarted. The customer can distinguish an edged restart from a Linux OS restart by checking "System up Since" and "Service Up Since" in the SD-WAN Edge drop-down menu on the SD-WAN Orchestrator web UI. For more details, refer to KB 371603.
edge:VCE:~# dmesg | grep -i restart
[ 10.371268] System started up from powered-off state (3)
Analysis: This is usually seen on 5x0 SD-WAN Edges. When "System started up from powered-off state (3)" is observed, the reason for the most recent system restart could be:
edge:VCE:~# dmesg | grep restart
[ 5.693525] System restarted for unknown reason - possibly power cycle (13)
Analysis: This is usually seen on 6x0 SD-WAN Edges. When "System restarted for unknown reason - possibly power cycle (13)" is observed, the reason for the most recent system restart could be:
edge:VCE:~# dmesg | grep -i power
[ 1.144400] Copyright (C) 2004 MontaVista Software - IPMI Powerdown via sys_reboot.
[ 1.154138] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[ 1.162456] ACPI: Power Button [PWRF]
[ 2.324902] vc: previous power loss!
[ 9.826392] vcearlyinit: system recovered from power loss
Analysis: This is usually seen on 540/520 SD-WAN Edges. When "vcearlyinit: system recovered from power loss" is observed, the reason for the most recent system restart could be:
For a power interruption, logs like the following may also be seen:
edge:(active):~# dmesg | grep -i power
[ 1.732784] IPMI poweroff: Copyright (C) 2004 MontaVista Software - IPMI Powerdown via sys_reboot
[ 1.743832] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[ 1.752154] ACPI: Power Button [PWRF]
[ 2.452554] vc: previous power loss!
[ 11.011015] vcearlyinit: system recovered from power loss
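The four dmesg signatures described above can be checked in one pass. The following is a minimal sketch, not part of the SD-WAN Edge software: the function name classify_restart and the labels it prints are hypothetical, chosen only for illustration. It matches only the message strings quoted in this article and assumes a POSIX-compatible shell.

```shell
#!/bin/sh
# classify_restart (hypothetical helper): read dmesg output on stdin and
# print a short label for the first restart signature from this article
# that is found. The labels are illustrative, not official names.
classify_restart() {
    _log=$(cat)    # read all of stdin once so it can be pattern-matched
    case "$_log" in
        *'System restarted via cold reset'*)
            echo 'cold-reset' ;;
        *'System started up from powered-off state'*)
            echo 'powered-off-start' ;;
        *'System restarted for unknown reason'*)
            echo 'possible-power-cycle' ;;
        *'system recovered from power loss'*)
            echo 'power-loss-recovery' ;;
        *)
            echo 'unknown' ;;
    esac
}
```

With the function defined, the live kernel log can be piped in as "dmesg | classify_restart". A dmesg file extracted from a diagnostic bundle can be supplied on stdin instead. A result of "unknown" only means that none of the strings listed in this article were present.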