This KB article explains what CPU load on an SD-WAN Edge is and how to check it.
VMware SD-WAN by VeloCloud
CPU Load can also be referred to as the run queue length: the sum of the number of processes that are currently running plus the number that are waiting to run.
Below are examples of CPU load average output from an SD-WAN Edge. The three numbers after ‘load average’ are the load averaged over the last 1, 5, and 15 minutes.
Example#1
08:24:37 up 23 days, 23:20, load average: 2.00, 1.37, 0.63
In this example, the command was run on an Edge 510, which has 2 CPU cores. The 1-minute load average of 2.00 means that, during the minute before the command was run, an average of 2 processes were using the CPU cores and no processes were waiting. The 5-minute load average of 1.37 can be read as one core being fully busy and the other being busy about 37% of the time, so one core was idle roughly 60% of the time. The 15-minute value of 0.63 indicates that there was even more available processing time. Read from the 15-minute value to the 1-minute value, the three numbers show an increase in load over the last fifteen minutes.
Example#2
05:29:05 up 281 days, 20:31, 0 users, load average: 3.65, 2.59, 1.90
This is also an Edge 510, but the 1-minute and 5-minute load averages are now greater than 2. A load greater than the number of CPU cores indicates that processes are queuing. The three numbers together again show an increase in load.
Example#3
18:17:23 up 8 days, 17:45, load average: 10.42, 13.13, 13.82
This is from an Edge 540 with 4 CPU cores, so the values are compared with 4. Here even the 15-minute load average is greater than 4: on average, 4 processes were running on the CPU cores while about 9.82 more (13.82 - 4) were waiting in the run queue. The Edge is under high load.
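The comparison made in the three examples above can be sketched in Python. This is an illustrative sketch, not a VMware tool: the function names are hypothetical, the core count must be supplied by the caller (2 for an Edge 510 and 4 for an Edge 540, as stated above), and subtracting the core count from the load is the same approximation used in Example#3.

```python
import re

# Matches the three figures after "load average:" in uptime or top output.
LOAD_RE = re.compile(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)")


def parse_load_averages(line):
    """Return the (1 min, 5 min, 15 min) load averages found in the line."""
    match = LOAD_RE.search(line)
    if match is None:
        raise ValueError("no 'load average:' field found")
    return tuple(float(value) for value in match.groups())


def estimate_waiting(load, cores):
    """Approximate how many processes were queued beyond the core count."""
    return round(max(0.0, load - cores), 2)


def describe(line, cores):
    """Summarize one uptime/top line for an Edge with the given core count."""
    one, five, fifteen = parse_load_averages(line)
    if one > five > fifteen:
        trend = "rising"
    elif one < five < fifteen:
        trend = "falling"
    else:
        trend = "mixed"
    return {
        "waiting_1min": estimate_waiting(one, cores),
        "waiting_15min": estimate_waiting(fifteen, cores),
        "sustained_overload": fifteen > cores,
        "trend": trend,
    }


if __name__ == "__main__":
    # Example#3 from this article: Edge 540, 4 CPU cores.
    line = "18:17:23 up 8 days, 17:45, load average: 10.42, 13.13, 13.82"
    print(describe(line, cores=4))
```

Because the first line of top output uses the same ‘load average’ field, the same parser can be applied to it.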
To check the CPU load on an Edge, use either of the following commands:
1. uptime
edge:b5-edge1:~# uptime
05:46:21 up 1 day, 36 min, load average: 0.30, 0.38, 0.39
2. top
edge:b5-edge1:~# top
top - 05:46:26 up 1 day, 36 min, 0 users, load average: 0.28, 0.38, 0.39
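On a Linux host where a Python interpreter is available, the same three load averages can be read with the standard library and divided by the core count. This is a minimal sketch: whether Python is present on a given Edge is an assumption this article does not confirm, and load_per_core is a hypothetical helper name.

```python
import os


def load_per_core(loads, cores):
    """Divide each load average by the core count.

    A result above 1.0 means the load exceeded the number of cores,
    which suggests processes were queuing during that interval.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tuple(load / cores for load in loads)


if __name__ == "__main__":
    loads = os.getloadavg()      # (1 min, 5 min, 15 min), Unix-like systems only
    cores = os.cpu_count() or 1  # cpu_count() may return None
    for label, value in zip(("1 min", "5 min", "15 min"), load_per_core(loads, cores)):
        print(f"{label}: {value:.2f} per core")
```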
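To check the number of CPU cores on an Edge, use any of the following methods:
1. grep processor /proc/cpuinfo
edge:b5-edge1:~# grep processor /proc/cpuinfo
processor : 0
processor : 1
processor : 2
processor : 3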
2. mpstat -P ALL
edge:b5-edge1:~# mpstat -P ALL
Linux 4.14.203 (vc-edge) 01/25/22 _x86_64_ (4 CPU)

06:03:28  CPU   %usr  %nice   %sys  %iowait   %irq  %soft  %steal  %guest  %idle
06:03:28  all   2.00   0.02   0.93     0.00   0.00   0.00    0.00    0.00  97.05
06:03:28    0   0.95   0.02   0.84     0.00   0.00   0.00    0.00    0.00  98.18
06:03:28    1   5.28   0.00   1.53     0.00   0.00   0.00    0.00    0.00  93.19
06:03:28    2   0.94   0.02   0.82     0.00   0.00   0.00    0.00    0.00  98.21
06:03:28    3   0.82   0.02   0.54     0.00   0.00   0.00    0.00    0.00  98.62
3. dmesg | grep processor
edge:b5-edge1:~# dmesg | grep processor
[ 0.001000] tsc: Detected 2095.078 MHz processor
[ 0.080003] smpboot: Total of 4 processors activated (16760.62 BogoMIPS)
4. This can also be found in an Edge diagnostic bundle, in the following files:
/proc/cpuinfo
COMMANDS/dmesg.out.txt
5. Issue the top -H command, and then press 1 to show each CPU core on its own line.
top - 05:12:45 up 1 day, 18:57, 0 users, load average: 1.26, 1.19, 1.11
Tasks: 229 total, 3 running, 179 sleeping, 0 stopped, 0 zombie
Cpu(s): 10.7%us, 15.6%sy, 0.0%ni, 73.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3912392k total, 2205688k used, 1706704k free, 124972k buffers
Swap: 0k total, 0k used, 0k free, 344268k cached
After pressing 1:
top - 05:12:58 up 1 day, 18:57, 0 users, load average: 1.21, 1.18, 1.10
Tasks: 231 total, 2 running, 182 sleeping, 0 stopped, 0 zombie
Cpu0 : 3.3%us, 6.0%sy, 0.2%ni, 90.4%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 16.8%us, 23.6%sy, 0.1%ni, 59.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
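When working offline from a diagnostic bundle or from saved command output, the core count and per-core idle time can be extracted with a short script. The following is a sketch under stated assumptions: the function names are hypothetical, the input is text in the layout shown in this article, and the mpstat parser assumes 24-hour timestamps (with a 12-hour clock, an extra AM/PM column would shift the fields).

```python
def count_cores(cpuinfo_text):
    """Count the 'processor : N' entries in /proc/cpuinfo content."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def per_core_idle(mpstat_text):
    """Map each CPU id in 'mpstat -P ALL' output to its %idle value."""
    idle = {}
    for line in mpstat_text.splitlines():
        fields = line.split()
        # Per-core rows look like: <time> <cpu id> <%usr> ... <%idle>.
        # The header row ("CPU") and the summary row ("all") are skipped.
        if len(fields) >= 3 and fields[1].isdigit():
            idle[int(fields[1])] = float(fields[-1])
    return idle


if __name__ == "__main__":
    cpuinfo = "processor : 0\nprocessor : 1\nprocessor : 2\nprocessor : 3\n"
    mpstat = (
        "06:03:28 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle\n"
        "06:03:28 all 2.00 0.02 0.93 0.00 0.00 0.00 0.00 0.00 97.05\n"
        "06:03:28 0 0.95 0.02 0.84 0.00 0.00 0.00 0.00 0.00 98.18\n"
        "06:03:28 1 5.28 0.00 1.53 0.00 0.00 0.00 0.00 0.00 93.19\n"
    )
    idle = per_core_idle(mpstat)
    busiest = min(idle, key=idle.get)  # lowest %idle means the busiest core
    print(count_cores(cpuinfo), "cores; busiest core:", busiest)
```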
Workaround:
- Check the SD-WAN Edge Performance and Scale Data for the Edge model to confirm that the Edge is operating within its rated capacity.
- Check whether the Edge is carrying a large number of small packets (for example, voice or video traffic). As packet size decreases, the throughput capacity of the Edge also decreases.
- Check for any high CPU issues.
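The packet-size point in the list above follows from simple arithmetic: for the same throughput, smaller packets mean more packets per second to process. The sketch below illustrates this with hypothetical figures (100 Mbps, 1300-byte and 100-byte packets) that are not taken from any VMware data sheet. It ignores framing and tunnel overhead, and packets_per_second is an illustrative helper name.

```python
def packets_per_second(throughput_mbps, packet_size_bytes):
    """Packets per second needed to carry the given throughput."""
    if packet_size_bytes <= 0:
        raise ValueError("packet size must be positive")
    bits_per_packet = packet_size_bytes * 8
    return throughput_mbps * 1_000_000 / bits_per_packet


if __name__ == "__main__":
    # Hypothetical traffic mix: the same 100 Mbps carried in large vs. small packets.
    for size in (1300, 100):
        rate = packets_per_second(100, size)
        print(f"{size}-byte packets: {rate:,.0f} packets per second")
```

In this illustration the 100-byte case requires 13 times as many packets per second as the 1300-byte case for the same throughput.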