Symptoms:
- NSX-T Data Center version < 3.2.1.2
- NSX-T Load Balancer configured with source IP persistence profile
- UI reports high CPU usage on the Edge
- nginx core dumps generated followed by High or 100% CPU on the nginx worker process
- Clients cannot connect to the LB backend servers
- The LB operation process spikes to 100%
Relevant log location
Log Indicating that a nginx coredump is generated/var/log/syslog2022-08-13T11:04:24.986Z edge-02.corp.local NSX 22492 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="WARNING"] Core file generated: /var/log/core/core.nginx.1660388664.16937.134.11.gz
Log indicating the load-balancer CPU usage is very high just after the generation of the coredump/var/log/syslog2022-08-13T11:09:12.986Z edge-02.corp.local NSX 2781 - [nsx@6876 comp="nsx-edge" s2comp="nsx-monitoring" entId="a8xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx71" tid="3145" level="WARNING" eventState="On" eventFeatureName="load_balancer" eventSev="warning" eventType="lb_cpu_very_high"] The CPU usage of load balancer a8xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx71 is very high. The threshold is 95%.
edge-02> get processestop - 09:56:10 up 34 days, 17:09, 0 users, load average: 2.86, 2.16, 1.73
Tasks: 268 total, 4 running, 166 sleeping, 0 stopped, 14 zombie
%Cpu(s): 2.6 us, 2.6 sy, 0.0 ni, 94.6 id, 0.1 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 32734844 total, 6939616 free, 18403544 used, 7391684 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 13488368 avail Mem
/opt/vmware/nsx-netopa/bin/agent.py14111 lb 20 0 598552 84140 3672 R 100.0 0.3 3:56.79 14111 nginx: worker process
15422 lb 20 0 624816 83956 3404 R 100.0 0.3 0:45.09 15422 nginx: worker process