Edge memory usage high or very high alarm is observed for Edges configured as load balancers.

Article ID: 378802

Products

VMware NSX

Issue/Introduction

An alarm with the message "The memory usage on Edge node <UUID> has reached <Current Memory Usage>% which is at or above the high threshold value of 80%." is triggered. For the very high alarm, the message instead refers to the very high threshold value of 90%.

In the /var/log/vmware/top-mem.log file on the Edge, you will find one or more "nginx: LB OPER process" entries. Examining any one of these processes across samples shows a slow memory leak over time.
/var/log/vmware # grep -i 'lb oper'  top-mem.log | less ---> Find list of PIDs for all "LB OPER" processes.

PID USER   PR    NI    VIRT    RES    SHR  S  %CPU  %MEM   TIME+   TGID   COMMAND

118## lb 20 0 547120 299756 4860 S 0.0 0.5 0:13.62 118## nginx: LB OPER process
129## lb 20 0 546844 299560 4832 S 0.0 0.5 0:13.64 129## nginx: LB OPER process
255## lb 20 0 547032 299428 4644 S 0.0 0.5 0:13.69 255## nginx: LB OPER process
55## lb 20 0 547012 299344 4576 S 0.0 0.5 0:13.61 55## nginx: LB OPER process
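Because the same PID appears once per sample, the raw grep output repeats each process many times. The following is a minimal sketch for reducing it to a unique PID list. The function name list_lb_oper_pids is hypothetical, and the sketch assumes the PID is the first whitespace-separated field, as in the sample output above.

```shell
# Hypothetical helper: print each "LB OPER" PID once, in numeric order.
# Assumes the PID is field 1 of every matching line, as in the sample above.
# Reads the file named in $1, or standard input when no file is given.
list_lb_oper_pids() {
    grep -i 'lb oper' -- "${1:--}" | awk '{ print $1 }' | sort -un
}
```

For example, run "list_lb_oper_pids top-mem.log" from /var/log/vmware.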

Then "grep" for one of the PIDs to check that process for a possible memory leak. The trailing digits of each PID are masked as "##" in this article; use the full PID from your own log:

/var/log/vmware # grep -i 'lb oper'  top-mem.log | grep '129##' | less

PID USER   PR    NI    VIRT    RES    SHR  S  %CPU  %MEM   TIME+   TGID   COMMAND

129## lb 20 0 555644 307872 4392 S 0.0 0.5 0:14.02 129## nginx: LB OPER process
129## lb 20 0 555280 307720 4456 S 0.0 0.5 0:14.00 129## nginx: LB OPER process
129## lb 20 0 555168 307644 4416 S 0.0 0.5 0:14.01 129## nginx: LB OPER process
129## lb 20 0 541524 307632 4740 S 0.0 0.5 0:14.03 129## nginx: LB OPER process
129## lb 20 0 541412 307396 4440 S 0.0 0.5 0:14.05 129## nginx: LB OPER process

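The leak appears as a slow change in the process's memory footprint across samples, visible in the "VIRT" (virtual memory size) and "RES" (resident memory size) columns, which top reports in KiB by default. For example, PID 129## shows a RES value of 299560 in the first listing and between 307396 and 307872 in the second.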
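Comparing samples by eye is tedious when several LB OPER processes are present. The following is a minimal sketch that reports the first and last RES value seen for each PID. The function name lb_oper_res_growth is hypothetical. The sketch assumes that field 1 is the PID, that field 6 is RES as a plain number in KiB, and that the log lists samples oldest first; if your log is ordered newest first, the sign of the delta is reversed.

```shell
# Hypothetical helper: summarize RES change per "LB OPER" PID.
# Assumptions: field 1 = PID, field 6 = RES in KiB (plain number),
# and lines are in chronological order (oldest first).
# Reads the file named in $1, or standard input when no file is given.
lb_oper_res_growth() {
    grep -i 'lb oper' -- "${1:--}" | awk '
        # Remember the first RES value and the order in which PIDs appear.
        !($1 in first) { first[$1] = $6; order[++n] = $1 }
        # Keep overwriting so that the final value is the last sample.
        { last[$1] = $6 }
        END {
            for (i = 1; i <= n; i++) {
                pid = order[i]
                printf "%s first=%d last=%d delta=%d\n",
                       pid, first[pid], last[pid], last[pid] - first[pid]
            }
        }'
}
```

A delta that keeps increasing between runs taken hours or days apart is consistent with the slow leak described in this article; a single run on its own does not prove a leak.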

Environment

VMware NSX-T 3.x

Cause

Prior to NSX-T 3.2.1, "ngx_parse_url" dynamically allocated memory from the ngx_cycle pool but never released it. In NSX-T 3.2.1 and later, a new memory pool is created specifically for "ngx_parse_url", and that pool is destroyed once it is no longer needed.

Resolution

This issue is resolved in NSX-T 3.2.1 and later.

As a workaround, fail over the load balancer from the active Edge to the standby Edge. Once the failover completes, reboot the Edge that is now standby to recover the leaked memory.