NSX OpsAgent runs out of memory (OOM) and crashes, causing an OpsAgent core dump

Article ID: 419794

Products

VMware NSX

Issue/Introduction

  • NSX UI shows ESXi hosts with status Unknown.

  • NSX UI shows the following alarm:
    Application on NSX node <hostname> has crashed. The number of core files found is 1. Collect the Support Bundle including core dump files and contact VMware Support team. Recommended Action Collect Support Bundle for NSX node <hostname> using NSX Manager UI or API.

  • Affected ESXi hosts have created opsAgent zdump files:
    /var/core/opsAgent-zdump.###

  • vobd logs may show messages similar to the following:
    An application (/usr/lib64/vmware/nsx-opsagent/bin/opsAgent) running on ESXi host has crashed (x time(s) so far). A core file may have been created at /var/core/opsAgent-zdump.000.

  • nsx-syslog messages show "used MEMORY" for opsAgent increasing toward the "total MEMORY" limit, with messages similar to the following:

    [nsx@#### comp="nsx-esx" subcomp="nsx-sha" username="root" level="INFO"] [name:resource_dump_monitor] NSX_OPSAGENT: used MEMORY 1330356KB, total MEMORY 1331200KB

    [nsx@#### comp="nsx-esx" subcomp="nsx-sha" username="root" level="INFO"] [name:resource_dump_monitor] NSX_OPSAGENT: used MEMORY 1331140KB, total MEMORY 1331200KB

  • This may cause NSX tags to be removed from VMs if the host, along with the VMs residing on it, is unavailable or inaccessible for more than 30 minutes.
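
The core dump symptom above can be checked from the ESXi shell. The following is a minimal sketch, not a supported tool: the function name count_opsagent_zdumps is hypothetical, and it assumes core files follow the /var/core/opsAgent-zdump.### naming shown above.

```shell
# Hypothetical helper: count opsAgent core dumps in a directory.
# Defaults to /var/core, the path quoted in this article.
count_opsagent_zdumps() {
    dir="${1:-/var/core}"
    count=0
    for f in "$dir"/opsAgent-zdump.*; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Running count_opsagent_zdumps with no argument on an affected host should print a non-zero number if OpsAgent has crashed and left core files behind.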

Environment

VMware NSX 4.x

Cause

Memory usage for OpsAgent reaches the maximum allocated limit of 1331200 KB (1300 MB). Once the maximum is reached, OpsAgent cannot allocate any further memory, which causes a crash and core dump.
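
To see how close OpsAgent is to this limit, the "used MEMORY" and "total MEMORY" values can be extracted from the resource_dump_monitor messages and turned into a percentage. The following is a minimal sketch that assumes the message format shown in the Issue/Introduction section; the function name opsagent_mem_usage is hypothetical, and the log file location used in the note after the block is an assumption that may differ in your environment.

```shell
# Hypothetical helper: read log lines on stdin and print
# "used_KB total_KB percent" for each NSX_OPSAGENT memory message.
opsagent_mem_usage() {
    sed -n 's/.*NSX_OPSAGENT: used MEMORY \([0-9][0-9]*\)KB, total MEMORY \([0-9][0-9]*\)KB.*/\1 \2/p' |
    while read -r used total; do
        # Skip malformed lines that would divide by zero.
        [ "$total" -gt 0 ] || continue
        # Integer percentage, rounded down.
        echo "$used $total $((used * 100 / total))"
    done
}
```

For example, assuming the messages are written to /var/log/nsx-syslog.log, the command grep NSX_OPSAGENT /var/log/nsx-syslog.log | opsagent_mem_usage | tail -n 5 would print the five most recent samples. A percentage that climbs steadily toward 100 matches the pattern described in this article.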

Resolution

Workaround:

Restart nsx-opsagent to recover the system with the following command from the ESXi CLI: 

/etc/init.d/nsx-opsagent restart

If memory usage continues to increase after the restart, the maximum allowable memory for opsagent may need to be raised. To do so, follow these steps:

1. Check for any alarms from NSX regarding NSX opsagent. If memory usage of opsagent goes above 80%, you should receive the following alarm:
The alarm reported is in regard to the memory usage of agent NSX_OPSAGENT on ESXi node xxxxxxx-xxxxx-xxxxx has reached above the high threshold value of 80%
See KB 385467 for more details about this alarm.

2. If the alarm is present, run the following command from the ESXi CLI to increase the maximum allowable memory for opsagent from 1300 MB to 2500 MB:

localcli --plugin-dir=/usr/lib/vmware/esxcli/int sched group setmemconfig --group-path=host/vim/vmvisor/opsagent --min=1300 --max=-1 --minlimit=2500 --units=mb 2> /dev/null

3. Then restart the opsagent service:

/etc/init.d/nsx-opsagent restart
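
After the restart, it can be useful to confirm that OpsAgent memory usage stays below the 80% alarm threshold mentioned in step 1. The following is a rough sketch only: the function name opsagent_mem_above is hypothetical, it assumes the resource_dump_monitor message format shown earlier in this article, and it treats "at or above" the threshold as a match.

```shell
# Hypothetical helper: read log lines on stdin, print the usage percentage
# from the most recent NSX_OPSAGENT memory message, and exit 0 if it is at
# or above the threshold (default 80), 1 if below, 2 if no message is found.
opsagent_mem_above() {
    threshold="${1:-80}"
    last=$(sed -n 's/.*NSX_OPSAGENT: used MEMORY \([0-9][0-9]*\)KB, total MEMORY \([0-9][0-9]*\)KB.*/\1 \2/p' | tail -n 1)
    [ -n "$last" ] || return 2
    used=${last% *}     # text before the space
    total=${last#* }    # text after the space
    [ "$total" -gt 0 ] || return 2
    pct=$((used * 100 / total))
    echo "$pct"
    [ "$pct" -ge "$threshold" ]
}
```

Feed it the NSX syslog content on stdin, for example by piping grep output for NSX_OPSAGENT into opsagent_mem_above 80. An exit status of 0 some time after the restart suggests usage is climbing again and the issue may need further investigation.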