This article addresses multiple NSX Application Platform communication errors and application crashes involving the nsx-exporter service. The errors occur when an Out of Memory (OOM) condition causes the service to fail. The logs show memory exhaustion followed by core dumps and system instability.
Log entries similar to the following can be seen:
vmkernel.log showing OOM for nsx-exporter:
####-##-##T##:##:## cpu57:32229880)Admission failure in path: host/vim/vmvisor/nsx-exporter:python.##:uw.##
####-##-##T##:##:## cpu57:32229880)UserWorld 'python' with cmdline '/usr/bin/python /bin/vsip_heap_stats.sh'
####-##-##T##:##:## cpu57:32229880)uw.## (##) extraMin/extraFromParent: 37/37, nsx-exporter (19094) childEmin/eMinLimit: 196593/196608
####-##-##T##:##:## cpu57:32229880)Admission failure in path: host/vim/vmvisor/nsx-exporter:python.32229880:uw.32229880
####-##-##T##:##:## cpu57:32229880)UserWorld 'python' with cmdline '/usr/bin/python /bin/vsip_heap_stats.sh'
####-##-##T##:##:## cpu57:32229880)uw.## (##) extraMin/extraFromParent: 36/36, nsx-exporter (19094) childEmin/eMinLimit: 196593/196608
####-##-##T##:##:## cpu53:2101431)Admission failure in path: host/vim/vmvisor/nsx-exporter:python.##:uwWorldStore.##
####-##-##T##:##:## cpu53:2101431)uwWorldStore.## (##) extraMin/extraFromParent: 1/1, nsx-exporter (19094) childEmin/eMinLimit: 196608/196608
####-##-##T##:##:## cpu53:2101431)WARNING: World: 2706: Could not allocate new world handle for world ID: 32237193: Admission check failed for memory resource
nsx-exporter logs before the core dump, showing OOM-related errors:
####-##-##T##:##:## nsx-exporter[2101293]: NSX 2101293 - [nsx@6876 comp="nsx-esx" subcomp="agg-service" tid="2101431" level="WARNING"] Failed to fork child process (/bin/vsip_heap_stats.sh), status: 195887124
####-##-##T##:##:## nsx-exporter[2101293]: NSX 2101293 - [nsx@6876 comp="nsx-esx" subcomp="agg-service" tid="2101430" level="ERROR" errorCode="MPA11008"] Create buffer: out of memory
####-##-##T##:##:## nsx-exporter[2101293]: NSX 2101293 - [nsx@6876 comp="nsx-esx" subcomp="agg-service" tid="2101430" level="WARNING"] Failed to get flow records: rc = 8
####-##-##T##:##:## nsx-exporter[2101293]: NSX 2101293 - [nsx@6876 comp="nsx-esx" subcomp="agg-service" tid="2101430" level="ERROR" errorCode="MPA11006"] Unable to collect flows for filter nic-##, rc=8
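The signatures quoted in the excerpts above can be searched for in a log file to check whether a host is affected. The following is a minimal sketch, not an official tool: the function name count_exporter_oom_lines is hypothetical, the patterns are taken verbatim from the excerpts above, and the log file is passed as an argument because the location of the nsx-exporter log may differ between hosts.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a log file that match the
# OOM signatures quoted in this article.
# Usage: count_exporter_oom_lines <log-file>
count_exporter_oom_lines() {
    log_file="$1"
    # -c prints the number of matching lines; -E enables alternation (|).
    grep -c -E 'Admission failure in path: host/vim/vmvisor/nsx-exporter|Create buffer: out of memory|Failed to fork child process' "$log_file"
}
```

For example, count_exporter_oom_lines /var/log/vmkernel.log prints the number of matching lines in vmkernel.log. A non-zero count only suggests the condition described here and should be confirmed against the full log context.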
VMware NSX versions prior to 4.2.1.
1.) The primary cause of the issue is an Out of Memory (OOM) condition affecting the nsx-exporter process.
2.) The nsx-exporter process is responsible for monitoring NSX data. When its memory usage is high, the service fails because no more memory is available to it.
3.) The logs show signs of OOM, as in the vmkernel.log and nsx-exporter log entries above.
Workaround:
If the nsx-exporter application has already crashed, it is restarted automatically, so manual intervention is not required. Restarting the process may resolve the immediate memory exhaustion (OOM) issue, but this may not be a long-term solution.
To check the memory consumption of the exporter, execute the following command. Restart the exporter if 'consumed' is nearing 'max':
memstats -r group-stats -s name:min:max:consumed -u mb -g vsish -e set /sched/groupPathNameToID host vim vmvisor nsx-exporter
The output shows the exporter's memory usage and helps determine whether a restart is necessary. To restart the exporter, run:
/etc/init.d/nsx-exporter restart
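The "nearing 'max'" check can be made explicit with a small helper. This is a sketch under stated assumptions: the function name exporter_near_limit and the default threshold of 90 percent are illustrative and are not documented values, and the 'consumed' and 'max' figures are assumed to be whole numbers in MB, read by hand from the memstats output.

```shell
#!/bin/sh
# Hypothetical helper: decide whether 'consumed' is close to 'max'.
# Usage: exporter_near_limit <consumed_mb> <max_mb> [threshold_percent]
# Prints "restart" when consumed is at or above the threshold percentage
# of max, otherwise prints "ok".
exporter_near_limit() {
    consumed="$1"
    max="$2"
    threshold="${3:-90}"   # assumed default, not a documented value
    if [ "$max" -le 0 ]; then
        echo "invalid max" >&2
        return 2
    fi
    # Integer comparison of consumed/max against threshold/100.
    if [ $((consumed * 100)) -ge $((max * threshold)) ]; then
        echo "restart"
    else
        echo "ok"
    fi
}
```

For example, exporter_near_limit 700 768 prints restart, while exporter_near_limit 400 768 prints ok. The threshold should be chosen to suit the environment.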
Fix:
This issue is resolved in VMware NSX 4.2.1, available at Broadcom Downloads.
If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.
The nsx-exporter service is designed for monitoring and stats collection and is not critical for NSX configuration operations. After restarting, the service should restore its functionality.
The issue is fixed in NSX version 4.2.1 and later.