cfgAgent process crash on ESXi host causes an NSX alarm, Application on NSX node <hostname> has crashed

Article ID: 322495

Products

VMware NSX

Issue/Introduction

  • You are running NSX 4.x.
  • In the NSX Manager UI, one or more Critical alarms are generated with the following details:
    Application on NSX node <hostname> has crashed. The number of core files found is 1. Collect the Support Bundle including core dump files and contact VMware Support team. Recommended Action Collect Support Bundle for NSX node <hostname> using NSX Manager UI or API.
  • On the ESXi host, you see the following core dump:
    /var/core/nsx-cfgagent-zdump.XXX
  • In the log file /var/run/log/vobd.log, you see entries similar to:
    [esx.problem.application.core.dumped] An application (/usr/lib/vmware/nsx-cfgagent/bin/nsx-cfgagent) running on ESXi host has crashed (1 time(s) so far). A core file may have been created at /var/core/nsx-cfgagent-zdump.000.
  • In the log file /var/run/log/nsx-syslog.log, you see the following entries:
    CFGAGENT_ALLOC_FAIL : CfgAgent error: no memory!
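
The host-side symptoms above can be checked together from an ESXi shell. The following is a minimal sketch, not a supported tool. The function name check_cfgagent_symptoms and the variables CORE_DIR, VOBD_LOG and NSX_SYSLOG are hypothetical names introduced here. The default paths and search strings are the ones quoted in this article.

```shell
#!/bin/sh
# Sketch only: look for the three host-side symptoms described in this article.
# The variable and function names are illustrative assumptions.
# Default paths are taken from the article and can be overridden.
CORE_DIR="${CORE_DIR:-/var/core}"
VOBD_LOG="${VOBD_LOG:-/var/run/log/vobd.log}"
NSX_SYSLOG="${NSX_SYSLOG:-/var/run/log/nsx-syslog.log}"

check_cfgagent_symptoms() {
    found=0

    # 1. cfgAgent core dump files in the core directory
    for f in "$CORE_DIR"/nsx-cfgagent-zdump.*; do
        [ -e "$f" ] || continue    # glob did not match anything
        echo "core dump: $f"
        found=1
    done

    # 2. Core dump event for nsx-cfgagent in vobd.log
    if [ -r "$VOBD_LOG" ] &&
       grep 'esx.problem.application.core.dumped' "$VOBD_LOG" | grep -q 'nsx-cfgagent'; then
        echo "vobd.log: nsx-cfgagent core dump event found"
        found=1
    fi

    # 3. Out-of-memory message from cfgAgent in nsx-syslog.log
    if [ -r "$NSX_SYSLOG" ] && grep -q 'CFGAGENT_ALLOC_FAIL' "$NSX_SYSLOG"; then
        echo "nsx-syslog.log: CFGAGENT_ALLOC_FAIL found"
        found=1
    fi

    # Succeed only when at least one symptom was seen
    [ "$found" -eq 1 ]
}
```

Calling check_cfgagent_symptoms prints one line per symptom found and returns success only if at least one symptom is present, so it can be used in an if statement. Rotated or compressed log files are not searched in this sketch.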

Environment

VMware NSX-T Data Center
VMware NSX

Cause

The NSX agent running on the ESXi host, cfgAgent, runs out of memory when the IP Reputation feature is enabled and is updating.

Resolution

This issue is resolved in VMware NSX 4.1.2.2 and in VMware NSX 4.2.0.
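
To compare an installed version against the fixed releases, a small helper such as the sketch below may be useful. The function names version_ge and has_cfgagent_fix are hypothetical. The sketch assumes that later 4.1.x maintenance releases and all releases after 4.2.0 also carry the fix, which this article does not state explicitly. It expects a purely numeric dotted version string and only makes sense for NSX 4.x.

```shell
#!/bin/sh
# Sketch only: decide whether a dotted NSX version string is at or above
# one of the fixed releases named in this article (4.1.2.2 or 4.2.0).

# version_ge A B: succeed when version A >= version B, comparing the
# dot-separated fields numerically from left to right.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        af=${a%%.*}
        bf=${b%%.*}
        af=${af:-0}    # missing fields count as 0
        bf=${bf:-0}
        if [ "$af" -gt "$bf" ]; then return 0; fi
        if [ "$af" -lt "$bf" ]; then return 1; fi
        # Drop the leading field from each version
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# has_cfgagent_fix VERSION: assumption, the 4.1 train is fixed from
# 4.1.2.2 onward and every other train from 4.2.0 onward.
has_cfgagent_fix() {
    case $1 in
        4.1|4.1.*) version_ge "$1" 4.1.2.2 ;;
        *)         version_ge "$1" 4.2.0 ;;
    esac
}
```

For example, has_cfgagent_fix 4.1.2.1 returns failure and has_cfgagent_fix 4.2.0 returns success.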


Workaround:

  • If the IP Reputation feature is not used, you can disable IP Reputation checks to prevent this issue from recurring:
    In the NSX Manager UI, go to Security > Distributed Firewall and click Actions.
    Under Settings > Malicious IP Feeds, change Auto-update Malicious IPs from On to Off.

  • If the uptime of the nsx-cfgagent service is approaching 90 days, you can restart the service manually on the ESXi host with:
    /etc/init.d/nsx-cfgagent restart