A crash of the cfgAgent process on an ESXi host causes an NSX alarm, "Application on NSX node <hostname> has crashed"

Article ID: 322495

Products

VMware NSX

Issue/Introduction

  • This issue is specific to NSX versions 4.x.
  • In the NSX Manager UI, one or more Critical alarms are generated with the following details:
Application on NSX node <hostname> has crashed. The number of core files found is 1. Collect the Support Bundle including core dump files and contact VMware Support team.
Recommended Action: Collect Support Bundle for NSX node <hostname> using NSX Manager UI or API.
  • The following core dump is present on the ESXi host:
/var/core/nsx-cfgagent-zdump.XXX
  • The log file /var/run/log/vobd.log contains entries similar to the following:
[esx.problem.application.core.dumped] An application (/usr/lib/vmware/nsx-cfgagent/bin/nsx-cfgagent) running on ESXi host has crashed (1 time(s) so far). A core file may have been created at /var/core/nsx-cfgagent-zdump.000.
  • The log file /var/run/log/nsx-syslog.log contains entries similar to the following:
CFGAGENT_ALLOC_FAIL : CfgAgent error: no memory!
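
The symptoms listed above can be checked together with a short script. The following is a minimal sketch, not a supported tool: it assumes a Python 3 interpreter is available where the logs are read (on the host itself or against files extracted from a support bundle), and the function names are hypothetical. The paths and log signatures are the ones quoted in this article.

```python
import glob
import re

# Paths and signatures quoted in the symptom list of this article.
CORE_DUMP_GLOB = "/var/core/nsx-cfgagent-zdump.*"
VOBD_PATTERN = re.compile(
    r"\[esx\.problem\.application\.core\.dumped\].*nsx-cfgagent"
)
ALLOC_FAIL_MARKER = "CFGAGENT_ALLOC_FAIL"


def find_core_dumps(pattern=CORE_DUMP_GLOB):
    """Return the sorted list of files matching the cfgAgent core dump pattern."""
    return sorted(glob.glob(pattern))


def matching_lines(lines, predicate):
    """Return the lines (trailing newline removed) for which predicate is true."""
    return [line.rstrip("\n") for line in lines if predicate(line)]


def scan_logs(vobd_lines, syslog_lines):
    """Collect log lines that match the two signatures from this article."""
    return {
        "vobd": matching_lines(
            vobd_lines, lambda line: VOBD_PATTERN.search(line) is not None
        ),
        "syslog": matching_lines(
            syslog_lines, lambda line: ALLOC_FAIL_MARKER in line
        ),
    }


def read_lines(path):
    """Read a log file, returning an empty list if it cannot be opened."""
    try:
        with open(path, errors="replace") as handle:
            return handle.readlines()
    except OSError:
        return []


if __name__ == "__main__":
    hits = scan_logs(
        read_lines("/var/run/log/vobd.log"),
        read_lines("/var/run/log/nsx-syslog.log"),
    )
    print("core dumps:", find_core_dumps())
    print("vobd.log matches:", len(hits["vobd"]))
    print("nsx-syslog.log matches:", len(hits["syslog"]))
```

A core dump together with matches in both logs is consistent with the issue described here. A match in only one place may point to a different crash and should be reviewed separately.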

Environment

VMware NSX-T Data Center
VMware NSX

Cause

The NSX agent running on the host, cfgAgent, runs out of memory while the IP Reputation feature is enabled and updating.

Resolution

This issue is resolved in VMware NSX 4.1.2.2 and in VMware NSX 4.2.0.
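
When auditing several environments, the fixed releases above can be compared against an installed version string. This is a minimal sketch under one stated assumption: any release numbered at or above 4.1.2.2 carries the fix, which this article confirms only for 4.1.2.2 and 4.2.0. The function names are hypothetical.

```python
# Earliest fixed release named in this article. Treating every later
# release as fixed is an assumption, not something the article states.
FIXED_IN = (4, 1, 2, 2)


def parse_version(text):
    """Turn '4.1.2.2' into (4, 1, 2, 2), padding short versions with zeros."""
    parts = [int(part) for part in text.strip().split(".")]
    return tuple(parts + [0] * (4 - len(parts)))


def classify(version_text):
    """Classify an NSX version string against the fix level in this article."""
    version = parse_version(version_text)
    if version >= FIXED_IN:
        return "fixed"
    if version[0] == 4:
        # The article scopes the issue to NSX 4.x.
        return "affected"
    return "not covered by this article"
```

For example, `classify("4.1.2")` returns `"affected"` and `classify("4.2.0")` returns `"fixed"`.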


Workaround:

  • If the IP Reputation feature is not used, you can disable IP Reputation checks to prevent this issue from occurring again.
    In the NSX-T UI, go to Security > Distributed Firewall and click on Actions.
    Under Settings > Malicious IP Feeds, change Auto-update Malicious IPs from On to Off.

  • If the uptime of nsx-cfgagent is approaching 90 days, restart the service manually with:
    /etc/init.d/nsx-cfgagent restart
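
The 90-day guideline in the second workaround can be turned into a simple check for scheduling restarts. This is a minimal sketch: the article does not say how to read the start time of nsx-cfgagent, so the caller must supply it, and the 7-day margin used to decide what counts as "approaching" is a hypothetical default.

```python
from datetime import datetime, timedelta


def restart_due(started_at, now, limit_days=90, margin_days=7):
    """Return True when uptime is within margin_days of limit_days, or past it.

    started_at and now are datetime values supplied by the caller.
    margin_days is an illustrative default, not a value from the article.
    """
    uptime = now - started_at
    return uptime >= timedelta(days=limit_days - margin_days)


if __name__ == "__main__":
    # Example values only: a service started 85 days before the check.
    started = datetime(2024, 1, 1)
    checked = started + timedelta(days=85)
    print("restart due:", restart_due(started, checked))
```

A result of True indicates only that the restart command above should be scheduled. The sketch does not restart the service.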