VMware NSX 4.1 sudo or nvpapi application on NSX node has crashed

Article ID: 322432

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • You are using VMware NSX 4.1
  • You are alerted to an application crash on an NSX component.
  • There is minimal noticeable production impact.
  • In /var/log/syslog of the NSX manager you can confirm which NSX component (Manager or Edge) has the error (in this case edge_node2):
2023-06-12T01:28:45.566Z nsxmanager3 NSX 121272 - [nsx@6876 comp="nsx-manager" subcomp="nsx-sha" username="nsx-sha" level="CRITICAL" eventFeatureName="infrastructure_service" eventType="application_crashed" eventSev="critical" eventState="On" entId="cd59fec3-7573-4dc0-9e50-ed405376806d"] Application on NSX node edge_node2 has crashed. The number of core files found is 1. Collect the Support Bundle including core dump files and contact VMware Support team.
  • In /var/log/dumpcore.log on the NSX node named above, you can see that a core file has been generated for sudo or nvpapi:
2023-06-12T01:27:45.480Z nsxmanager3 NSX 4187495 - [nsx@6876 comp="nsx-manager" subcomp="node-mgmt" username="root" level="INFO"] Core dump generation received by process: None [nvpapi]
2023-06-12T01:27:45.488Z nsxmanager3 NSX 4187495 - [nsx@6876 comp="nsx-manager" subcomp="node-mgmt" username="root" level="WARNING"] Core file generated: /var/log/core/core.nvpapi.py.1679583589.5354.33.11.gz
  • This can be further confirmed by checking /var/log/syslog on the same node for core dump messages:
2023-03-23T14:59:49.659Z edge_node2.corp.local NSX 1366408 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="WARNING"] Core file generated: /var/log/core/core.nvpapi.py.1679583589.5354.33.11.gz
or
2023-03-23T14:59:49.659Z edge_node2.corp.local NSX 1366408 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="WARNING"] Core file generated: /var/log/core/core.sudo.1686533265.4187488.0.11.gz
 
  • In /var/log/core/ on the node, you can see the core files:
root@edge01:/var/log/core# ls -lh
total 454M
-rw-rw-rw- 1 root root 321M Jun 26 12:39 core.nvpapi.py.1679583589.5354.33.11.gz

NOTE: The preceding log excerpts are only examples. Date, time and environmental variables may vary depending on your environment.
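
The log checks above can be scripted. The following is a minimal Python sketch, not a VMware tool: it scans log lines for the "Core file generated" message shown in the excerpts and pulls out the core file path. The function name `find_core_files` is hypothetical. The file name layout is an assumption inferred from the two example names only: the first numeric field is read as a Unix epoch timestamp (in both examples it agrees with the crash time in the logs), and the three numeric fields after it are matched but not interpreted.

```python
import re
from datetime import datetime, timezone

# Matches the "Core file generated: <path>" message shown in the excerpts.
CORE_LINE = re.compile(r"Core file generated: (?P<path>\S+)")

# Assumed layout, inferred from the example names only:
#   core.<program>.<epoch seconds>.<number>.<number>.<number>.gz
CORE_NAME = re.compile(
    r"core\.(?P<program>.+)\.(?P<epoch>\d+)\.\d+\.\d+\.\d+\.gz$"
)


def find_core_files(lines):
    """Yield (path, program, crash_time_utc) for each core dump log line.

    program and crash_time_utc are None when the file name does not
    follow the assumed layout.
    """
    for line in lines:
        hit = CORE_LINE.search(line)
        if not hit:
            continue
        path = hit.group("path")
        name = CORE_NAME.search(path)
        if name:
            when = datetime.fromtimestamp(int(name.group("epoch")), tz=timezone.utc)
            yield path, name.group("program"), when
        else:
            yield path, None, None


if __name__ == "__main__":
    # Sample lines shortened from the excerpts in this article.
    sample = [
        "... Core file generated: /var/log/core/core.nvpapi.py.1679583589.5354.33.11.gz",
        "... Core file generated: /var/log/core/core.sudo.1686533265.4187488.0.11.gz",
        "... Application on NSX node edge_node2 has crashed.",
    ]
    for path, program, when in find_core_files(sample):
        print(program, when.isoformat() if when else "unknown", path)
```

To use it against a real log, pass an open file object instead of the sample list, for example `find_core_files(open("/var/log/syslog"))` on a copy of the log taken from the node or from the support bundle.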

Environment

VMware NSX 4.1.0

Cause

The affected services crashed and the system generated core dump files for them. All NSX services are configured to restart automatically after a crash. Depending on which application crashed, other services that depend on it may not function correctly. Verify the status of each service that crashed to confirm that it is running again. On an NSX appliance node, service status can be checked in nsxcli as follows:
nsxcli> get service <service-name>
or
nsxcli> get services

 


Resolution

This issue is resolved in VMware NSX 4.1.1, available at VMware Downloads.
 


Workaround:

After collecting the support bundle, you can resolve the "application crashed" alarm by removing the core dump files from the affected nodes.

  • On NSX appliance nodes, the following nsxcli command can be used to remove core and heap dump files:
nsxcli> del core-dump all
or
nsxcli> del core-dump <core-dump-file>

 

  • On ESXi host transport nodes, the following command can be used when logged in to the host as root:

root> rm -rf /var/core/*
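
Before removing anything, it can be useful to record what a core directory contains so the list can be compared against the collected support bundle. The following is a minimal Python sketch under stated assumptions: a Python 3 interpreter is available where the script runs, and the directory is passed in by the caller (/var/log/core and /var/core are the paths mentioned in this article). The function name `summarize_core_dir` is hypothetical. The script only reads the directory and deletes nothing.

```python
from pathlib import Path


def summarize_core_dir(core_dir):
    """Return (sorted file names, total size in bytes) for a core directory.

    Only regular files directly inside core_dir are counted;
    subdirectories are ignored.
    """
    files = sorted(p for p in Path(core_dir).iterdir() if p.is_file())
    total_bytes = sum(p.stat().st_size for p in files)
    return [p.name for p in files], total_bytes


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python3 core_inventory.py /var/log/core
    for directory in sys.argv[1:]:
        names, total_bytes = summarize_core_dir(directory)
        print(f"{directory}: {len(names)} file(s), {total_bytes} bytes")
        for name in names:
            print(f"  {name}")
```

Running it again after the cleanup commands should report zero files if the removal succeeded.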