VMware NSX Manager application crash with Out of Memory due to NSX Intelligence "Failed to process delta config updates"

Article ID: 375048

Updated On:

Products

VMware NSX

Issue/Introduction

  • In the NSX Manager web interface, the following alarm is raised: "Application on NSX node <NSX Manager VM name> has crashed. The number of core files found is 1."
  • In the NSX Manager syslog, the following line can be found:
    2024-07-04T14:43:11.303Z FATAL pool-88-thread-3 MonitoringServiceImpl 4086 MONITORING [nsx@6876 alarmId="########-b9df-44de-ad2f-##########" alarmState="OPEN" comp="nsx-manager" entId="########-ecd5-46bd-8ad8-##########" errorCode="MP701099" eventFeatureName="infrastructure_service" eventSev="CRITICAL" eventState="On" eventType="application_crashed" level="FATAL" nodeId="########-7bf3-####-5f91-##########" subcomp="monitoring"] Application on NSX node <node_name> has crashed. The number of core files found is 1. Collect the Support Bundle including core dump files and contact VMware Support team.
  • A core dump related to the application crash event is listed by the admin CLI command get core-dumps. The core dump file is named proton_oom.hprof.gz.
  • NSX Intelligence delta config processing fails, and the NSX Manager proton log (/var/log/proton/nsxapi.*) contains a warning message like:
    2024-07-04T11:12:35.225Z WARN intelligence-agent-delta-config-producer DeltaConfigProducerImpl 1251406 INTELLIGENCE [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="manager"] Failed to process delta config updates.
    java.lang.IllegalArgumentException: Invalid UUID string: <name_of_an_object>
           at java.util.UUID.fromString(UUID.java:194) ~[?:1.8.0_372]
           at com.vmware.nsx.management.policy.transportnode.host.service.HostTransportNodeServiceImpl.buildUuidMsg(HostTransportNodeServiceImpl.java:776) ~[?:?]
           at com.vmware.nsx.management.policy.transportnode.host.service.HostTransportNodeServiceImpl.getHostTransportNodeByUniqueId(HostTransportNodeServiceImpl.java:795) ~[?:?]
           at com.vmware.nsx.management.policy.paceagent.services.IntelligenceAgentDeltaConfigProducerImpl.processBareMetalServerUpdate(IntelligenceAgentDeltaConfigProducerImpl.java:995) ~[?:?]
           at com.vmware.nsx.management.policy.paceagent.services.IntelligenceAgentDeltaConfigProducerImpl.processConfigUpdates(IntelligenceAgentDeltaConfigProducerImpl.java:332) ~[?:?]
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_372]
           at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_372]
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_372]
           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_372]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_372]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_372]
           at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_372]
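
To check whether a log matches this symptom, the warning and the exception line that follows it can be searched for together. The following is a minimal sketch in Python, not a VMware tool: the function name find_invalid_ids is hypothetical, and it assumes the exception line directly follows the warning line, as in the excerpt above.

```python
import re

# Text of the warning logged by the delta config producer (taken from the excerpt above).
WARNING_TEXT = "Failed to process delta config updates"

# Exception line that follows the warning; the captured group is the offending ID.
INVALID_UUID_RE = re.compile(
    r"java\.lang\.IllegalArgumentException: Invalid UUID string: (.*\S)"
)


def find_invalid_ids(log_lines):
    """Return the IDs reported as invalid UUIDs directly after the warning.

    Hypothetical helper: assumes the exception line is the line immediately
    following the warning line.
    """
    offenders = []
    previous_was_warning = False
    for line in log_lines:
        if previous_was_warning:
            match = INVALID_UUID_RE.search(line)
            if match:
                offenders.append(match.group(1))
        previous_was_warning = WARNING_TEXT in line
    return offenders
```

The function accepts any iterable of lines, so an open file handle for a copy of one of the /var/log/proton/nsxapi.* files can be passed in directly. Each returned value is the string that could not be parsed as a UUID.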

Environment

VMware NSX

Cause

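NSX Intelligence does not support object IDs that deviate from the standard UUID format, for example "12343647-44ef-4d9d-8111-0ddc84f7d789". When an object is created through the NSX Policy API, its ID can be set manually to a value that is not in this format, such as an object name. NSX Intelligence then fails to parse the ID, as shown by the "Invalid UUID string" exception.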
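
Before setting an object ID manually, a candidate value can be compared against the format of the example above. The following Python sketch illustrates such a check. The function name is_canonical_uuid is hypothetical, and the 8-4-4-4-12 hexadecimal pattern is an assumption taken from the example ID. The Java parser seen in the stack trace may accept or reject slightly different inputs, so treat this as a conservative check only.

```python
import re

# 8-4-4-4-12 hexadecimal groups, as in "12343647-44ef-4d9d-8111-0ddc84f7d789".
# Assumption: this pattern is modeled on the example ID, not on NSX source code.
CANONICAL_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_canonical_uuid(object_id: str) -> bool:
    """Return True if object_id has the same shape as the example UUID."""
    # fullmatch requires the whole string to match, so extra characters fail.
    return CANONICAL_UUID_RE.fullmatch(object_id) is not None
```

For example, is_canonical_uuid("12343647-44ef-4d9d-8111-0ddc84f7d789") returns True, while a descriptive name such as "my-bare-metal-01" returns False.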

Resolution

This issue is fixed for bare metal transport nodes in VMware NSX 4.2.0.

Additional Information

NSX services restart automatically upon failure, so the environment recovers on its own after such an event.