Application crash on NSX manager node generates /image/core/messaging_manager_oom.hprof core file


Article ID: 395616



Products

VMware NSX

Issue/Introduction

  • An alarm similar to the following is generated in NSX stating that an application has crashed:

    Application on NSX node <NSX manager node> has crashed. The number of core files found is 1. Collect the Support Bundle including core dump files and contact VMware Support team.

  • There is a core file on <NSX manager node> at /image/core/messaging_manager_oom.hprof.
  • The application crash can happen on a local or global manager node.
  • The messaging-manager service restarts automatically and then shows as running on <NSX manager node>:

    > get service messaging-manager
    Mon Apr 28 2025 UTC 13:15:48.492
    Service name:      messaging-manager
    Service state:     running

Environment

VMware NSX 4.x versions earlier than 4.1.2.0
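
To check whether a given manager version falls in this range, the comparison can be sketched in Python. This is a minimal illustration, not a VMware tool: the helper names parse_version and is_affected are hypothetical, and the sketch assumes the version is supplied as a dotted numeric string such as 4.1.1.0.

```python
FIXED_IN = (4, 1, 2, 0)  # release that contains the fix, per this article


def parse_version(text):
    """Turn '4.1.1.0' into (4, 1, 1, 0), padding short versions with zeros."""
    parts = [int(p) for p in text.strip().split(".")]
    parts += [0] * (4 - len(parts))
    return tuple(parts[:4])


def is_affected(version_text):
    """True for NSX 4.x versions earlier than 4.1.2.0 (hypothetical helper)."""
    version = parse_version(version_text)
    return version[0] == 4 and version < FIXED_IN
```

Comparing tuples of integers avoids the pitfalls of comparing version strings directly, where "4.1.10.0" would sort before "4.1.2.0".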

Cause

In NSX versions prior to 4.1.2.0, the messaging-manager process can run out of Java heap space. When this happens, the JVM writes a heap dump to /image/core/messaging_manager_oom.hprof (which the alarm reports as a core file), and the service wrapper restarts the process.

The following sequence is logged in /var/log/messaging-manager/tanuki.log:

STATUS | wrapper  | 2025/04/23 01:10:33 | Pinging the JVM took 21 seconds to respond.
STATUS | wrapper  | 2025/04/23 01:10:33 | Pinging the JVM took 16 seconds to respond.
STATUS | wrapper  | 2025/04/23 01:10:33 | Pinging the JVM took 11 seconds to respond.
INFO   | jvm 4    | 2025/04/23 01:10:34 | java.lang.OutOfMemoryError: Java heap space
STATUS | wrapper  | 2025/04/23 01:10:34 | The JVM has run out of memory.  Requesting thread dump.
STATUS | wrapper  | 2025/04/23 01:10:34 | Dumping JVM state.
STATUS | wrapper  | 2025/04/23 01:10:34 | The JVM has run out of memory.  Restarting JVM.
INFO   | jvm 4    | 2025/04/23 01:10:34 | Dumping heap to /image/core/messaging_manager_oom.hprof ...
INFO   | jvm 4    | 2025/04/23 01:10:34 | Heap dump file created [96919135 bytes in 0.345 secs]
INFO   | jvm 4    | 2025/04/23 01:10:34 | 2025-04-23 01:10:34
INFO   | jvm 4    | 2025/04/23 01:10:34 | Full thread dump OpenJDK 64-Bit Server VM (25.372-b07 mixed mode):
INFO   | jvm 4    | 2025/04/23 01:10:34 |
INFO   | jvm 4    | 2025/04/23 01:10:34 | "Table-Forkjoin-pool-0" #101301 daemon prio=5 os_prio=0 tid=0x000073816c008800 nid=0xdc26c waiting on condition [0x0000738139a13000]
INFO   | jvm 4    | 2025/04/23 01:10:34 |    java.lang.Thread.State: TIMED_WAITING (parking)
INFO   | jvm 4    | 2025/04/23 01:10:34 |       at sun.misc.Unsafe.park(Native Method)
INFO   | jvm 4    | 2025/04/23 01:10:34 |       - parking to wait for  <0x00007382b419e5c0> (a java.util.concurrent.ForkJoinPool)
INFO   | jvm 4    | 2025/04/23 01:10:34 |       at java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:1824)
INFO   | jvm 4    | 2025/04/23 01:10:34 |       at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1693)
INFO   | jvm 4    | 2025/04/23 01:10:34 |       at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
...
INFO   | jvm 4    | 2025/04/23 01:10:54 | Heap
INFO   | jvm 4    | 2025/04/23 01:10:54 |  garbage-first heap   total 181248K, used 79863K [0x00007382b3800000, 0x00007382b3900588, 0x00007382c0000000)
INFO   | jvm 4    | 2025/04/23 01:10:54 |   region size 1024K, 26 young (26624K), 0 survivors (0K)
INFO   | jvm 4    | 2025/04/23 01:10:54 |  Metaspace       used 96367K, capacity 100752K, committed 100812K, reserved 1138688K
INFO   | jvm 4    | 2025/04/23 01:10:54 |   class space    used 11644K, capacity 12450K, committed 12492K, reserved 1048576K
INFO   | jvm 4    | 2025/04/23 01:10:54 |
INFO   | wrapper  | 2025/04/23 01:10:55 | JVM exited on its own while waiting to kill the application.
STATUS | wrapper  | 2025/04/23 01:11:00 | JVM process is gone.
STATUS | wrapper  | 2025/04/23 01:11:00 | Launching a JVM...
INFO   | jvm 5    | 2025/04/23 01:11:00 | WrapperManager: Initializing...
INFO   | jvm 5    | 2025/04/23 01:11:02 |
INFO   | jvm 5    | 2025/04/23 01:11:02 |   .   ____          _            __ _ _
INFO   | jvm 5    | 2025/04/23 01:11:02 |  /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
INFO   | jvm 5    | 2025/04/23 01:11:02 | ( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
INFO   | jvm 5    | 2025/04/23 01:11:02 |  \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
INFO   | jvm 5    | 2025/04/23 01:11:02 |   '  |____| .__|_| |_|_| |_\__, | / / / /
INFO   | jvm 5    | 2025/04/23 01:11:02 |  =========|_|==============|___/=/_/_/_/
INFO   | jvm 5    | 2025/04/23 01:11:02 |  :: Spring Boot ::                (v2.6.8)
INFO   | jvm 5    | 2025/04/23 01:11:02 |
INFO   | jvm 5    | 2025/04/23 01:11:02 | Apr 23, 2025 1:11:02 AM org.hibernate.validator.internal.util.Version <clinit>
INFO   | jvm 5    | 2025/04/23 01:11:02 | INFO: HV000001: Hibernate Validator 6.0.23.Final
INFO   | jvm 5    | 2025/04/23 01:11:08 | Apr 23, 2025 1:11:08 AM org.apache.coyote.AbstractProtocol init
INFO   | jvm 5    | 2025/04/23 01:11:08 | INFO: Initializing ProtocolHandler ["http-nio-127.0.0.1-4321"]
INFO   | jvm 5    | 2025/04/23 01:11:08 | Apr 23, 2025 1:11:08 AM org.apache.catalina.core.StandardService startInternal
INFO   | jvm 5    | 2025/04/23 01:11:08 | INFO: Starting service [Tomcat]
INFO   | jvm 5    | 2025/04/23 01:11:08 | Apr 23, 2025 1:11:08 AM org.apache.catalina.core.StandardEngine startInternal
INFO   | jvm 5    | 2025/04/23 01:11:08 | INFO: Starting Servlet engine: [Apache Tomcat/9.0.73]
INFO   | jvm 5    | 2025/04/23 01:11:08 | Apr 23, 2025 1:11:08 AM org.apache.catalina.core.ApplicationContext log
INFO   | jvm 5    | 2025/04/23 01:11:08 | INFO: Initializing Spring embedded WebApplicationContext
INFO   | jvm 5    | 2025/04/23 01:11:08 | Apr 23, 2025 1:11:08 AM org.apache.coyote.AbstractProtocol start
INFO   | jvm 5    | 2025/04/23 01:11:08 | INFO: Starting ProtocolHandler ["http-nio-127.0.0.1-4321"]
STATUS | wrapper  | 2025/04/23 01:49:02 | Pinging the JVM took 7 seconds to respond.
INFO   | wrapper  | 2025/04/23 01:49:24 | Wrapper Process has not received any CPU time for 20 seconds.  Extending timeouts.
STATUS | wrapper  | 2025/04/23 02:01:14 | Pinging the JVM took 24 seconds to respond.
STATUS | wrapper  | 2025/04/23 02:01:14 | Pinging the JVM took 19 seconds to respond.
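
To confirm this signature without reading the whole log, the excerpt above can be scanned programmatically. The following is a minimal Python sketch, assuming the log keeps the "LEVEL | source | timestamp | message" layout shown in the excerpt. The function name find_oom_events and the returned dictionary keys are hypothetical and are not part of NSX.

```python
import re

# Matches the wrapper log layout shown above: LEVEL | source | timestamp | message
LINE_RE = re.compile(
    r"^(?P<level>\w+)\s*\|\s*(?P<source>[^|]+?)\s*\|\s*"
    r"(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s*\|\s?(?P<msg>.*)$"
)
DUMP_RE = re.compile(r"Dumping heap to (\S+)")


def find_oom_events(lines):
    """Return one dict per OutOfMemoryError found in the given log lines."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # skip lines that do not follow the expected layout
        msg = match.group("msg")
        if "java.lang.OutOfMemoryError" in msg:
            events.append(
                {"time": match.group("ts"), "error": msg.strip(), "heap_dump": None}
            )
        elif events and events[-1]["heap_dump"] is None:
            # Attach the first heap dump path logged after the error
            dump = DUMP_RE.search(msg)
            if dump:
                events[-1]["heap_dump"] = dump.group(1)
    return events
```

To use it, open a copy of tanuki.log (for example, one extracted from a support bundle) and pass the file object to find_oom_events. An event whose heap_dump is /image/core/messaging_manager_oom.hprof matches the issue described in this article.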

Resolution

This issue is resolved in VMware NSX 4.1.2.0, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

Workaround:

If you are unable to upgrade NSX to 4.1.2.0 or later immediately, you can remove the core file and clear the alarm, as there is no impact to NSX functionality.
For detailed instructions on removing the core file, see Step 4 in the Resolution section of the KB article "Application on NSX node has crashed alarm".

Additional Information