"Application has crashed" alarm.
"uc_oom.hprof" core dump file on the NSX Manager in /image/core.
/var/log/upgrade-coordinator/upgrade-coordinator.log
2025-02-15T09:19:59.215Z WARN tx-tracer-poller UfoTxnTracingService 121432 SYSTEM [nsx@6876 comp="global-manager" level="WARNING" subcomp="upgrade-coordinator"] UfoTxnTracingService[id=########-####-####-####-########0ead]: long running tx has been running for seconds=7317265, numTxnAccess=1
2025-02-15T09:19:59.727Z WARN tx-tracer-poller UfoTxnTracingService 121432 SYSTEM [nsx@6876 comp="global-manager" level="WARNING" subcomp="upgrade-coordinator"] UfoTxnTracingService[id=########-####-####-####-########7748]: long running tx has been running for seconds=7478925, numTxnAccess=1
2025-02-15T09:19:59.728Z WARN tx-tracer-poller UfoTxnTracingService 121432 SYSTEM [nsx@6876 comp="global-manager" level="WARNING" subcomp="upgrade-coordinator"] UfoTxnTracingService[id=########-####-####-####-########99ae]: long running tx has been running for seconds=8263421, numTxnAccess=1
2025-02-15T09:19:59.728Z WARN tx-tracer-poller UfoTxnTracingService 121432 SYSTEM [nsx@6876 comp="global-manager" level="WARNING" subcomp="upgrade-coordinator"] UfoTxnTracingService[id=########-####-####-####-########568d]: long running tx has been running for seconds=7316784, numTxnAccess=1
2025-02-15T09:19:59.729Z WARN tx-tracer-poller UfoTxnTracingService 121432 SYSTEM [nsx@6876 comp="global-manager" level="WARNING" subcomp="upgrade-coordinator"] UfoTxnTracingService[id=########-####-####-####-########6187]: long running tx has been running for seconds=7402426, numTxnAccess=1
awk -F 'id=' '{print $2}' upgrade-coordinator*.log | awk -F ']' '{print $1}' | tr -s \\n | sort | uniq | wc -l
257
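The awk pipeline above counts the distinct transaction IDs reported as long running. The following is a minimal Python sketch of the same check, which also records the longest reported duration per ID. The function name summarize_long_running_tx and the regular expression are illustrative assumptions based only on the log lines shown above; they are not part of NSX.

```python
import re

# Assumed pattern, derived from the sample log lines in this article.
TX_LINE = re.compile(
    r"UfoTxnTracingService\[id=([^\]]+)\]: "
    r"long running tx has been running for seconds=(\d+)"
)


def summarize_long_running_tx(lines):
    """Map each transaction ID to the longest duration (seconds) logged for it."""
    longest = {}
    for line in lines:
        match = TX_LINE.search(line)
        if match is None:
            continue  # not a long-running-transaction warning
        tx_id, seconds = match.group(1), int(match.group(2))
        longest[tx_id] = max(seconds, longest.get(tx_id, 0))
    return longest


def summarize_files(paths):
    """Hypothetical helper: apply the summary to one or more log files."""
    lines = []
    for path in paths:
        with open(path, errors="replace") as handle:
            lines.extend(handle)
    return summarize_long_running_tx(lines)
```

Calling len() on the returned dictionary gives the number of distinct IDs, comparable to the wc -l result above. Dividing a duration by 86400 converts it to days; the values in the sample lines correspond to transactions that have been open for well over two months.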
/var/log/upgrade-coordinator/upgrade-coordinator-tomcat-wrapper.log
INFO | jvm 1 | 2025/02/05 01:30:41 | java.lang.OutOfMemoryError: Java heap space
STATUS | wrapper | 2025/02/05 01:30:41 | The JVM has run out of memory. Requesting thread dump.
STATUS | wrapper | 2025/02/05 01:30:41 | Dumping JVM state.
STATUS | wrapper | 2025/02/05 01:30:41 | The JVM has run out of memory. Restarting JVM.
INFO | jvm 1 | 2025/02/05 01:30:41 | Dumping heap to /image/core/uc_oom.hprof ...
INFO | jvm 1 | 2025/02/05 01:30:42 | 2025-02-05 01:30:42
INFO | jvm 1 | 2025/02/05 01:30:42 | Full thread dump OpenJDK 64-Bit Server VM (11.0.23+10-LTS mixed mode):
INFO | jvm 1 | 2025/03/19 14:35:31 | "http-nio-127.0.0.1-7442-exec-151" #93999 daemon prio=5 os_prio=0 cpu=3.78ms elapsed=1244108.54s tid=0x00006e0cec08e000 nid=0x2d833e in Object.wait() [0x00006e0c8eda3000]
INFO | jvm 1 | 2025/03/19 14:35:31 | java.lang.Thread.State: WAITING (on object monitor)
INFO | jvm 1 | 2025/03/19 14:35:31 | at java.lang.Object.wait(java.base@11.0.23/Native Method)
INFO | jvm 1 | 2025/03/19 14:35:31 | - waiting on <no object reference available>
INFO | jvm 1 | 2025/03/19 14:35:31 | at java.util.concurrent.ForkJoinTask.externalAwaitDone(java.base@11.0.23/Unknown Source)
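To gauge how many threads in a thread dump match the pattern shown above, the wrapper log can be scanned for threads in the WAITING state whose stack includes ForkJoinTask.externalAwaitDone. The following is a rough Python sketch. The function name count_thread_states and the prefix pattern are assumptions modeled on the excerpt above, and the sketch assumes the log holds a single thread dump (otherwise the counts are summed across dumps).

```python
import re
from collections import Counter

# Assumed wrapper prefix, e.g. "INFO | jvm 1 | 2025/03/19 14:35:31 | "
PREFIX = re.compile(r"^\s*\w+\s*\|\s*jvm\s+\d+\s*\|\s*[\d/]+\s+[\d:]+\s*\|\s?")
STATE = re.compile(r"java\.lang\.Thread\.State:\s+(\w+)")


def count_thread_states(lines):
    """Return (Counter of thread states, count of WAITING threads
    whose stack contains ForkJoinTask.externalAwaitDone)."""
    states = Counter()
    waiting_in_fork_join = 0
    current_state = None
    counted = True  # nothing to count until a thread header is seen
    for raw in lines:
        line = PREFIX.sub("", raw).strip()
        if line.startswith('"'):
            # A quoted thread name starts a new thread entry.
            current_state = None
            counted = False
            continue
        match = STATE.search(line)
        if match and current_state is None:
            current_state = match.group(1)
            states[current_state] += 1
        elif (
            "ForkJoinTask.externalAwaitDone" in line
            and current_state == "WAITING"
            and not counted
        ):
            waiting_in_fork_join += 1
            counted = True  # count each thread at most once
    return states, waiting_in_fork_join
```

A large number of threads in the second count is consistent with the symptom described in this article, but it is not proof on its own; the heap dump and the log entries above should be considered together.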
The upgrade coordinator service fails with an out-of-memory error because a large number of threads accumulate in the "WAITING" state and fill up the heap. This is caused by a known JDK issue (JDK-8330017: ForkJoinPool stops executing tasks due to ctl field Release Count (RC) overflow).
For the resolution and workaround, see "NSX is impacted by JDK-8330017: ForkJoinPool stops executing tasks due to ctl field Release Count (RC) overflow".