Application on NSX node <manager node> has crashed. The number of core files found is 1. Collect the Support Bundle including core dump files and contact VMware Support team.
A core dump file named proton_oom.hprof is present in /image/core on the manager node noted in the alert. To confirm this, run ls -la /image/core as the root user, or run get core-dump as the admin user.
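As a rough illustration of this check, the following Python sketch lists heap dump files in a core directory. The function name list_heap_dumps and the .hprof suffix filter are assumptions made for illustration; only the directory /image/core and the file name proton_oom.hprof come from this article. The sketch assumes the account running it can read the directory.

```python
from pathlib import Path


def list_heap_dumps(core_dir="/image/core"):
    """Return (name, size_in_bytes) for each .hprof file in core_dir.

    Hypothetical helper: an empty list means the directory is missing
    or holds no heap dump files.
    """
    directory = Path(core_dir)
    if not directory.is_dir():
        return []
    return sorted(
        (entry.name, entry.stat().st_size)
        for entry in directory.iterdir()
        if entry.is_file() and entry.suffix == ".hprof"
    )


if __name__ == "__main__":
    for name, size in list_heap_dumps():
        print(f"{name}\t{size} bytes")
```

A result that includes proton_oom.hprof matches the symptom described here; other core files in the same directory are ignored by this filter.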
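Check the status of the proton service on the manager node by running /etc/init.d/proton status. The Active line shows when the service last started:
# /etc/init.d/proton status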
● proton.service - proton: VMware NSX Proton API server
Loaded: loaded (/etc/init.d/proton; enabled; vendor preset: enabled)
Active: active (running) since Mon 2025-01-15 10:54:39 UTC; 1 days ago
Docs: man:systemd-sysv-generator(8)
Main PID: 5166 (wrapper)
Tasks: 1092 (limit: 57708)
Memory: 6.5G
CPU: 1d 1h 10min 57.132s
CGroup: /system.slice/proton.service
├─5166 /usr/tanuki/bin/./wrapper /usr/tanuki/bin/../conf/proton-tomcat-wrapper.conf wrapper.syslog.ident=proton wrapper.pidfile=/var/run/proton/proton.pi…
└─5222 /usr/lib/jvm/openjdk-java11-runtime-amd64/bin/java -Djava.util.logging.config.file=/opt/vmware/proton-tomcat/conf/logging.properties -Djava.util.l…
The proton-tomcat-wrapper.log file on the manager node noted in the alert contains entries similar to the following:
INFO | jvm 1 | 2025/01/08 21:12:38 | java.lang.OutOfMemoryError: Java heap space
STATUS | wrapper | 2025/01/08 21:12:38 | The JVM has run out of memory. Requesting thread dump.
STATUS | wrapper | 2025/01/08 21:12:38 | Dumping JVM state.
INFO | jvm 1 | 2025/01/08 21:12:38 | Dumping heap to /image/core/proton_oom.hprof ...
INFO | jvm 1 | 2025/01/08 21:13:18 | Heap dump file created [9470923368 bytes in 39.838 secs]
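The wrapper log entries above can be picked out programmatically. The sketch below is only an illustration: the function name summarize_oom and the dictionary keys are hypothetical, and the regular expressions are derived from the sample lines in this article, so other releases may word these messages differently.

```python
import re

# Patterns derived from the sample wrapper log lines in this article.
DUMP_START = re.compile(r"Dumping heap to (\S+) \.\.\.")
DUMP_DONE = re.compile(r"Heap dump file created \[(\d+) bytes in [\d.]+ secs\]")


def summarize_oom(lines):
    """Collect OutOfMemoryError evidence from wrapper log lines."""
    summary = {"oom_seen": False, "dump_path": None, "dump_bytes": None}
    for line in lines:
        if "java.lang.OutOfMemoryError" in line:
            summary["oom_seen"] = True
        started = DUMP_START.search(line)
        if started:
            summary["dump_path"] = started.group(1)
        done = DUMP_DONE.search(line)
        if done:
            summary["dump_bytes"] = int(done.group(1))
    return summary


if __name__ == "__main__":
    sample = [
        "INFO | jvm 1 | 2025/01/08 21:12:38 | java.lang.OutOfMemoryError: Java heap space",
        "INFO | jvm 1 | 2025/01/08 21:12:38 | Dumping heap to /image/core/proton_oom.hprof ...",
        "INFO | jvm 1 | 2025/01/08 21:13:18 | Heap dump file created [9470923368 bytes in 39.838 secs]",
    ]
    print(summarize_oom(sample))
```

In practice the lines would be read from a copy of proton-tomcat-wrapper.log, for example one extracted from a support bundle.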
The nsxapi.log file on the manager node noted in the alert contains entries similar to the following:
2025-01-15T10:54:35.511Z INFO IdfwCleaner AutoLogoutProcessor 2973510 FIREWALL [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Found active sessions 6472 need to be auto logged out, cut off time 1736909675497
2025-01-15T10:54:35.511Z INFO IdfwCleaner AutoLogoutProcessor 2973510 FIREWALL [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Auto logging out active sessions 1000
2025-01-15T10:54:50.427Z WARN IdfwCleaner ObjectsView 2973510 TXEnd[TX[1e66]] Aborted Exception
org.corfudb.runtime.exceptions.TransactionAbortedException: TX ABORT | Snapshot Time = Token(epoch=31, sequence=506475162) | Failed Transaction ID = 7eb9929f-####-####-####-fa04de3a1e66 | Offending Address = 5064##### | Conflict Key = 36091FEF######## | Conflict Stream = nsx$IdentityIpset | Cause = CONFLICT | Time = 14929 ms
at org.corfudb.runtime.view.StreamsView.append(StreamsView.java:180) ~[?:?]
at org.corfudb.runtime.view.StreamsView.append(StreamsView.java:233) ~[?:?]
at org.corfudb.runtime.view.StreamsView.append(StreamsView.java:244) ~[?:?]
at org.corfudb.runtime.object.transactions.OptimisticTransactionalContext.getConflictSetAndCommit(OptimisticTransactionalContext.java:223) ~[?:?]
at org.corfudb.runtime.object.transactions.WriteAfterWriteTransactionalContext.commitTransaction(WriteAfterWriteTransactionalContext.java:34) ~[?:?]
at org.corfudb.runtime.view.ObjectsView.TXEnd(ObjectsView.java:162) ~[?:?]
at org.corfudb.runtime.collections.TxnContext.commit(TxnContext.java:793) ~[?:?]
at com.vmware.nsx.persistence.UfoTxn.commit(UfoTxn.java:937) ~[?:?]
at com.vmware.nsx.management.container.dao.IdentifiableProxyObjectDao.commit_aroundBody0(IdentifiableProxyObjectDao.java:784) ~[?:?]
at com.vmware.nsx.management.container.dao.IdentifiableProxyObjectDao$AjcClosure1.run(IdentifiableProxyObjectDao.java:1) ~[?:?]
at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:149) ~[?:?]
at io.micrometer.core.aop.TimedAspect.processWithTimer(TimedAspect.java:119) ~[?:?]
at io.micrometer.core.aop.TimedAspect.ajc$inlineAccessMethod$io_micrometer_core_aop_TimedAspect$io_micrometer_core_aop_TimedAspect$processWithTimer(TimedAspect.java:1) ~[?:?]
at io.micrometer.core.aop.TimedAspect.timedMethod(TimedAspect.java:97) ~[?:?]
at com.vmware.nsx.management.container.dao.IdentifiableProxyObjectDao.commit(IdentifiableProxyObjectDao.java:781) ~[?:?]
at com.vmware.nsx.management.idfw.processor.AutoLogoutProcessor.autoLogout(AutoLogoutProcessor.java:59) ~[?:?]
at com.vmware.nsx.management.idfw.daemon.IdfwDbDaemon.autoLogoutOldActiveUserSessions(IdfwDbDaemon.java:144) ~[?:?]
at com.vmware.nsx.management.idfw.daemon.IdfwDbDaemon.run(IdfwDbDaemon.java:122) ~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
at java.util.concurrent.FutureTask.runAndReset(Unknown Source) ~[?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
at com.vmware.nsx.util.concurrent.Executors$MeteredRunnable.run(Executors.java:353) ~[nsx-util.jar:?]
at com.vmware.nsx.util.concurrent.Executors$MeteredRunnable.run(Executors.java:353) ~[nsx-util.jar:?]
at java.lang.Thread.run(Unknown Source) ~[?:?]
2025-01-15T10:54:50.427Z WARN IdfwCleaner IdentifiableProxyObjectDao 2973510 - [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="manager"] Received TransactionAbortedException from the Corfu client.
2025-01-15T10:54:50.428Z WARN IdfwCleaner IdentifiableProxyObjectDao 2973510 - [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="manager"] com.vmware.nsx.management.container.exceptions.ConcurrentUpdateException: STREAM_ID = 71f63950-####-####-####-e8f57d148ee4 | CONFLICT_VALUE = java.lang.Error: Unable to find the corresponding key | CONFLICT_KEY_HASH = 38936784655######## | CONFLICT_KEY = uuid {
left: 12350417914########
right: 105989396920########
}
| MAP_NAME = 71f63950-####-####-####-e8f57d148ee4 | TRANSACTION_ID = 7eb9929f-####-####-####-fa04de3a1e66 | OFFENDING_ADDRESS = 50647####
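To see how large the IDFW auto-logout backlog is and how often the cleanup transaction aborts, the nsxapi.log excerpt above can be summarized with a short script. This is a minimal sketch: idfw_cleanup_stats and its result keys are hypothetical names, and the message patterns are copied from the sample lines in this article. The article does not state what backlog size is considered a problem, so the sketch only reports counts.

```python
import re

# Message fragments copied from the sample nsxapi.log lines in this article.
FOUND = re.compile(r"Found active sessions (\d+) need to be auto logged out")
BATCH = re.compile(r"Auto logging out active sessions (\d+)")
TX_ABORT = "TransactionAbortedException: TX ABORT"


def idfw_cleanup_stats(lines):
    """Summarize IDFW auto-logout activity found in nsxapi.log lines."""
    stats = {"pending_sessions": [], "batch_sizes": [], "tx_aborts": 0}
    for line in lines:
        found = FOUND.search(line)
        if found:
            stats["pending_sessions"].append(int(found.group(1)))
        batch = BATCH.search(line)
        if batch:
            stats["batch_sizes"].append(int(batch.group(1)))
        if TX_ABORT in line:
            stats["tx_aborts"] += 1
    return stats


if __name__ == "__main__":
    sample = [
        "... Found active sessions 6472 need to be auto logged out, cut off time 1736909675497",
        "... Auto logging out active sessions 1000",
        "org.corfudb.runtime.exceptions.TransactionAbortedException: TX ABORT | Cause = CONFLICT",
    ]
    print(idfw_cleanup_stats(sample))
```

Repeated aborts alongside a pending session count that does not shrink would be consistent with the log pattern shown in this article.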
This issue affects VMware NSX 4.x releases earlier than 4.2.1.
The issue occurs because the process responsible for purging old IDFW login/logout events consumes too much memory, which exhausts the Java heap and causes the proton service to crash.
This issue is resolved in VMware NSX 4.2.1.0.
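When checking several managers, the affected range (4.x releases earlier than 4.2.1) can be expressed as a simple version comparison. The sketch below assumes that version strings are dotted numbers such as 4.1.2.3; the names parse_version, FIXED_IN, and is_affected are hypothetical, and how the version string is obtained from a manager is left out.

```python
FIXED_IN = (4, 2, 1)  # first release line with the fix, per this article


def parse_version(text):
    """Turn '4.1.2.3' into (4, 1, 2, 3); non-numeric parts raise ValueError."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(version_text):
    """Return True for 4.x versions earlier than 4.2.1."""
    version = parse_version(version_text)
    # Only 4.x is covered by this article; compare major.minor.patch only.
    return version[0] == 4 and version[:3] < FIXED_IN


if __name__ == "__main__":
    for candidate in ("4.1.2.3", "4.2.0.1", "4.2.1.0"):
        print(candidate, is_affected(candidate))
```

Versions outside 4.x return False here only because this article does not cover them, not because they are known to be unaffected.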