"Application on NSX node <node_id> has crashed" alarm triggers on NSX Manager.
/image/core:
-rw------- 1 uproxy uproxy 453M Feb 13 10:40 proxy_oom.hprof
In /var/log/proxy/proxy-tomcat-wrapper.log, the following logs may be observed:
INFO | jvm 1 | 2025/02/13 10:40:25 | "grpc-default-executor-735532" #1144511 daemon prio=5 os_prio=0 tid=0x00001b82248e0000 nid=0xd9f44 waiting on condition [0x##########ca4000]
INFO | jvm 1 | 2025/02/13 10:40:25 | java.lang.Thread.State: WAITING (parking)
INFO | jvm 1 | 2025/02/13 10:40:25 | at sun.misc.Unsafe.park(Native Method)
INFO | jvm 1 | 2025/02/13 10:40:25 | - parking to wait for <0x############e158> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
INFO | jvm 1 | 2025/02/13 10:40:25 | at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
INFO | jvm 1 | 2025/02/13 10:40:25 | at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
INFO | jvm 1 | 2025/02/13 10:40:31 | "Processing request ########-1d4f-4883-b4a3-############" #1132391 daemon prio=5 os_prio=0 tid=0x##########e3b000 nid=0x3f455d runnable [0x##########389000]
INFO | jvm 1 | 2025/02/13 10:40:31 | java.lang.Thread.State: RUNNABLE
INFO | jvm 1 | 2025/02/13 10:40:31 | at java.net.SocketInputStream.socketRead0(Native Method)
INFO | jvm 1 | 2025/02/13 10:40:31 | at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
...
INFO | jvm 1 | 2025/02/13 10:40:31 | at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:717)
INFO | jvm 1 | 2025/02/13 10:40:31 | at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:608)
INFO | jvm 1 | 2025/02/13 10:40:31 | at com.vmware.nsx.management.rp.security.oauth2.VidmTokenServices.initDiscoveryEndPoint(VidmTokenServices.java:234)
└─$ grep "Processing request" /var/log/proxy/proxy-tomcat-wrapper.log | wc -l
99
└─$ zgrep "grpc-default-" /var/log/proxy/proxy-tomcat-wrapper.log* | wc -l
12546
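The counts above can also be broken down per thread type in a single pass. The following is a minimal sketch, not a Broadcom-provided tool: the function name `count_thread_headers`, the `PREFIXES` tuple, and the regular expression are assumptions made for illustration, based only on the log format shown in the excerpts above. Unlike `grep | wc -l`, it counts only thread header lines (the lines that carry the quoted thread name), so its totals may differ from the grep output.

```python
import re
import sys
from collections import Counter

# Matches thread header lines of a JVM thread dump as they appear in
# proxy-tomcat-wrapper.log, for example:
#   INFO | jvm 1 | 2025/02/13 10:40:25 | "grpc-default-executor-735532" #1144511 daemon ...
# The pattern is an assumption derived from the excerpts in this article.
THREAD_HEADER = re.compile(r'\|\s*"(?P<name>[^"]+)"\s+#\d+')

# Thread-name prefixes of interest, taken from the log excerpts above.
PREFIXES = ("Processing request", "grpc-default-")


def count_thread_headers(lines):
    """Count thread-dump header lines per thread-name prefix."""
    counts = Counter()
    for line in lines:
        match = THREAD_HEADER.search(line)
        if not match:
            continue  # stack frames and other log lines are ignored
        name = match.group("name")
        for prefix in PREFIXES:
            if name.startswith(prefix):
                counts[prefix] += 1
                break
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python3 count_threads.py /var/log/proxy/proxy-tomcat-wrapper.log
    with open(sys.argv[1], errors="replace") as handle:
        for prefix, total in count_thread_headers(handle).most_common():
            print(f"{total:8d}  {prefix}")
```

The script reads plain-text logs only; rotated, compressed log files would need to be decompressed first (for example with `zcat`).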
VMware NSX
The root cause of this issue is a missing or slow response from the vIDM server (for example, due to a slow network or vIDM being busy). Authentication requests then accumulate on the NSX Manager, which exhausts the JVM memory of the proxy service and causes the proxy service to run out of memory.
In VMware NSX 4.2.1, available at Broadcom downloads, improvements were introduced to prevent the proxy service from running out of memory when vIDM responds slowly. Further improvements are planned for a future version of NSX.
If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.
Workaround
#systemctl restart envoy; systemctl --no-pager status envoy | grep Active
Active: active (running) since Mon 2025-03-17 12:52:37 UTC; 12ms ago
>get service http
>get cluster status
If you are contacting Broadcom support about this issue, please provide the following:
Handling Log Bundles for offline review with Broadcom support
If the steps here have not resolved the issue for you, you can refer to the following KB which can provide further troubleshooting steps: