- Access to the NSX-T manager GUI fails or is intermittent.
- REST API calls to the NSX-T manager may fail.
- A new NAT rule (SNAT or DNAT) was added with the address 0.0.0.0/0.
- CPU usage on the NSX-T manager is high.
- Running the command 'get cluster status' in the NSX-T manager CLI shows the HTTPS (reverse-proxy) and MANAGER (proton) services going down at random on different managers. For example:
Group Type: MANAGER
Group Status: DEGRADED
Members:
UUID FQDN IP STATUS
3a6c0842-####-####-####-##########6f manager01 192.168.110.51 UP
6dd22d42-####-####-####-##########47 manager02 192.168.110.52 UP
0b172d42-####-####-####-##########bb manager03 192.168.110.53 DOWN
Group Type: HTTPS
Group Status: DEGRADED
Members:
UUID FQDN IP STATUS
3a6c0842-####-####-####-##########6f manager01 192.168.110.51 UP
6dd22d42-####-####-####-##########47 manager02 192.168.110.52 UP
0b172d42-####-####-####-##########bb manager03 192.168.110.53 DOWN
A few minutes later, the DOWN status may move to a different manager:
Group Type: MANAGER
Group Status: DEGRADED
Members:
UUID FQDN IP STATUS
3a6c0842-####-####-####-##########6f manager01 192.168.110.51 UP
6dd22d42-####-####-####-##########47 manager02 192.168.110.52 DOWN
0b172d42-####-####-####-##########bb manager03 192.168.110.53 UP
Group Type: HTTPS
Group Status: DEGRADED
Members:
UUID FQDN IP STATUS
3a6c0842-####-####-####-##########6f manager01 192.168.110.51 UP
6dd22d42-####-####-####-##########47 manager02 192.168.110.52 DOWN
0b172d42-####-####-####-##########bb manager03 192.168.110.53 UP
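To track which members are flapping, output like the examples above can be captured as text and scanned for DOWN rows. The following is a minimal sketch in Python, not an NSX-T tool: it assumes the output has been saved as plain text in the layout shown above (a 'Group Type:' line followed by whitespace-separated UUID, FQDN, IP, and STATUS columns), and the function name find_down_members is hypothetical.

```python
def find_down_members(status_text):
    """Return {group_type: [(fqdn, ip), ...]} for members reported DOWN.

    Assumes the text layout shown in the examples above; this is an
    illustrative parser, not part of NSX-T.
    """
    down = {}
    group = None
    for line in status_text.splitlines():
        line = line.strip()
        if line.startswith("Group Type:"):
            # Remember which group the following member rows belong to.
            group = line.split(":", 1)[1].strip()
            continue
        fields = line.split()
        # Member rows have four columns: UUID, FQDN, IP, STATUS.
        if group and len(fields) == 4 and fields[3] == "DOWN":
            down.setdefault(group, []).append((fields[1], fields[2]))
    return down


sample = """\
Group Type: MANAGER
Group Status: DEGRADED
Members:
UUID FQDN IP STATUS
3a6c0842-####-####-####-##########6f manager01 192.168.110.51 UP
6dd22d42-####-####-####-##########47 manager02 192.168.110.52 UP
0b172d42-####-####-####-##########bb manager03 192.168.110.53 DOWN
Group Type: HTTPS
Group Status: DEGRADED
Members:
UUID FQDN IP STATUS
3a6c0842-####-####-####-##########6f manager01 192.168.110.51 UP
6dd22d42-####-####-####-##########47 manager02 192.168.110.52 UP
0b172d42-####-####-####-##########bb manager03 192.168.110.53 DOWN
"""

for group_type, members in find_down_members(sample).items():
    print(group_type, members)
```

Running this against captures taken a few minutes apart makes it easier to confirm that the DOWN member changes over time rather than staying on one manager.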
The log '/var/log/proton/proton-tomcat-wrapper.log' may contain entries that indicate a Java out-of-memory condition:
STATUS | wrapper | 2020/02/03 13:31:31 | The JVM has run out of memory. Requesting thread dump.
STATUS | wrapper | 2020/02/03 13:31:31 | Dumping JVM state.
STATUS | wrapper | 2020/02/03 13:31:31 | The JVM has run out of memory. Restarting JVM.
INFO | jvm 1 | 2020/02/03 13:31:31 | Dumping heap to /image/core/proton_oom.hprof ...
ERROR | wrapper | 2020/02/03 13:32:06 | Shutdown failed: Timed out waiting for signal from JVM.
STATUS | wrapper | 2020/02/03 13:32:06 | Dumping JVM state..
STATUS | wrapper | 2020/02/03 13:44:34 | The JVM has run out of memory. Restarting JVM.
INFO | jvm 2 | 2020/02/03 13:44:34 | Dumping heap to /image/core/proton_oom.hprof ...
INFO | jvm 2 | 2020/02/03 13:44:34 | Unable to create /image/core/proton_oom.hprof: File exists
ERROR | wrapper | 2020/02/03 13:45:08 | Shutdown failed: Timed out waiting for signal from JVM.
STATUS | wrapper | 2020/02/03 13:45:08 | Dumping JVM state.
ERROR | wrapper | 2020/02/03 13:45:13 | JVM did not exit on request, termination requested.
STATUS | wrapper | 2020/02/03 13:45:13 | JVM received a signal SIGKILL (9).
STATUS | wrapper | 2020/02/03 13:45:13 | JVM process is gone.
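To see how often the JVM is running out of memory, the wrapper log can be scanned for the out-of-memory message. The sketch below is illustrative only: it assumes a copy of the log in the pipe-separated layout shown above (level, source, timestamp, message), and the function name oom_timestamps is hypothetical.

```python
from datetime import datetime

# Message text taken from the log excerpt above.
OOM_MARKER = "The JVM has run out of memory"


def oom_timestamps(log_text):
    """Return the sorted, distinct timestamps of out-of-memory entries."""
    stamps = set()
    for line in log_text.splitlines():
        if OOM_MARKER not in line:
            continue
        # Assumed layout: LEVEL | source | timestamp | message
        parts = [p.strip() for p in line.split("|")]
        if len(parts) >= 4:
            stamps.add(datetime.strptime(parts[2], "%Y/%m/%d %H:%M:%S"))
    return sorted(stamps)


sample_log = """\
STATUS | wrapper | 2020/02/03 13:31:31 | The JVM has run out of memory. Requesting thread dump.
STATUS | wrapper | 2020/02/03 13:31:31 | Dumping JVM state.
STATUS | wrapper | 2020/02/03 13:31:31 | The JVM has run out of memory. Restarting JVM.
STATUS | wrapper | 2020/02/03 13:44:34 | The JVM has run out of memory. Restarting JVM.
"""

for stamp in oom_timestamps(sample_log):
    print(stamp.isoformat())
```

Several distinct timestamps within a short window, as in the excerpt above, suggest that the JVM is repeatedly exhausting memory and restarting rather than hitting a single isolated failure.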