Symptoms:
- The status of many cluster groups changes to DEGRADED at random.
- tanuki.log contains many entries similar to the following, which show that cbm periodically runs out of memory (OOM):
INFO | jvm 13 | 2020/09/03 03:15:40 | java.lang.OutOfMemoryError: Java heap space
STATUS | wrapper | 2020/09/03 03:15:40 | The JVM has run out of memory. Requesting thread dump.
STATUS | wrapper | 2020/09/03 03:15:40 | Dumping JVM state.
STATUS | wrapper | 2020/09/03 03:15:40 | The JVM has run out of memory. Restarting JVM.
INFO | jvm 13 | 2020/09/03 03:15:40 | Dumping heap to /image/core/cbm_oom.hprof ...
- corfu-compactor-audit.log contains an entry similar to the following, which shows that the Corfu compactor ran out of memory (OOM):
2020-09-01T05:15:58.364Z ERROR main FrameworkCorfuCompactor - - [nsx@6876 comp="nsx-manager" errorCode="MP1" level="ERROR" subcomp="corfu-compactor"] Checkpoint failed for framework data with namespace nsx-manager
java.lang.OutOfMemoryError: Java heap space
Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
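To check how often the OOM symptom occurs, the log lines can be scanned for the java.lang.OutOfMemoryError marker shown in the excerpts above. The following Python sketch is illustrative only and is not part of the product: the function name find_oom_timestamps is hypothetical, and the timestamp pattern is an assumption based on the "2020/09/03 03:15:40" format seen in the tanuki.log excerpt. Lines that report the error without a timestamp in that format, such as the stack-trace line in corfu-compactor-audit.log, are returned as an empty string.

```python
import re
from typing import Iterable, List

# Marker taken from the log excerpts above.
OOM_MARKER = "java.lang.OutOfMemoryError"

# Assumed timestamp format, based on the tanuki.log excerpt
# (for example "2020/09/03 03:15:40").
WRAPPER_TS = re.compile(r"\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}")


def find_oom_timestamps(lines: Iterable[str]) -> List[str]:
    """Return one entry per line that reports an OutOfMemoryError.

    Each entry is the timestamp found on that line, or an empty
    string if the line carries no timestamp in the assumed format.
    """
    hits = []
    for line in lines:
        if OOM_MARKER in line:
            match = WRAPPER_TS.search(line)
            hits.append(match.group(0) if match else "")
    return hits
```

The function accepts any iterable of strings, so an open file object for a copy of tanuki.log can be passed directly. Several timestamps spread over hours or days would be consistent with the periodic OOM described above.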