Some appliance components are not functioning properly. Component health: SEARCH:UNKNOWN, MANAGER:UNKNOWN, NODE_MGMT:UP, UI:UP. Error code: 101
get cluster status
ls -l /image/core/
total 1191400
-rw------- 1 nsx-cbm nsx-cbm 45579417 Mar 8 15:06 cbm_oom.hprof.gz
-rw------- 1 root root 230343252 Mar 8 20:17 compactor_oom.hprof.gz
-rw------- 1 corfu corfu 944060040 Mar 9 14:14 corfu_oom.hprof.gz
hs_err_pid####.log
(#### represents the PID of the process and will be different on your setup.)
/config/corfu/LAYOUT_CURRENT.ds
  "sequencers": [
    "192.168.1.131:9000",
    "192.168.1.133:9000",
    "192.168.1.132:9000"
  ],
  "segments": [
    {
      "replicationMode": "CHAIN_REPLICATION",
      "start": 0,
      "end": 40397089,
      "stripes": [
        { "logServers": [ "192.168.1.131:9000" ] }
      ]
    },
    {
      "replicationMode": "CHAIN_REPLICATION",
      "start": 40397089,
      "end": 40397196,
      "stripes": [
        { "logServers": [ "192.168.1.131:9000", "192.168.1.133:9000" ] }
      ]
    },
    {
      "replicationMode": "CHAIN_REPLICATION",
      "start": 40397196,
      "end": 40397804,
      "stripes": [
        { "logServers": [ "192.168.1.131:9000", "192.168.1.133:9000" ] }
      ]
    },
    {
      "replicationMode": "CHAIN_REPLICATION",
      "start": 40397804,
      "end": -1,
      "stripes": [
        { "logServers": [ "192.168.1.131:9000", "192.168.1.133:9000", "192.168.1.132:9000
Manager 192.168.1.133 is missing from replication:
"start": 0,
"end": 40397089,
Manager 192.168.1.132 is missing from replication:
"start": 40397089,
"end": 40397196,
and also from replication:
"start": 0,
"end": 40397089,
Manager 192.168.1.131 is the only one with a complete database.
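The segment-by-segment comparison above can be sketched in Python. This is a minimal illustration, not a supported tool: it assumes the layout file parses as JSON with the `segments`, `stripes`, and `logServers` fields shown in the excerpt, and the helper name `missing_log_servers` is hypothetical. The sample data is copied from the excerpt; the excerpt is truncated, so the enclosing braces and the closing of the last segment are assumptions.

```python
def missing_log_servers(layout):
    """Map (start, end) of each segment to the log servers absent from it.

    A server counts as absent when it appears in at least one segment of
    the layout but not in the segment being inspected.
    """
    all_servers = set()
    for segment in layout["segments"]:
        for stripe in segment["stripes"]:
            all_servers.update(stripe["logServers"])

    gaps = {}
    for segment in layout["segments"]:
        present = set()
        for stripe in segment["stripes"]:
            present.update(stripe["logServers"])
        absent = sorted(all_servers - present)
        if absent:
            gaps[(segment["start"], segment["end"])] = absent
    return gaps


def _segment(start, end, servers):
    # Helper that builds one segment in the shape shown in the excerpt.
    return {
        "replicationMode": "CHAIN_REPLICATION",
        "start": start,
        "end": end,
        "stripes": [{"logServers": servers}],
    }


A, B, C = "192.168.1.131:9000", "192.168.1.132:9000", "192.168.1.133:9000"

# Sample data from the excerpt. The last segment is truncated in the
# excerpt; listing exactly three servers there is an assumption.
layout = {
    "segments": [
        _segment(0, 40397089, [A]),
        _segment(40397089, 40397196, [A, C]),
        _segment(40397196, 40397804, [A, C]),
        _segment(40397804, -1, [A, C, B]),
    ]
}

for (start, end), absent in missing_log_servers(layout).items():
    print(f"segment {start}..{end}: missing {', '.join(absent)}")
```

A segment that lists only one log server is the case to look for: that node holds the only copy of that address range.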
In /var/log/corfu/corfu-compactor-audit.log, you see:
corfu-compactor-audit.9.log:2022-03-02T18:54:23.170Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 24113825 (exclusive).
corfu-compactor-audit.9.log:2022-03-02T19:09:22.956Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 24163751 (exclusive).
corfu-compactor-audit.log:2022-03-07T15:53:56.666Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 28012247 (exclusive).
...
corfu-compactor-audit.log:2022-03-09T12:57:25.040Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 28012247 (exclusive).
corfu-compactor-audit.log:2022-03-09T13:57:11.401Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 28012247 (exclusive).
corfu-compactor-audit.log:2022-03-09T14:42:42.964Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 28012247 (exclusive).
corfu-compactor-audit.log:2022-03-09T14:59:07.533Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 28012247 (exclusive).
corfu-compactor-audit.log:2022-03-09T15:15:43.943Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(4s), log address up to 28012247 (exclusive).
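In the trim messages above, the trim address stops advancing at 28012247. A minimal Python sketch for spotting such a stall in grep output follows. It assumes the message format shown in the excerpt; the function names `trim_addresses` and `longest_stall` are hypothetical, and the sample lines are copied from the excerpt.

```python
import re

# Timestamp and trim address from a "UFO Trim completed" audit line.
TRIM_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z)"
    r".*UFO Trim completed.*log address up to (\d+)"
)


def trim_addresses(lines):
    """Return (timestamp, address) for every trim-completed line."""
    entries = []
    for line in lines:
        match = TRIM_RE.search(line)
        if match:
            entries.append((match.group(1), int(match.group(2))))
    return entries


def longest_stall(entries):
    """Return (address, first_ts, last_ts, count) for the longest run of
    consecutive trims that report the same address, or None if empty."""
    best = None
    run_start = 0
    for i in range(1, len(entries) + 1):
        if i == len(entries) or entries[i][1] != entries[run_start][1]:
            run = (
                entries[run_start][1],
                entries[run_start][0],
                entries[i - 1][0],
                i - run_start,
            )
            if best is None or run[3] > best[3]:
                best = run
            run_start = i
    return best


SAMPLE = """
corfu-compactor-audit.9.log:2022-03-02T18:54:23.170Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 24113825 (exclusive).
corfu-compactor-audit.9.log:2022-03-02T19:09:22.956Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 24163751 (exclusive).
corfu-compactor-audit.log:2022-03-07T15:53:56.666Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(0s), log address up to 28012247 (exclusive).
corfu-compactor-audit.log:2022-03-09T15:15:43.943Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] UFO Trim completed, elapsed(4s), log address up to 28012247 (exclusive).
"""

entries = trim_addresses(SAMPLE.splitlines())
print(longest_stall(entries))
```

The input lines must be in chronological order for the run detection to be meaningful, so sort rotated log files by timestamp first if needed.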
2022-03-03T15:29:29.765Z INFO main CheckpointWriter - appendCheckpoint: completed checkpoint for 3c54c60e-####-####-####-f03724af9649, entries(621841), cpSize(213995471) bytes at snapshot Token(epoch=13, sequence=27948473) in 934074 ms
2022-03-03T16:57:20.892Z INFO main CheckpointWriter - appendCheckpoint: completed checkpoint for 3c54c60e-####-####-####-f03724af9649, entries(619376), cpSize(213145572) bytes at snapshot Token(epoch=13, sequence=28017532) in 4606838 ms
2022-03-03T17:33:09.152Z INFO main UfoCompactor - - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="corfu-compactor"] Starting checkpoint namespace: nsx, tableName: LoginLogoutEvent
2022-03-03T17:33:09.152Z INFO main MultiCheckpointWriter - appendCheckpoints: appending checkpoints for 1 maps
2022-03-03T17:33:09.164Z INFO main CheckpointWriter - appendCheckpoint: Started checkpoint for 3c54c60e-####-####-####-f03724af9649 at snapshot Token(epoch=13, sequence=28250089)
......
Aborting due to java.lang.OutOfMemoryError: Java heap space
......
Aborted (core dumped)
2022-03-03T17:47:22.761Z INFO Runner - Failed to run compactor tool: Command 'MALLOC_TRIM_THRESHOLD_=1310720 nice -n -10 java -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xloggc:/var/log/corfu/compactor-gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1M -XX:+UseStringDeduplication -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/image/core/compactor_oom.hprof -XX:OnOutOfMemoryError="gzip -f /image/core/compactor_oom.hprof" -XX:+CrashOnOutOfMemoryError -Xms963m -Xmx963m -Djava.io.tmpdir=/image/corfu-tools/temp -Djdk.nio.maxCachedBufferSize=1048576 -Dio.netty.recycler.maxCapacityPerThread=0 -DlogFilePrefix=/var/log/corfu/corfu-compactor-audit -Dlog4j.configurationFile=/opt/vmware/ufo-tools/corfu-compactor-log4j2.xml -Dcorfu-property-file-path=/opt/vmware/cbm/etc/ufo-factory.properties -cp "/opt/vmware/ufo-tools/*" com.vmware.nsx.platform.ufo.UfoCompactorMain -hostname 10.1.1.132 -hostname 10.1.1.133 -hostname 10.1.1.131 -port 9000 -trim -useDistributedLock -lockCorfuHostname 10.1.1.131 -lockCorfuPort 9000 -bulkReadSize 50' returned non-zero exit status 134.
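The CheckpointWriter lines in the excerpt above report the entry count, checkpoint size, and duration for each completed checkpoint. The following Python sketch extracts those fields so successive checkpoints of the same table can be compared. It assumes the message format shown in the excerpt; the function name `parse_checkpoints` is hypothetical, and the sample lines are copied from the excerpt.

```python
import re

# Fields from an "appendCheckpoint: completed checkpoint" line.
CHECKPOINT_RE = re.compile(
    r"completed checkpoint for (\S+), entries\((\d+)\), "
    r"cpSize\((\d+)\) bytes at snapshot "
    r"Token\(epoch=(\d+), sequence=(\d+)\) in (\d+) ms"
)


def parse_checkpoints(lines):
    """Return one dict per completed-checkpoint line."""
    results = []
    for line in lines:
        match = CHECKPOINT_RE.search(line)
        if not match:
            continue
        table_id, entries, size, epoch, sequence, millis = match.groups()
        results.append(
            {
                "table_id": table_id,
                "entries": int(entries),
                "bytes": int(size),
                "epoch": int(epoch),
                "sequence": int(sequence),
                "ms": int(millis),
            }
        )
    return results


SAMPLE = """
2022-03-03T15:29:29.765Z INFO main CheckpointWriter - appendCheckpoint: completed checkpoint for 3c54c60e-####-####-####-f03724af9649, entries(621841), cpSize(213995471) bytes at snapshot Token(epoch=13, sequence=27948473) in 934074 ms
2022-03-03T16:57:20.892Z INFO main CheckpointWriter - appendCheckpoint: completed checkpoint for 3c54c60e-####-####-####-f03724af9649, entries(619376), cpSize(213145572) bytes at snapshot Token(epoch=13, sequence=28017532) in 4606838 ms
"""

checkpoints = parse_checkpoints(SAMPLE.splitlines())
for cp in checkpoints:
    minutes = cp["ms"] / 60000
    mib = cp["bytes"] / 2**20
    print(f'{cp["entries"]} entries, {mib:.1f} MiB, {minutes:.1f} min')
```

In the two sample lines the entry count and checkpoint size are nearly identical, while the duration rises from 934074 ms to 4606838 ms.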
grep "| java.lang.OutOfMemoryError: Java heap space" tanuki.log | head
INFO | jvm 1 | 2022/03/04 20:45:02 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 2 | 2022/03/07 03:01:18 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 3 | 2022/03/07 12:02:34 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 4 | 2022/03/07 18:21:28 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 5 | 2022/03/07 19:44:54 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 6 | 2022/03/07 20:47:07 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 7 | 2022/03/08 03:36:20 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 8 | 2022/03/08 05:35:02 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 9 | 2022/03/08 08:10:50 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 10 | 2022/03/08 11:18:34 | java.lang.OutOfMemoryError: Java heap space
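To see how often the JVM ran out of heap, the grep output above can be grouped by day. This is a minimal Python sketch that assumes the pipe-delimited tanuki.log format shown in the excerpt; the function name `oom_per_day` is hypothetical, and the sample lines are copied from the excerpt.

```python
from collections import Counter


def oom_per_day(lines):
    """Count OutOfMemoryError lines per calendar day.

    Expects lines shaped like:
    INFO | jvm 1 | 2022/03/04 20:45:02 | java.lang.OutOfMemoryError: ...
    """
    counts = Counter()
    for line in lines:
        if "java.lang.OutOfMemoryError" not in line:
            continue
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4:
            continue  # not in the expected four-column format
        stamp = fields[2].split()
        if not stamp:
            continue  # timestamp column is empty
        counts[stamp[0]] += 1
    return counts


SAMPLE = """
INFO | jvm 1 | 2022/03/04 20:45:02 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 2 | 2022/03/07 03:01:18 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 3 | 2022/03/07 12:02:34 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 4 | 2022/03/07 18:21:28 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 5 | 2022/03/07 19:44:54 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 6 | 2022/03/07 20:47:07 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 7 | 2022/03/08 03:36:20 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 8 | 2022/03/08 05:35:02 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 9 | 2022/03/08 08:10:50 | java.lang.OutOfMemoryError: Java heap space
INFO | jvm 10 | 2022/03/08 11:18:34 | java.lang.OutOfMemoryError: Java heap space
"""

for day, count in sorted(oom_per_day(SAMPLE.splitlines()).items()):
    print(day, count)
```

Because the sample was produced with `head`, it covers only the first ten matches; run the count against the full file for a complete picture.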
VMware NSX-T Data Center 3.x
You are using IDFW and have log scraping configured with AD and Aria Operations for Logs.
This issue is resolved in VMware NSX-T Data Center 3.2.2.0
This issue is resolved in VMware NSX-T Data Center 3.2.3.0
This issue is resolved in VMware NSX 4.0.1.1
Workaround:
It is possible to increase the intensity of the IDFW cleaner so that it runs more often and cleans up these entries, which reduces the retention time of the events in the Corfu table.
If you believe you have encountered this issue, please open a support request and reference this KB.