/var/log/vmware/loginsight/cassandra.log:
Caused by: org.apache.cassandra.io.compress.CorruptBlockException: (/storage/core/loginsight/cidata/cassandra/data/machine_learning/spock_cluster_counts-adb55650547611edbb2347b6512511a6/nb-66186-big-Data.db): corruption detected, chunk at 293732 of length 29154.
In /usr/lib/loginsight, the java_pid####.hprof file is the largest file in the directory.
Aria Operations for Logs 8.x
VMware vRealize Log Insight 8.x
The hprof file is generated when the service crashes. This tends to happen when the cluster is running low on live storage and no data archiving is enabled, or when the cluster is undersized.
Workaround:
Part 1. Issue due to /var/log and /var/log/audit being the largest directories.
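To confirm that this part applies, it can help to rank the entries under /var/log by disk usage first. The following is a minimal sketch, not a product tool: the function name largest_entries and its arguments are hypothetical, and it relies only on the standard du, sort, and head utilities.

```shell
#!/bin/bash
# Hypothetical helper: print the N largest entries directly under a directory.
# Sizes are reported in KiB; -x keeps du on a single filesystem.
largest_entries() {
    local dir="${1:-/var/log}"
    local n="${2:-5}"
    du -xsk "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}
```

For example, largest_entries /var/log 5 prints the five largest files or directories under /var/log, largest first.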
Boot into Single User mode to clear the filled log files, and configure log rotation.
Notes:
Note: The virtual appliance starts in single-user mode.
rm /var/log/audit/audit.log
rm /var/log/auth.log*
Note: Skip this step on vRealize Log Insight 8.4 and later.
Note: Skip this step on vRealize Log Insight 8.1 and later.
Note: Skip this step on vRealize Log Insight 8.6 and later.
/var/log/auth.log {
    daily
    missingok
    rotate 5
    compress
    delaycompress
    notifempty
    create 640 root root
}
Notes:
if [[ -f /var/log/auth.log && ! -s /var/log/auth.log ]]; then
    systemctl restart rsyslog
fi
Example: After editing, the file should look similar to the following.
#!/bin/sh
/usr/sbin/logrotate /etc/logrotate.conf
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
if [[ -f /var/log/auth.log && ! -s /var/log/auth.log ]]; then
    systemctl restart rsyslog
fi
exit $EXITVALUE
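The condition added to the script is true only when auth.log exists and is zero bytes, presumably the state left behind when rsyslog is still writing to the rotated file. The sketch below isolates that test so it can be tried by hand. The function name auth_log_is_empty is hypothetical and not part of the appliance.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only when the file exists (-f) and is empty
# (! -s), which is the same test the script applies to /var/log/auth.log.
auth_log_is_empty() {
    local log="${1:-/var/log/auth.log}"
    [[ -f "$log" && ! -s "$log" ]]
}
```

Running auth_log_is_empty && echo "rsyslog restart would trigger" shows whether the restart branch would fire at that moment. Separately, logrotate -d /etc/logrotate.conf runs logrotate in debug mode, which reports what would be rotated without changing any files.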
Part 2. Issue due to the hprof file.
1. SSH into the node as root.
2. Go to /usr/lib/loginsight and remove the hprof file by running the following command:
rm java_pid####.hprof
3. Repeat the above step on any other nodes that have a 100% full root partition.
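Before removing anything in step 2, it may be useful to see which heap dumps exist and how large they are. The following is a minimal sketch under stated assumptions: the function name list_hprof is hypothetical, and the default directory and the java_pid*.hprof pattern are taken from the steps above.

```shell
#!/bin/bash
# Hypothetical helper: list java_pid*.hprof files in a directory with their
# size in bytes, largest first. Prints nothing if no heap dumps are present.
list_hprof() {
    local dir="${1:-/usr/lib/loginsight}"
    local f
    for f in "$dir"/java_pid*.hprof; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue
        printf '%s\t%s\n' "$(wc -c < "$f")" "$f"
    done | sort -rn
}
```

Running list_hprof with no arguments checks /usr/lib/loginsight. Running df -h / before and after the removal shows how much space on the root partition was reclaimed.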
Aria Operations for Logs (formerly vRealize Log Insight) 8.6 and later contains a fix for the log rotation issues. However, this issue may still occur because of excessive logins from network and vulnerability scanners.