Error messages similar to the following appear in the /storage/core/loginsight/var/runtime.log file when starting the Log Insight daemon service:
apache-cassandra-3.11.2/bin/cassandra, -f, -R]] java.io.FileNotFoundException: /storage/core/loginsight/cidata/cassandra/data/logdb/alerts-03eedbd481f632a4ab0c04e3c44c041b/.enable_index/mc-37-big-CompressionInfo.db (Too many open files)]
You may also see nodes showing as disconnected or flapping in the UI, and the Cassandra logs contain a large number of entries like: Opening /storage/core/loginsight/cidata/cassandra/data/logdb/******************/nb-******-big
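To confirm the symptom quickly, the log can be searched for the error string. The following is a minimal sketch: the function name count_fd_errors is a hypothetical helper, not part of the product, and the default path is the runtime.log location mentioned above.

```shell
# Hypothetical helper (not shipped with the product): count how many lines
# in a log file contain the "Too many open files" error.
# Defaults to the runtime.log path from this article when no file is given.
count_fd_errors() {
    grep -c "Too many open files" "${1:-/storage/core/loginsight/var/runtime.log}"
}

# Example usage on a node:
#   count_fd_errors
#   count_fd_errors /path/to/another.log
```

A non-zero count suggests the node is hitting the open file limit described in this article.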
Aria Operations for Logs 8.12 and later
Cassandra cannot start normally and needs extra RAM to start.
1. Take a snapshot of all nodes in the cluster without memory and without quiesce
2. Open an SSH session as root to all nodes in the cluster
3. Stop the Log Insight daemon service by running the command:
systemctl stop loginsight
4. Make sure watchdog is not running
ps aux | grep loginsight-watchdog
root 3677 0.0 0.0 4420 732 pts/0 S+ 05:33 0:00 grep --color=auto loginsight-watchdog
If watchdog is still running you can run the command:
killall -9 loginsight-watchdog
Or you can kill the process by process ID with the command:
kill -9 <processid>
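5. Increase the open file limit. The new limit applies only to the current shell session, so run the command in step 6 from the same session: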
ulimit -n 100000
6. Force start Cassandra
/usr/lib/loginsight/application/sbin/li-cassandra.sh --startnow --force
7. To check the status of Cassandra run the command
/usr/lib/loginsight/application/lib/apache-cassandra-x/bin/nodetool-no-pass status
8. If all nodes show Cassandra as up (UN status), you can run a repair on all nodes. If any node is in the DN state, the repair will not complete successfully. Make sure each node is up and running before proceeding to the next one.
/usr/lib/loginsight/application/lib/apache-cassandra-x/bin/nodetool-no-pass flush
/usr/lib/loginsight/application/lib/apache-cassandra-x/bin/nodetool-no-pass repair
9. Once the repair is complete, stop Cassandra and start the Log Insight daemon on each node:
/usr/lib/loginsight/application/sbin/li-cassandra.sh --stopnow --force
systemctl start loginsight
10. Repeat the process for each node in the cluster
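The checks in steps 5 through 8 can be sketched as two small shell helpers. This is an illustrative sketch under stated assumptions, not part of the product: the function names fd_usage and count_down_nodes are hypothetical, fd_usage assumes a Linux /proc filesystem, and count_down_nodes assumes the usual nodetool status layout in which each node line starts with a two-letter state such as UN or DN.

```shell
# Hypothetical helper: print the number of open file descriptors and the
# soft "Max open files" limit for a given process ID (Linux /proc only).
fd_usage() {
    pid="$1"
    open_fds=$(ls "/proc/$pid/fd" | wc -l)
    # In /proc/<pid>/limits the 4th field of the "Max open files" row
    # is the soft limit and the 5th is the hard limit.
    soft_limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "pid=$pid open_fds=$open_fds soft_limit=$soft_limit"
}

# Hypothetical helper: read `nodetool status` output on stdin and print how
# many nodes are reported in a Down state (DN, DL, DJ, DM).
count_down_nodes() {
    awk '$1 ~ /^D[NLJM]$/ {n++} END {print n+0}'
}

# Example usage on a node (the PID lookup via pgrep is an assumption):
#   fd_usage "$(pgrep -f cassandra | head -n 1)"
#   /usr/lib/loginsight/application/lib/apache-cassandra-x/bin/nodetool-no-pass status | count_down_nodes
```

If count_down_nodes prints anything other than 0, at least one node is still down and the repair in step 8 should wait.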