UI is inaccessible as cassandra is down - Aria Operations for Logs

Article ID: 374952

Updated On:

Products

VMware Aria Suite

Issue/Introduction

  • Nodes show as disconnected or node connectivity is flapping in the UI.

  • Error messages similar to the following appear in the /storage/core/loginsight/var/runtime.log file when starting the Log Insight daemon service:

apache-cassandra-3.11.2/bin/cassandra, -f, -R]] java.io.FileNotFoundException: /storage/core/loginsight/cidata/cassandra/data/logdb/alerts-################################/.enable_index/mc-37-big-CompressionInfo.db (Too many open files)]
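
To confirm that a node is hitting this condition, the log can be searched for the error signature. The following is a minimal sketch, assuming the runtime.log path quoted above; the function name count_fd_errors is hypothetical and not part of the product.

```shell
# Hypothetical helper: count lines containing "Too many open files" in a log.
# The default path is the runtime.log location quoted in this article;
# pass a different path as the first argument if needed.
count_fd_errors() {
    log=${1:-/storage/core/loginsight/var/runtime.log}
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function from failing then.
    grep -c 'Too many open files' "$log" || true
}
```

Running count_fd_errors with no arguments checks the default log. A non-zero count suggests the open file limit is being reached.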

 

Environment

Aria Operations for Logs 8.12 and later 

Resolution

1. Take a snapshot of all nodes in the cluster, as described in "How to take a Snapshot of VMware Aria Operations for Logs".

2. Open an SSH session as root on every node in the cluster.

3. Stop the Log Insight daemon service on all nodes by running the command:

systemctl stop loginsight

4. Make sure the watchdog is not running on any node:

ps aux | grep loginsight-watchdog

root 3677 0.0 0.0 4420 732 pts/0 S+ 05:33 0:00 grep --color=auto loginsight-watchdog

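If the only result is the grep process itself, as in the example above, the watchdog is not running. If the watchdog is still running, you can stop it with the command:

killall -9 loginsight-watchdog

Or you can kill the process by process ID with the command: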

kill -9 <processid>

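5. Increase the open file limit on all nodes. The limit set by ulimit applies only to the current shell session and the processes started from it, so run the next step from the same session: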

ulimit -n 100000

6. Force start Cassandra on all nodes:

/usr/lib/loginsight/application/sbin/li-cassandra.sh --startnow --force

7. Check the status of Cassandra on all nodes by running the command:

nodetool-no-pass status

8. If all nodes show Cassandra as up (UN status), run a flush and a repair on each node. If any node is in the DN state, the repair will not complete successfully. Make sure each node is up and running before proceeding to the next one:

nodetool-no-pass flush

nodetool-no-pass repair

9. Once the repair is complete, stop Cassandra and start the Aria Operations for Logs daemon on each node:

/usr/lib/loginsight/application/sbin/li-cassandra.sh --stopnow --force

systemctl start loginsight
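
Step 8 depends on every node reporting UN before the repair starts. The following is a minimal sketch of that check, assuming the usual nodetool status layout in which each node line begins with a two-character state code (such as UN or DN) followed by the node address. The function name all_nodes_un is hypothetical and not part of the product.

```shell
# Hypothetical helper: read the output of `nodetool-no-pass status` on stdin
# and succeed only if at least one node line is present and every node is UN.
# Assumption: node lines start with a two-character state code whose first
# character is U, D or ? and whose second is N, L, J or M, followed by the
# node address. Header and separator lines do not match and are ignored.
all_nodes_un() {
    awk '
        $1 ~ /^[UD?][NLJM]$/ {
            total++
            if ($1 != "UN") {
                bad++
                print "not UN: " $2 " (" $1 ")"
            }
        }
        END {
            if (total > 0 && bad == 0) exit 0
            exit 1
        }
    '
}
```

One possible use is to gate the repair on the check, for example: nodetool-no-pass status | all_nodes_un && nodetool-no-pass repair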