After restarting the cluster, Cassandra will not start on nodes other than the primary node

Article ID: 421797

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

After restarting your Aria Operations for Logs cluster, Cassandra starts only on the primary node, leaving the web UI inaccessible.

  • Running the command nodetool-no-pass status shows only the primary node as up (UN status in the first column of the output)
  • Following the article "FAILED: Unable to get user data. Possible Cassandra is down" - Aria Operations for Logs does not resolve the problem
  • Updating the certificate by following the article "SSL certificates are expired for Aria Operations for Logs (Formerly Log Insight)" does not resolve the problem
  • /storage/core/loginsight/var/cassandra.log on the primary node shows errors similar to:

    ERROR [main] 2025-09-11T13:04:49,477 CassandraDaemon.java:900 - Port already in use: 7199; nested exception is:
            java.net.BindException: Address already in use (Bind failed)
    java.net.BindException: Address already in use (Bind failed)
            at java.net.PlainSocketImpl.socketBind(Native Method) ~[?:?]
            at java.net.AbstractPlainSocketImpl.bind(Unknown Source) ~[?:?]
            at java.net.ServerSocket.bind(Unknown Source) ~[?:?]
            at java.net.ServerSocket.<init>(Unknown Source) ~[?:?]
            at javax.net.DefaultServerSocketFactory.createServerSocket(Unknown Source) ~[?:?]
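
The first symptom above can be checked mechanically. The following is a minimal shell sketch, assuming that nodetool-no-pass status prints the same per-node lines as the standard Cassandra nodetool status command (a two-letter state such as UN or DN in the first column, followed by the node address). The function name nodes_not_un is hypothetical and is not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: reads `nodetool-no-pass status` output on stdin and
# prints the address of every node whose state is not UN (Up/Normal).
# Assumes the standard nodetool layout: state in column 1, address in column 2.
nodes_not_un() {
    awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { print $2 }'
}

# Example on a node:
#   nodetool-no-pass status | nodes_not_un
```

If the function prints nothing, every listed node reports UN. Any address it prints belongs to a node that is down or not in the Normal state.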

Environment

Aria Operations for Logs 8.18.x

Cause

An orphaned Cassandra process is holding port 7199 (Cassandra's default JMX port) and is not stopped by systemctl stop loginsight or /usr/lib/loginsight/application/sbin/li-cassandra.sh --stopnow --force.
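
To confirm which process is holding the port, the listener on 7199 can be looked up directly. This is a minimal sketch, assuming the ss utility (from iproute2) is available on the node and is run as root so that process details are shown. The function name pids_on_port is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -ltnp` output on stdin and prints the PID of
# every process listening on the given TCP port.
# Assumes the usual ss layout: state in column 1, local address:port in
# column 4, and a process column such as users:(("java",pid=3312,fd=87)).
pids_on_port() {
    port="$1"
    awk -v port="$port" '
        $1 == "LISTEN" && $4 ~ (":" port "$") {
            s = $0
            # Extract every pid=<number> occurrence on the matching line.
            while (match(s, /pid=[0-9]+/)) {
                print substr(s, RSTART + 4, RLENGTH - 4)
                s = substr(s, RSTART + RLENGTH)
            }
        }'
}

# Example on a node (as root):
#   ss -ltnp | pids_on_port 7199
```

A PID printed while the loginsight service is stopped would point to the orphaned process described above.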

Resolution

To find and stop the orphaned Cassandra process so that the proper Cassandra process can start:

  1. Log in to the primary Aria Operations for Logs node and run systemctl stop loginsight
  2. Run /usr/lib/loginsight/application/sbin/li-cassandra.sh --stopnow --force
  3. Run ps aux | grep java
  4. Note the process ID (PID) of any java process still running, ignoring the line for the grep java command itself. The PID is the number after root in the output, for example:
    root      3312
  5. Run kill -9 <PID> for each java process still running (there is likely only one).
  6. Run systemctl start loginsight
  7. Repeat steps 1-6 on each of the other nodes, one node at a time. After finishing a node, run nodetool-no-pass status and confirm that the nodes already restarted show the UN status before proceeding to the next node.
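
Steps 1-6 for a single node can be sketched as a small shell script. This is an illustration under stated assumptions, not a supported tool: the function names java_pids and restart_node are hypothetical, and the PID filter assumes the leftover process was launched from an executable named java, which is narrower than the manual ps aux | grep java check. Nothing runs unless the script is called with --run as root.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps aux` output on stdin and prints the PID
# (column 2) of every process whose executable (column 11) is named java.
# Matching on the executable skips the grep/awk commands themselves.
java_pids() {
    awk 'NR > 1 && $11 ~ /(^|\/)java$/ { print $2 }'
}

# Sketch of steps 1-6 for ONE node. Run on one node at a time.
restart_node() {
    systemctl stop loginsight
    /usr/lib/loginsight/application/sbin/li-cassandra.sh --stopnow --force
    for pid in $(ps aux | java_pids); do
        echo "Killing leftover java process $pid"
        kill -9 "$pid"
    done
    systemctl start loginsight
}

# Only act when explicitly requested, so sourcing the file is harmless.
if [ "${1:-}" = "--run" ]; then
    restart_node
fi
```

After each node, nodetool-no-pass status should still be checked by hand as described in step 7 before moving on to the next node.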