Aria Operations for Logs UI is intermittently inaccessible through the load balancer URL

Article ID: 434250

Products

VCF Operations

Issue/Introduction

  • The Aria Operations for Logs UI is intermittently inaccessible when signing in through the load balancer URL.
  • Under Management → Cluster in the Aria Operations for Logs UI, the uptime shows that one or more nodes are restarting, and their status intermittently shows as disconnected.
  • The loginsight service on one or more nodes flaps between active (running) and inactive (dead) when checked from an SSH session as root with the following command:
    systemctl status loginsight
  • Running df -h in an SSH session as root on the affected nodes shows the root partition above 94% used.
  • Tracing the disk usage from an SSH session as root shows that the space is consumed by Cassandra hints:
    1. Change to the root directory:
      cd /
    2. List the largest files and folders:
      du -hscx * 2>/dev/null | sort -h
    3. Repeat the du command inside the largest directory at each level. The largest folder is found to be /usr/lib/loginsight/application/lib/apache-cassandra-*/data/hints/
    4. List the contents of that folder, which shows a large number of .hints and .crc32 files:
      ls -lthr /usr/lib/loginsight/application/lib/apache-cassandra-*/data/hints/
  • /storage/core/loginsight/var/cassandra.log shows errors similar to:
    WARN  [main] 2026-03-09T08:17:16,471 SigarLibrary.java:172 - Cassandra server running in degraded mode. Is swap disabled? : false,  Address space adequate? : true,  nofile limit adequate? : true, nproc limit adequate? : true
    INFO  [main] 2026-03-09T08:17:17,305 NativeLibraryLoader.java:213 - /tmp/libnetty_transport_native_epoll_x86_6410101435126854420491.so exists but cannot be executed even when execute permissions set; check volume for "noexec" flag; use -Dio.netty.native.workdir=[path] to set native working directory separately.
    WARN  [main] 2026-03-09T08:17:17,306 NativeTransportService.java:166 - epoll not available
    java.lang.UnsatisfiedLinkError: /tmp/libnetty_transport_native_epoll_x86_6410101435126854420491.so: /tmp/libnetty_transport_native_epoll_x86_6410101435126854420491.so: failed to map segment from shared object
    
  • /storage/core/loginsight/var/cassandra.log is flooded with entries similar to the following:
    INFO  [SSTableBatchOpen:3] 2026-03-09T13:39:00,735 SSTableReaderBuilder.java:354 - Opening /storage/core/loginsight/cidata/cassandra/data/logdb/vimevent_context/nb-#####-big (0.435KiB)
    
  • /storage/core/loginsight/var/runtime.log shows the following error: 
    [2026-03-09 13:49:43.899+0000] ["s4-admin-1"/###.###.###.### WARN] [com.datastax.oss.driver.internal.core.control.ControlConnection] [[s4] Error connecting to Node(endPoint=###.###.###.###:9042, hostId=null, hashCode=#######), trying next node (AnnotatedConnectException: Connection refused: /###.###.###.###:9042)]
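
The disk-related symptoms above can be checked in one pass on a node. The following is a minimal sketch and not part of the official procedure: the function names root_usage_pct, count_hint_files, and report_hints are hypothetical helpers introduced for illustration, and the hints path is the one quoted in the symptoms above.

```shell
#!/bin/sh
# Illustrative sketch only; the function names are hypothetical helpers.

# Print the used percentage of the filesystem holding the given path (default /).
root_usage_pct() {
    # -P forces one line per filesystem; column 5 is the usage, such as "95%".
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Count the .hints and .crc32 files under the given directory.
count_hint_files() {
    find "$1" -type f \( -name '*.hints' -o -name '*.crc32' \) 2>/dev/null \
        | wc -l | tr -d ' '
}

# Report the file count for each Cassandra hints directory present on this node.
report_hints() {
    for dir in /usr/lib/loginsight/application/lib/apache-cassandra-*/data/hints; do
        [ -d "$dir" ] || continue   # skip the unexpanded pattern if nothing matches
        echo "$dir: $(count_hint_files "$dir") hint/crc32 files"
    done
}

echo "Root partition usage: $(root_usage_pct /)%"
report_hints
```

A usage figure above 94% together with a large file count in the hints directory matches the condition described in this article.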

Environment

Aria Operations for Logs 8.18.x

Resolution

Note: Take a snapshot of all nodes in the cluster before applying the steps below, as described in the KB article How to take a Snapshot of Operations for Logs.
  1. Open an SSH session as root on every node in the cluster.

  2. Stop the loginsight service on all nodes by running the command:

    systemctl stop loginsight
  3. Make sure the watchdog is not running (on all nodes). The grep process itself appears in the output and can be ignored:

    ps aux | grep loginsight-watchdog
    1. If watchdog is still running you can run the command:

      killall -9 loginsight-watchdog
    2. Or you can kill the process by its process ID with the command:

      kill -9 <processid>
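  4. Increase the open file limit (on all nodes). The limit applies only to the current shell session and the processes started from it, so run the remaining commands from the same session: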

    ulimit -n 100000
  5. Remove the .hints and .crc32 files on the affected nodes:
    rm -rf /usr/lib/loginsight/application/lib/apache-cassandra-*/data/hints/*
    • Note: If the command fails with the message "Argument list too long", delete the files as follows:
      1. Change to the Cassandra hints directory:
        cd /usr/lib/loginsight/application/lib/apache-cassandra-*/data/hints/
      2. Run the following command to delete the hints files:
        find . -name '*.hints' | xargs rm -f
      3. Run the following command to delete the crc32 files:
        find . -name '*.crc32' | xargs rm -f
  6. Force start the Cassandra service on all nodes: 
    /usr/lib/loginsight/application/sbin/li-cassandra.sh --startnow --force
  7. On every node, verify that the Cassandra status is UN (Up/Normal) for all nodes:
    nodetool-no-pass status
  8. If every node shows Cassandra as UN, run flush and repair on all nodes, one node at a time. The repair will not complete successfully if any node is in the DN state, so make sure each node is up and running before proceeding to the next one:

    1. To flush Cassandra run the command: 

      nodetool-no-pass flush
    2. To repair Cassandra run the command: 

      nodetool-no-pass repair
  9. Once the repair has finished, stop Cassandra on each node:
    /usr/lib/loginsight/application/sbin/li-cassandra.sh --stopnow --force
  10. Start Loginsight Services: 
    systemctl start loginsight
  11. Monitor the cluster to confirm that the nodes are no longer flapping.
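
The UN check in steps 7 and 8 can be automated so that flush and repair are only started when no node is down. The following is a minimal sketch under stated assumptions: all_nodes_un is a hypothetical helper name, and the parsing assumes the usual nodetool status layout in which each node line begins with a two-letter state code such as UN or DN.

```shell
#!/bin/sh
# Illustrative sketch only; all_nodes_un is a hypothetical helper.

# Read `nodetool status` output on stdin. Succeed only if at least one node
# line is present and every node line starts with the state code UN.
all_nodes_un() {
    awk '
        # Node lines start with U or D (up/down) followed by N, L, J, or M.
        /^[UD][NLJM][ \t]/ { total++; if ($1 != "UN") bad++ }
        END { if (total > 0 && bad == 0) exit 0; exit 1 }
    '
}

# Run the check only where the wrapper used in this article is available.
if command -v nodetool-no-pass >/dev/null 2>&1; then
    if nodetool-no-pass status | all_nodes_un; then
        echo "All nodes are UN; flush and repair can proceed."
    else
        echo "At least one node is not UN; do not start the repair yet."
    fi
fi
```

The helper treats empty or unparseable output as a failure, so a node where Cassandra did not start is never reported as healthy.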