Aria Operations for Logs UI Inaccessible Following Upgrade to 8.18.5 due to Cassandra Filesystem Corruption

Article ID: 435131

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

After upgrading VMware Aria Operations for Logs from version 8.16 to 8.18.5, the User Interface (UI) is inaccessible. The Cassandra service fails to start, and logs show the following error pattern:

ERROR [SSTableBatchOpen:1] SSTableReader.java:562 - Cannot read sstable /storage/core/loginsight/cidata/cassandra/data/system/...; file system error, skipping table
org.apache.cassandra.io.FSReadError: java.nio.charset.MalformedInputException: Input length = 1
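To see which tables are affected, the "Cannot read sstable" lines can be reduced to a list of distinct paths. The following is a minimal sketch, not a supported tool: the function name is hypothetical, and the log file location is passed in by the caller because this article does not state it.

```shell
#!/bin/sh
# Sketch: list the distinct SSTable paths that Cassandra reported as unreadable.
# Assumption: the caller supplies the Cassandra log file path; it is not
# given in this article.
list_unreadable_sstables() {
    log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read log file: $log_file" >&2
        return 1
    fi
    # Keep only the text between "Cannot read sstable " and the next ";",
    # then collapse repeated reports of the same path.
    sed -n 's/.*Cannot read sstable \([^;]*\);.*/\1/p' "$log_file" | sort -u
}
```

Example call: list_unreadable_sstables /path/to/cassandra.log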

Environment

VMware Aria Operations for Logs 8.18.x

Cause

This issue is caused by filesystem corruption on the virtual appliance nodes, leading to corrupted Cassandra SSTable structures or migration metadata. Corruption typically occurs due to:

  • Abrupt system or virtual machine shutdowns.
  • Disk write tasks failing to complete before power-off.
  • Underlying bad blocks on the storage device.

Resolution

Resolution Steps:

Before proceeding, ensure you have a valid backup or offline snapshots of all cluster nodes. For the snapshot procedure, see the article "How to take a Snapshot of Operations for Logs".

1. Isolate Corrupted SSTables

Identify the corrupted directories mentioned in the Cassandra logs. Move these directories to a temporary location outside of the Cassandra data path to allow the service to initialize while skipping the unreadable tables.
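The move described above can be sketched as a small helper that relocates one directory and keeps it for later inspection. This is an illustration under assumptions: the function name and the quarantine location are hypothetical, and the directory to move must be taken from your own Cassandra log.

```shell
#!/bin/sh
# Sketch: move one corrupted table directory out of the Cassandra data path.
# Assumption: the quarantine root is an operator-chosen location outside the
# Cassandra data path; it is not named in this article.
quarantine_dir() {
    src="$1"          # corrupted directory named in the Cassandra log
    quarantine="$2"   # destination root outside the Cassandra data path
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 1
    fi
    mkdir -p "$quarantine" || return 1
    # A timestamp suffix keeps repeated runs from overwriting earlier copies.
    dest="$quarantine/$(basename "$src").$(date +%Y%m%d%H%M%S)"
    if [ -e "$dest" ]; then
        echo "destination already exists: $dest" >&2
        return 1
    fi
    # Print the new location so it can be recorded.
    mv -- "$src" "$dest" && echo "$dest"
}
```

Example call, with placeholder paths: quarantine_dir "<corrupted-directory-from-log>" /storage/core/quarantine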

2. Restore Migration Metadata

If the /storage/core/loginsight/cidata/cassandra/migrations directory is corrupted, it must be restored by copying the directory from a healthy peer node in the cluster.
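One way to carry out this restore is to stage the peer's copy locally first and then swap it into place, keeping the corrupted directory rather than deleting it. The sketch below assumes the staged copy was fetched beforehand, for example with scp as shown in the comment. The function name, the staging path, the backup path, and the use of the root account are all assumptions, and file ownership and permissions should be compared with the healthy node afterwards.

```shell
#!/bin/sh
# Sketch: replace a corrupted directory with a copy staged from a healthy peer.
# Assumption: the peer's copy was already fetched to a local staging path, e.g.
#   scp -r root@<healthy-node>:/storage/core/loginsight/cidata/cassandra/migrations /tmp/migrations.staged
# (<healthy-node>, the root login, and /tmp/migrations.staged are placeholders).
replace_with_staged() {
    live="$1"         # corrupted directory on this node
    staged="$2"       # copy taken from the healthy peer
    backup_root="$3"  # where the corrupted copy is kept, outside the data path
    if [ ! -d "$staged" ]; then
        echo "staged copy missing: $staged" >&2
        return 1
    fi
    mkdir -p "$backup_root" || return 1
    if [ -e "$live" ]; then
        # Keep the corrupted copy instead of deleting it.
        mv -- "$live" "$backup_root/$(basename "$live").corrupt.$(date +%Y%m%d%H%M%S)" || return 1
    fi
    # -a preserves modes and timestamps of the staged files.
    cp -a -- "$staged" "$live"
}
```

Example call: replace_with_staged /storage/core/loginsight/cidata/cassandra/migrations /tmp/migrations.staged /storage/core/quarantine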

3. Clear Corrupted Log Store

On nodes where corruption is widespread and prevents the service from starting, clear the local log store using the following steps:

These commands make changes to your system, and removing the store contents permanently deletes the log data held on that node. Review them carefully before running.

  • # Stop the Log Insight service
    service loginsight stop

  • # Navigate to the log store directory
    cd /storage/core/loginsight/cidata/store

  • # Remove all contents within the store directory
    # (run this only if the cd above succeeded; confirm the location with pwd first)
    rm -rf *

  • # Restart the service to re-initialize the storage path
    service loginsight start
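
Because the delete step acts on whatever the current directory happens to be, a guarded variant may be preferable. The following is a sketch only, with a hypothetical function name: it refuses to run unless the target path ends in the expected suffix, and it uses find so that hidden files, which rm -rf * leaves behind, are removed as well.

```shell
#!/bin/sh
# Sketch: empty a directory only after confirming the path is the intended one.
clear_dir_contents() {
    target="$1"            # directory to empty
    expected_suffix="$2"   # e.g. cidata/store; guards against a mistyped path
    case "$target" in
        */"$expected_suffix") ;;
        *)
            echo "refusing: $target does not end in $expected_suffix" >&2
            return 1
            ;;
    esac
    if [ ! -d "$target" ]; then
        echo "not a directory: $target" >&2
        return 1
    fi
    # -mindepth 1 keeps the directory itself and removes everything inside it,
    # including dotfiles.
    find "$target" -mindepth 1 -delete
}
```

Example sequence, using the path and service commands from the steps above:

    service loginsight stop
    clear_dir_contents /storage/core/loginsight/cidata/store cidata/store
    service loginsight start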