The upgrade of Aria Operations for Logs on either the master or a worker node fails with the error "Failed to run nodetool-no-pass upgradesstables"

Article ID: 392282

Updated On:

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • The upgrade process fails at the upgradesstables step when interacting with Cassandra.
  • Cassandra's overall health is reported as green, but the upgrade issues a specific 'upgradesstables' call, and this call fails with the exception 'java.io.NotSerializableException: org.apache.cassandra.io.util.File'

2025-03-12 02:59:54,155 upgrade-driver INFO The file /storage/core/upgrade-version is created successfully.
2025-03-12 02:59:54,156 cassandra INFO Start upgrading cassandra sstable schema...
2025-03-12 02:59:56,737 cassandra ERROR Failed to run nodetool-no-pass upgradesstables
2025-03-12 02:59:56,737 cassandra ERROR Exit code: 2
2025-03-12 02:59:56,737 cassandra ERROR out:
2025-03-12 02:59:56,737 cassandra ERROR err: error: org.apache.cassandra.io.util.File

  • The Cassandra logs confirm that the error occurs when the upgrade attempts to run the nodetool-no-pass upgradesstables command. The specific error in the logs indicates a corrupted file in the Cassandra data directory:

ERROR [CompactionExecutor:3] org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /storage/core/loginsight/cidata/cassandra/data/system_distributed/parent_repair_history-#######################/me-937-big-Data.db

Environment

Aria Operations for Logs 8.x

Cause

The file /storage/core/loginsight/cidata/cassandra/data/system_distributed/parent_repair_history-#######################/me-937-big-Data.db is corrupted. Cassandra cannot process this SSTable file, so the upgradesstables command cannot complete and the upgrade operation fails.
The Cassandra logs identify the corrupted file in the CorruptSSTableException message.

Resolution

    1. Take a snapshot of all nodes that are part of the Aria Operations for Logs cluster.
    2. Stop the LogInsight and Cassandra services using the following command:

        systemctl stop loginsight

    3. Make sure that both the loginsight and cassandra services are stopped:

        ps -aux | grep cass
        ps -aux | grep login


    4. Remove the corrupted SSTable file:

        rm /storage/core/loginsight/cidata/cassandra/data/system_distributed/parent_repair_history-#######################/me-937-big-Data.db

    5. Start the LogInsight service (Cassandra is started as a dependent service of the loginsight service):

        systemctl start loginsight

    6. Verify that Cassandra is up and running:

        nodetool-no-pass status

    7. Start the upgrade from the UI, or manually from the problematic node (refer to article 344057 for the manual upgrade steps).
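
For step 6, the status output can also be checked by a script rather than by eye. The following is a minimal sketch, not part of the documented procedure. It assumes that nodetool-no-pass status prints the standard nodetool status table, in which each node row starts with a two-letter code (U or D for up/down, then N, L, J or M for the state). The function name all_nodes_up_normal is hypothetical.

```shell
#!/usr/bin/env bash
# Reads "nodetool status"-style output on stdin.
# Succeeds (exit 0) only if at least one node row is present and every
# node row starts with UN (Up/Normal).
# Assumption: nodetool-no-pass prints the standard nodetool status table.
all_nodes_up_normal() {
    awk '
        # Node rows begin with a state code such as UN, DN, UJ, UL, UM.
        /^[UD][NLJM][[:space:]]/ { total++; if ($1 != "UN") bad++ }
        END { exit !(total > 0 && bad == 0) }
    '
}
```

Possible usage on a node: nodetool-no-pass status | all_nodes_up_normal && echo "Cassandra is up". If the check fails, inspect the full status output before retrying the upgrade.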

Note: After the corrupted file is removed and the upgrade is retried, another corrupted file may surface. To identify it, look for the CorruptSSTableException shown above in the cassandra.log file, note the file name it reports, and repeat the steps above for that file.
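
The log search described in the note can be sketched as a small helper. This is an illustration only. The function name list_corrupt_sstables is hypothetical, and the log path is passed as an argument because this article does not state where cassandra.log is located. It assumes the log lines have the "CorruptSSTableException: Corrupted: <path>" shape shown in the excerpt above.

```shell
#!/usr/bin/env bash
# Print each distinct SSTable path that the given log reports as corrupted.
# $1: path to the Cassandra log file (hypothetical argument; supply the
#     actual location of cassandra.log on the node).
list_corrupt_sstables() {
    local log_file="$1"
    # Keep only the exception lines, then extract the path after "Corrupted: ".
    grep -F 'CorruptSSTableException' "$log_file" \
        | sed -n 's/.*Corrupted: \([^[:space:]]*\).*/\1/p' \
        | sort -u
}
```

Each path it prints is a candidate for step 4. Review the list manually before removing anything, and only remove files while the services are stopped.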