Resuming rolling upgrade of an Aria Operations for Logs cluster

Article ID: 315957

Products

VMware Aria Suite

Issue/Introduction

  • VMware Aria Operations for Logs (formerly vRealize Log Insight) can upgrade all nodes of a Log Insight cluster sequentially. If a cluster-wide upgrade fails part-way through, the upgrade should roll back to the pre-upgrade state.
  • Attempting to upgrade vRealize Log Insight from the UI displays the error "Upgrade already in progress".
  • If the rollback also fails, for example because of an infrastructure problem, the cluster is left in a partially upgraded state. This article provides steps to manually resume the upgrade.

Environment

Aria Operations for Logs 8.x (formerly vRealize Log Insight)

Resolution

To disable automatic cluster-wide upgrade and manually resume upgrade of a cluster, follow the steps below:

  • Open a console or SSH session to the primary node of the Log Insight cluster and log in as root. 
  • Open the Cassandra CQL shell using one of the options outlined in Accessing the Cassandra Database in Aria Operations for Logs.
  • At the cqlsh> prompt, select the logdb keyspace using this command:
use logdb;
  • Determine the status of all historic upgrades using this command:
select version, status from upgrade_status;

Example output:
version       | status
--------------+-----------------
2.5.1-1234567 | Complete
2.6.3-2345678 | CreatingSnapshot
  • Note the highest version number in this output.
  • Mark the upgrade as failed using this command:
update upgrade_status set status='Failed' where version = 'version_number';

Note: Replace version_number with the version number noted in the previous step.
  • Determine which cluster nodes were in the process of upgrading to this version:
select node_id, status from node_upgrade_status where version = 'version_number';

Note: Replace version_number with the version number noted earlier.

Example output:
node_id                             | status
------------------------------------+-----------------
00e80c91-41a5-####-####-########bb3 | Pending
167def1a-8034-####-####-########c5f | CreatingSnapshot
1d514478-aff1-####-####-########2be | Pending
  • Note the node_id of every node with a status of Pending.
  • Set the status of all cluster nodes to Failed:
update node_upgrade_status set status = 'Failed' where version = 'version_number' and node_id in ('node_id1','node_id2');

Note: Replace version_number with the version number, and replace node_id1, node_id2, etc. with the node IDs noted in the previous step.

Example:
update node_upgrade_status set status = 'Failed' where version = '2.6.3-2345678' and node_id in ('00e80c91-41a5-####-####-########bb3','1d514478-aff1-####-####-########2be');
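
Typing the version string and node IDs by hand into the two update statements is error-prone. The steps above can be sketched as a small Python helper that picks the highest version and prints the statements to paste at the cqlsh> prompt. This is only an illustration, not part of the product: the function names are hypothetical, it assumes version strings consist solely of numbers separated by dots and a hyphen (as in the example output), and it does not connect to Cassandra itself.

```python
import re


def version_key(version):
    """Turn '2.6.3-2345678' into (2, 6, 3, 2345678) so versions compare numerically.

    Assumption: every component is numeric; a non-numeric part raises ValueError.
    """
    return tuple(int(part) for part in re.split(r"[.-]", version))


def highest_version(versions):
    """Return the highest version string (numeric, not alphabetical, ordering)."""
    return max(versions, key=version_key)


def cql_quote(value):
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def build_statements(version, node_ids):
    """Build the two update statements from the steps above as plain strings."""
    quoted_version = cql_quote(version)
    quoted_ids = ", ".join(cql_quote(node_id) for node_id in node_ids)
    return [
        "update upgrade_status set status = 'Failed' "
        f"where version = {quoted_version};",
        "update node_upgrade_status set status = 'Failed' "
        f"where version = {quoted_version} and node_id in ({quoted_ids});",
    ]


if __name__ == "__main__":
    # Values copied from the example outputs in this article.
    versions = ["2.5.1-1234567", "2.6.3-2345678"]
    noted_node_ids = [
        "00e80c91-41a5-####-####-########bb3",
        "1d514478-aff1-####-####-########2be",
    ]
    for statement in build_statements(highest_version(versions), noted_node_ids):
        print(statement)
```

Comparing versions as tuples of integers matters once a component reaches two digits: as plain text, 8.9 would sort above 8.10. Review the printed statements against the select output before running them.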
 

Additional Information