Resuming rolling upgrade of an Aria Operations for Logs cluster


Article ID: 315957

Updated On:

Products

VMware Aria Suite

Issue/Introduction

  • VMware Aria Operations for Logs (formerly vRealize Log Insight) can upgrade all nodes of a Log Insight cluster sequentially. If a cluster-wide upgrade fails part-way through, the upgrade should roll back to the pre-upgrade state.
  • Attempting to upgrade vRealize Log Insight from the UI displays the error "Upgrade already in progress".
  • If the rollback also fails, for example because of an infrastructure problem, the cluster is left in a partially upgraded state. This article provides steps to manually resume the upgrade.

Environment

Aria Operations for Logs 8.x (formerly vRealize Log Insight)

Resolution

To disable automatic cluster-wide upgrade and manually resume upgrade of a cluster, follow the steps below.

  1. Open a console or SSH session to the primary node of the Log Insight cluster and log in as root. 
  2. Open the Cassandra CQL shell using one of the options outlined in Accessing the Cassandra Database in Aria Operations for Logs.
  3. At the cqlsh> prompt, select the logdb keyspace using this command:
use logdb;
  4. Determine the status of all historic upgrades using this command:
select version, status from upgrade_status;

Example output:
version | status
--------------+-----------------
2.5.1-1234567 | Complete
2.6.3-2345678 | CreatingSnapshot
  5. Note the highest version number in this output.
  6. Mark the upgrade as failed using this command:
update upgrade_status set status='Failed' where version = 'version_number';

Note: Replace version_number with the version number noted in step 5.
  7. Determine which cluster nodes were in the process of upgrading to this version:
select node_id, status from node_upgrade_status where version = 'version_number';

Note: Replace version_number with the highest version number noted from the upgrade_status output.

Example output:
node_id | status
-------------------------------------+-----------------
00e80c91-41a5-####-####-########bb3 | Pending
167def1a-8034-####-####-########c5f | CreatingSnapshot
1d514478-aff1-####-####-########2be | Pending
  8. Note the node_id of any nodes with a status of Pending.
  9. Set the status of all cluster nodes from step 8 to Failed:
update node_upgrade_status set status = 'Failed' where version = 'version_number' and node_id in ('node_id1','node_id2');

Note: Replace version_number with the highest version number noted from the upgrade_status output, and replace node_id1, node_id2, and so on with the node IDs noted in step 8.

Example:
update node_upgrade_status set status = 'Failed' where version = '2.6.3-2345678' and node_id in ('00e80c91-41a5-####-####-########bb3','1d514478-aff1-####-####-########2be');
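
The steps above involve copying values between queries by hand, which is easy to get wrong on a large cluster. The following is a minimal Python sketch of how the two update statements could be assembled from the query output. It is not part of the product: the helper names (version_key, highest_version, cql_literal, mark_upgrade_failed, mark_nodes_failed) are hypothetical, and it assumes version strings always look like the examples in this article (dot-separated numbers, a dash, then a build number). It only builds the statement text; the statements still have to be reviewed and pasted into cqlsh manually.

```python
import re


def version_key(version):
    # Assumption: versions look like "2.6.3-2345678" (numbers separated by dots,
    # then a dash and a numeric build number), as in the example output above.
    return tuple(int(part) for part in re.split(r"[.-]", version))


def highest_version(versions):
    # Compare numerically so that "2.10.0-1" ranks above "2.9.0-1";
    # a plain string comparison would get this wrong.
    return max(versions, key=version_key)


def cql_literal(value):
    # CQL string literals use single quotes; an embedded quote is doubled.
    return "'" + value.replace("'", "''") + "'"


def mark_upgrade_failed(version):
    # Builds the statement that marks the upgrade itself as failed.
    return ("update upgrade_status set status='Failed' "
            f"where version = {cql_literal(version)};")


def mark_nodes_failed(version, node_rows):
    # node_rows: (node_id, status) pairs copied from the node_upgrade_status output.
    # Only nodes whose status is Pending are included, as in the steps above.
    pending = [node_id for node_id, status in node_rows if status == "Pending"]
    if not pending:
        raise ValueError("no nodes with status Pending")
    id_list = ",".join(cql_literal(node_id) for node_id in pending)
    return ("update node_upgrade_status set status = 'Failed' "
            f"where version = {cql_literal(version)} and node_id in ({id_list});")


# Example using the values shown in this article.
version = highest_version(["2.5.1-1234567", "2.6.3-2345678"])
nodes = [
    ("00e80c91-41a5-####-####-########bb3", "Pending"),
    ("167def1a-8034-####-####-########c5f", "CreatingSnapshot"),
    ("1d514478-aff1-####-####-########2be", "Pending"),
]
print(mark_upgrade_failed(version))
print(mark_nodes_failed(version, nodes))
```

Comparing versions as tuples of integers rather than as strings is a deliberate choice here, since the "highest" version must be chosen numerically.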
 

Additional Information