The purpose of this KB is to help troubleshoot a stuck High Availability (HA) state and restore cluster stability.
Symptoms
High Availability (HA) in Aria Operations is stuck in the "Failed to Deactivate", "Activating", or "Failed to disable" state in the Admin UI, which blocks configuration changes in the cluster such as Certificate Renewal.
The Activate/Deactivate HA button is greyed out.
The cluster can still be in an Online state, but it may not function correctly.
Environment
VMware Aria Operations 8.x
Cause
This can happen when the services fail to complete an HA activation or deactivation, which leaves the HA transition state stuck.
Resolution
Take a snapshot of all the nodes before making any changes.
Stop casa on all the nodes by running the command: service vmware-casa stop
Open casa.db.script in a text editor, for example with the command: vi /data/db/casa/webapp/hsqldb/casa.db.script
Locate the ha_transition_state entry in the last line of the file.
Change the value from "FAILED_TO_DEACTIVATE" or "ENABLING" to "NONE" on all the nodes. (Note: "FAILED_TO_DEACTIVATE" is the value as seen in Aria Operations version 8.12. The value will be "FAILED_TO_DISABLE" in earlier versions.)
Save and close the file.
Start casa on all the nodes by running the command: service vmware-casa start
Bring the cluster online from the admin UI.
Once you have verified that everything is working as expected, delete the snapshots on all nodes.
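
The file edit in the steps above could also be scripted. The following is a minimal sketch and not a supported tool: the function names show_ha_state_line and reset_ha_state are hypothetical, and the sketch assumes that the stuck value appears on the same line as the ha_transition_state key with no uppercase letters between the key and the value. Check the line by eye before relying on it.

```shell
#!/bin/sh
# Sketch only: helpers for inspecting and resetting ha_transition_state.
# The default path is the one given in the steps above; pass a different
# path as the first argument to try the functions on a copy first.
CASA_DB_SCRIPT="${CASA_DB_SCRIPT:-/data/db/casa/webapp/hsqldb/casa.db.script}"

# Print the last line that mentions ha_transition_state (read-only).
show_ha_state_line() {
    grep 'ha_transition_state' "${1:-$CASA_DB_SCRIPT}" | tail -n 1
}

# Copy the file to <file>.bak, then rewrite it with the stuck value
# replaced by NONE. Only the value that directly follows the
# ha_transition_state key is changed (ASSUMPTION: no uppercase letters
# sit between the key and its value).
reset_ha_state() {
    file="${1:-$CASA_DB_SCRIPT}"
    cp -p "$file" "$file.bak" || return 1
    sed -E '/ha_transition_state/ s/(ha_transition_state[^A-Z]*)(FAILED_TO_DEACTIVATE|FAILED_TO_DISABLE|ENABLING)/\1NONE/' \
        "$file.bak" > "$file"
}
```

If used, the functions would be run on each node only while casa is stopped, that is, after service vmware-casa stop and before service vmware-casa start: call show_ha_state_line to confirm the current value, then reset_ha_state, then show_ha_state_line again to confirm the value now reads NONE. The .bak copy is left next to the original so the change can be reverted by hand.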