Cluster fails to change to an Offline State

Article ID: 337142

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

Attempts to bring the cluster offline with the GS casa script do not update all fields in some cases. The references in the Resolution section are examples of what an offline cluster state should look like in the /storage/db/casa/webapp/hsqldb/casa.db.script file. Note that the script referenced is not public and requires a Broadcom SR.

Comparing the entries below to your environment can help identify potential issues with casa.db.script entries.

This article serves as a reference for a good configuration when you need to review the formatting of the clusterMembership section of the casa.db.script for the three most common configurations of Aria Operations.

Environment

VMware vRealize Operations 8.x
VMware Aria Operations 8.x

Cause

Invalid casa entries are generally caused by environmental issues, such as network or storage outages, that prevent the file from being correctly updated and synchronized.

Resolution

Do not make changes to the casa.db.script file, or issue commands from this or other KBs, without first taking snapshots of the cluster nodes.

Please be aware of the following:

  • After an attempt to bring a cluster offline, the cluster may fail to show an offline state in the Admin UI.
  • If the option to force the cluster offline is available in the Admin UI, try it before updating casa.db.script. The Force Offline option normally appears after a 'Take Offline' action has failed.
  • Further attempts to bring the cluster offline with the command below may also fail:
    $VMWARE_PYTHON_BIN /usr/lib/vmware-vcopssuite/utilities/sliceConfiguration/bin/vcopsConfigureRoles.py --action=bringSliceOffline --offlineReason=Bringoffline
  • Always stop casa with "service vmware-casa stop" before making any changes; edits made to casa.db.script while casa is running will not persist. Stopping casa also stops the Admin UI, because the CASA service provides the underlying services for the Admin UI.
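
As an extra precaution alongside the node snapshots, a timestamped copy of casa.db.script can be kept before any edit. The sketch below is not a step from this article: the helper name backup_file, the backup_dir argument, and the ".bak" naming are all assumptions chosen for illustration, and the copy does not replace the snapshots.

```python
import shutil
import time
from pathlib import Path


def backup_file(path, backup_dir):
    """Copy `path` into `backup_dir` under a timestamped name and return the copy's path.

    Hypothetical helper: the naming scheme is an assumption, not a product convention.
    """
    src = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = Path(backup_dir) / f"{src.name}.{stamp}.bak"
    shutil.copy2(src, dst)  # copy2 also preserves timestamps and permissions
    return dst
```

For example, backup_file("/storage/db/casa/webapp/hsqldb/casa.db.script", "/tmp") would place the copy under /tmp; run it only after casa has been stopped.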

You can reference the entries below for a CA-enabled cluster, an HA-enabled cluster, and a non-CA/HA cluster, each in an offline state. These references serve as examples of good configurations of the clusterMembership section in casa.db.script.

The output below has been formatted with the following command to make it easier to read. The command can also reveal formatting problems with the clusterMembership line, because python -m json.tool reports an error when the JSON is invalid:

sed -nre "/clusterMembership/ s/^[^']+'([^']+)','([^']+)'.*/\2/p" /storage/db/casa/webapp/hsqldb/casa.db.script | python -m json.tool
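
The same extraction can be sketched in Python when the parsed document is needed for further checks. This is a minimal sketch assuming Python 3; the function name read_cluster_membership is hypothetical, and the regular expression simply mirrors the sed pattern above. Like the sed command, it assumes the JSON value contains no single quotes.

```python
import json
import re

# Mirrors the sed pattern: skip to the first quote, capture the first quoted
# value, then capture the second quoted value (the JSON document).
_ROW = re.compile(r"^[^']+'([^']+)','([^']+)'")


def read_cluster_membership(script_text):
    """Return the parsed clusterMembership document from casa.db.script text.

    Raises ValueError if no matching line is found or the JSON is malformed
    (json.JSONDecodeError is a subclass of ValueError).
    """
    for line in script_text.splitlines():
        if "clusterMembership" not in line:
            continue
        match = _ROW.match(line)
        if match is None:
            continue  # like sed, ignore lines the pattern does not fit
        return json.loads(match.group(2))
    raise ValueError("no parsable clusterMembership line found")
```

A typical call reads the file first, for example read_cluster_membership(open("/storage/db/casa/webapp/hsqldb/casa.db.script").read()). Unlike the sed command, this sketch returns only the first matching line.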

 

1. CA Enabled Cluster. The cluster consists of two fault domains, with a primary node in fault domain 1, a replica in fault domain 2, and a witness node.

{
    "onlineState": "OFFLINE",
    "cluster_name": "My-CA-Cluster",
    "is_ha_enabled": false,
    "ha_transition_state": null,
    "ca_state": "ENABLED",
    "cassandra_ca_state": "ENABLED",
    "initialization_state": "NONE",
    "remove_node_state": "NONE",
    "document_version": 107,
    "document_time": 1622969827933,
    "online_state": "OFFLINE",
    "online_state_time": 1622969827849,
    "online_state_reason": "Snapshots",
    "out_of_diskspace_slice": "",
    "email": null,
    "cluster_members": [],
    "admin_slices": [],
    "installation_state": "DONE",
    "fail_going_offline": false,
    "slices": {
        "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX": {
            "slice_uuid": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
            "pair_uuid": "YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY",
            "is_admin_node": true,
            "ip_address": "192.168.0.1",
            "preferred_addresses": {},
            "slice_name": "PRIMARY_NODE",
            "membership_state": null,
            "region": "REGION_A"
        },
        "ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ": {
            "slice_uuid": "ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ",
            "pair_uuid": null,
            "is_admin_node": false,
            "ip_address": "192.168.200.1",
            "preferred_addresses": {},
            "slice_name": "WITNESS_NODE",
            "membership_state": null,
            "region": null
        },
        "YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY": {
            "slice_uuid": "YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY",
            "pair_uuid": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
            "is_admin_node": false,
            "ip_address": "192.168.100.1",
            "preferred_addresses": {},
            "slice_name": "REPLICA_NODE",
            "membership_state": null,
            "region": "REGION_B"
        }
    }
}


Unformatted clusterMembership line in casa.db.script:

INSERT INTO CASA_DOCS VALUES('clusterMembership','{"onlineState":"OFFLINE","cluster_name":"My-CA-Cluster","is_ha_enabled":false,"ha_transition_state":null,"ca_state":"ENABLED","cassandra_ca_state":"ENABLED","initialization_state":"NONE","remove_node_state":"NONE","document_version":107,"document_time":1622969827933,"online_state":"OFFLINE","online_state_time":1622969827849,"online_state_reason":"Snapshots","out_of_diskspace_slice":"","email":null,"cluster_members":[],"admin_slices":[],"installation_state":"DONE","fail_going_offline":false,"slices":{"XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX":{"slice_uuid":"XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX","pair_uuid":"YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY","is_admin_node":true,"ip_address":"192.168.0.1","preferred_addresses":{},"slice_name":"PRIMARY_NODE","membership_state":null,"region":"REGION_A"},"ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ":{"slice_uuid":"ZZZZZZZZ-ZZZZ-ZZZZ-ZZZZ-ZZZZZZZZZZZZ","pair_uuid":null,"is_admin_node":false,"ip_address":"192.168.200.1","preferred_addresses":{},"slice_name":"WITNESS_NODE","membership_state":null,"region":null},"YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY":{"slice_uuid":"YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY","pair_uuid":"XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX","is_admin_node":false,"ip_address":"192.168.100.1","preferred_addresses":{},"slice_name":"REPLICA_NODE","membership_state":null,"region":"REGION_B"}}}')
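
In the CA example above, the primary and replica slices reference each other through pair_uuid, while the witness has pair_uuid set to null. The sketch below checks that relationship on a parsed clusterMembership document (for example, the result of json.loads on the JSON value). The function name unpaired_slices is hypothetical, and the rule that pairing is always mutual is an assumption inferred from this one example, not a documented requirement.

```python
def unpaired_slices(doc):
    """Return slice UUIDs whose pair_uuid is not mirrored by the slice it points to."""
    slices = doc.get("slices", {})
    problems = []
    for uuid, info in slices.items():
        pair = info.get("pair_uuid")
        if pair is None:
            continue  # e.g. the witness node, which has no pair in the example
        partner = slices.get(pair)
        # Assumption: a valid pair points back at this slice.
        if partner is None or partner.get("pair_uuid") != uuid:
            problems.append(uuid)
    return sorted(problems)
```

An empty list means every non-null pair_uuid is reciprocated; any UUID returned is a candidate for closer comparison with the reference entry.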

 

2. HA Enabled Cluster in an offline state. The cluster consists of a primary and a replica node:

{
    "onlineState": "OFFLINE",
    "cluster_name": "My-HA-Cluster",
    "is_ha_enabled": false,
    "ha_transition_state": "NONE",
    "ca_state": "DISABLED",
    "cassandra_ca_state": "DISABLED",
    "initialization_state": "NONE",
    "remove_node_state": "NONE",
    "document_version": 29,
    "document_time": 1622972843760,
    "online_state": "OFFLINE",
    "online_state_time": 1622972732438,
    "online_state_reason": "Snapshots",
    "out_of_diskspace_slice": "",
    "email": null,
    "cluster_members": [],
    "admin_slices": [],
    "installation_state": "DONE",
    "fail_going_offline": false,
    "slices": {
        "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX": {
            "slice_uuid": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
            "pair_uuid": null,
            "is_admin_node": true,
            "ip_address": "192.168.0.1",
            "preferred_addresses": {},
            "slice_name": "PRIMARY_NODE",
            "membership_state": null,
            "region": null
        },
        "YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY": {
            "slice_uuid": "YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY",
            "pair_uuid": null,
            "is_admin_node": false,
            "ip_address": "192.168.0.2",
            "preferred_addresses": {},
            "slice_name": "REPLICA_NODE",
            "membership_state": null,
            "region": null
        }
    }
}

 

Unformatted clusterMembership line in casa.db.script:

INSERT INTO CASA_DOCS VALUES('clusterMembership','{"onlineState":"OFFLINE","cluster_name":"My-HA-Cluster","is_ha_enabled":false,"ha_transition_state":"NONE","ca_state":"DISABLED","cassandra_ca_state":"DISABLED","initialization_state":"NONE","remove_node_state":"NONE","document_version":29,"document_time":1622972843760,"online_state":"OFFLINE","online_state_time":1622972732438,"online_state_reason":"Snapshots","out_of_diskspace_slice":"","email":null,"cluster_members":[],"admin_slices":[],"installation_state":"DONE","fail_going_offline":false,"slices":{"XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX":{"slice_uuid":"XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX","pair_uuid":null,"is_admin_node":true,"ip_address":"192.168.0.1","preferred_addresses":{},"slice_name":"PRIMARY_NODE","membership_state":null,"region":null},"YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY":{"slice_uuid":"YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY","pair_uuid":null,"is_admin_node":false,"ip_address":"192.168.0.2","preferred_addresses":{},"slice_name":"REPLICA_NODE","membership_state":null,"region":null}}}')
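
The reference documents in this article share a few offline markers: both onlineState and online_state read "OFFLINE", fail_going_offline is false, and each key under slices matches that slice's slice_uuid. The sketch below checks a parsed clusterMembership document against those observations. The function name offline_state_problems is hypothetical, and treating exactly these fields as the ones that matter is an assumption drawn from the examples only.

```python
def offline_state_problems(doc):
    """Return human-readable differences from the offline reference examples."""
    problems = []
    # Both spellings of the state field appear in the examples and agree.
    for field in ("onlineState", "online_state"):
        if doc.get(field) != "OFFLINE":
            problems.append(f"{field} is {doc.get(field)!r}, expected 'OFFLINE'")
    if doc.get("fail_going_offline") is not False:
        problems.append("fail_going_offline is not false")
    # In every example the dictionary key equals the slice's own slice_uuid.
    for key, info in doc.get("slices", {}).items():
        if info.get("slice_uuid") != key:
            problems.append(f"slice key {key} does not match its slice_uuid")
    return problems
```

An empty list only means the document agrees with the examples on these points; it does not prove the cluster is healthy.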

 

3. Non-CA/HA enabled cluster. The cluster consists of a primary and a data node:

{
    "onlineState": "OFFLINE",
    "cluster_name": "My-Standalone-Cluster",
    "is_ha_enabled": false,
    "ha_transition_state": null,
    "ca_state": "DISABLED",
    "cassandra_ca_state": "DISABLED",
    "initialization_state": "NONE",
    "remove_node_state": "NONE",
    "document_version": 27,
    "document_time": 1622972732211,
    "online_state": "OFFLINE",
    "online_state_time": 1622972732438,
    "online_state_reason": "Snapshots",
    "out_of_diskspace_slice": "",
    "email": null,
    "cluster_members": [],
    "admin_slices": [],
    "installation_state": "DONE",
    "fail_going_offline": false,
    "slices": {
        "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX": {
            "slice_uuid": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
            "pair_uuid": null,
            "is_admin_node": true,
            "ip_address": "192.168.0.1",
            "preferred_addresses": {},
            "slice_name": "PRIMARY_NODE",
            "membership_state": null,
            "region": null
        },
        "YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY": {
            "slice_uuid": "YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY",
            "pair_uuid": null,
            "is_admin_node": false,
            "ip_address": "192.168.0.2",
            "preferred_addresses": {},
            "slice_name": "DATA_NODE",
            "membership_state": null,
            "region": null
        }
    }
}
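
One way to compare an environment's entry with a reference such as the one above is to diff the key names before looking at values. The sketch below works on two parsed documents. The function name key_differences and the report layout are hypothetical, and it assumes the reference and the environment come from the same product version, since other versions may carry different keys.

```python
def key_differences(reference, actual):
    """Compare key names only (not values) between two clusterMembership documents."""
    report = {
        "missing": sorted(set(reference) - set(actual)),
        "unexpected": sorted(set(actual) - set(reference)),
        "slices": {},
    }
    # Collect every key used by any slice in the reference document.
    ref_slice_keys = set()
    for info in reference.get("slices", {}).values():
        ref_slice_keys |= set(info)
    # Report, per slice, the reference keys that the actual slice lacks.
    for uuid, info in actual.get("slices", {}).items():
        missing = sorted(ref_slice_keys - set(info))
        if missing:
            report["slices"][uuid] = missing
    return report
```

A report with empty "missing", "unexpected", and "slices" entries indicates the two documents have the same shape, which narrows any remaining problem down to the values.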

 

Unformatted clusterMembership line in casa.db.script:

INSERT INTO CASA_DOCS VALUES('clusterMembership','{"onlineState":"OFFLINE","cluster_name":"My-Standalone-Cluster","is_ha_enabled":false,"ha_transition_state":null,"ca_state":"DISABLED","cassandra_ca_state":"DISABLED","initialization_state":"NONE","remove_node_state":"NONE","document_version":27,"document_time":1622972732211,"online_state":"OFFLINE","online_state_time":1622972732438,"online_state_reason":"Snapshots","out_of_diskspace_slice":"","email":null,"cluster_members":[],"admin_slices":[],"installation_state":"DONE","fail_going_offline":false,"slices":{"XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX":{"slice_uuid":"XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX","pair_uuid":null,"is_admin_node":true,"ip_address":"192.168.0.1","preferred_addresses":{},"slice_name":"PRIMARY_NODE","membership_state":null,"region":null},"YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY":{"slice_uuid":"YYYYYYYY-YYYY-YYYY-YYYY-YYYYYYYYYYYY","pair_uuid":null,"is_admin_node":false,"ip_address":"192.168.0.2","preferred_addresses":{},"slice_name":"DATA_NODE","membership_state":null,"region":null}}}')
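
The relationship between the formatted output and the unformatted line can be illustrated by serializing a parsed document back to the compact, single-line form. This is a sketch only: the function name to_insert_line is hypothetical, it assumes key order is kept as loaded (true for json.loads on Python 3.7 and later), and it assumes the values are ASCII, because json.dumps escapes non-ASCII characters by default and this article does not show how casa stores them.

```python
import json


def to_insert_line(doc):
    """Render a clusterMembership document as a compact INSERT line."""
    # separators=(",", ":") removes the spaces json.dumps adds by default,
    # matching the unformatted lines shown in this article.
    body = json.dumps(doc, separators=(",", ":"))
    if "'" in body:
        # A single quote would end the SQL string early; escaping rules are
        # outside the scope of this sketch.
        raise ValueError("document contains a single quote")
    return "INSERT INTO CASA_DOCS VALUES('clusterMembership','" + body + "')"
```

Comparing the result with the existing line is a quick way to spot stray whitespace or truncated text. Any actual change to casa.db.script remains subject to the precautions in the Resolution section: snapshots first, and casa stopped.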

Additional Information