Can one node at a time be stopped for maintenance in a K-safe multi-node cluster, for example to facilitate patching?
Release : 22.2.x, 23.3
Component : PM Storage
Yes. You can stop a single node in a cluster.
As always, it is *HIGHLY* advisable to have a good backup of each CA Performance Management component before stopping the nodes.
1) Backup instructions (note that you may need to use the version dropdown in the documentation to reference your specific version)
See the note in those instructions:
"Important! The following procedure is the supported method for backing up the Data Repository. Taking a virtual machine snapshot is not a supported method for backing up the Data Repository"
2) To stop a single node:
a) Log in to any node as the dradmin user and launch the adminTools UI.
b) Select Option 7 for the Advanced Menu. Select Option 2 to Stop Vertica on Host.
c) Patch the server
d) Before restarting the DR DB on the node via adminTools, open a vsql session on one of the other nodes and run the make_ahm_now command described below.
Run the command as the Data Repository administrator database user, not the Data Aggregator database user. Use the same password that is used to stop and start the DB via the adminTools UI.
The full commands to run after bringing down the DB and patching the server, but before restarting the stopped node, are these:
From the /opt/vertica/bin directory, as the Data Repository administrator user, run:
./vsql -U
Then run this at the vsql prompt:
select make_ahm_now(true);
Then, run \q to quit the prompt.
This advances the AHM (Ancient History Mark) and reduces the number of small changes that need to be sent to the down node when it restarts.
e) After the server is patched, up, and operational, and the make_ahm_now command has been run in vsql, restart Vertica on the node.
Launch adminTools, and from the Main Menu of adminTools select Option 5 to Restart Vertica on Host.
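The vsql steps above can also be run without an interactive prompt. The following is a minimal sketch, not a supported procedure: it assumes vsql is installed at /opt/vertica/bin/vsql and uses the standard vsql -U (user) and -c (run one command) options. The VSQL and DRY_RUN variables and the function names are hypothetical helpers introduced only for illustration. The node-state query against the nodes system table is an optional extra check, not part of the original steps.

```shell
#!/bin/sh
# Sketch only: non-interactive versions of the vsql steps above.
# VSQL and DRY_RUN are hypothetical helper variables, not product settings.
VSQL="${VSQL:-/opt/vertica/bin/vsql}"

# Run one SQL statement as the given database user.
# With DRY_RUN=1 the command is printed instead of executed.
run_sql() {
    user="$1"
    sql="$2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s -U %s -c "%s"\n' "$VSQL" "$user" "$sql"
    else
        "$VSQL" -U "$user" -c "$sql"
    fi
}

# Step d: advance the AHM from a node that is still up.
advance_ahm() {
    run_sql "$1" "select make_ahm_now(true);"
}

# Optional check before stopping and after restarting a node:
# list each node and its current state.
show_node_states() {
    run_sql "$1" "select node_name, node_state from nodes;"
}
```

To preview the commands without touching the database, set DRY_RUN=1 and call advance_ahm or show_node_states with the Data Repository administrator database user as the only argument. Without DRY_RUN, vsql prompts for the password if one is required.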