If a node is determined to be faulty and must be removed from the cluster and rejoined, take the following steps.
- In vCenter, take backup snapshots (without memory) of every appliance in the vRA HA configuration.
- From a root command line on any healthy node, run the following command to identify the primary database node:
kubectl get pod `vracli status | jq -r '.databaseNodes[] | select(.["Role"] == "primary") | .["Node name"]' | cut -d '.' -f 1` -n prelude -o wide --no-headers=true
Example output:
postgres-0 1/1 Running 0 39h 12.123.2.14 vra-vm-224-84.company.com <none> <none>
Important: The primary database node must be one of the healthy nodes. If the primary database node is faulty, contact technical support instead of proceeding.
- From the root command line of the healthy node, remove the faulty node.
vracli cluster remove faulty-node-FQDN
- From the faulty node, join the vRealize Automation cluster.
vracli cluster join primary-DB-node-FQDN
- Log in as root to the command line of the primary database node.
- Deploy services on the cluster by running the following script.
/opt/scripts/deploy.sh
- Verify that the node has joined and is in the "Ready" state by running the following command:
kubectl get nodes
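
The two checks in this procedure (confirming the primary database node is not the faulty one, and confirming every node reports "Ready") can be scripted. The following is a minimal sketch, not part of the product tooling: the function names primary_is_healthy and not_ready_nodes are hypothetical, and the column positions assume the kubectl output layout shown in the example above (node name in the 7th column of the wide pod listing, status in the 2nd column of the node listing). Verify the layout on your appliance before relying on it.

```shell
#!/bin/sh
# Hypothetical helpers; they only parse text on stdin and change nothing.

# Reads the primary database pod line (the output of the
# "kubectl get pod ... -o wide --no-headers=true" command above) on stdin.
# $1 is the FQDN of the faulty node.
# Exits 0 only if the NODE column (assumed to be field 7) is present
# and differs from the faulty node's FQDN.
primary_is_healthy() {
    awk -v faulty="$1" '
        BEGIN { ok = 0 }
        $7 != "" && $7 != faulty { ok = 1 }
        END { exit (ok ? 0 : 1) }
    '
}

# Reads "kubectl get nodes --no-headers" output on stdin.
# Prints the name of every node whose STATUS column (assumed to be
# field 2) is not exactly "Ready"; exits 1 if any such node exists.
not_ready_nodes() {
    awk '
        BEGIN { bad = 0 }
        $2 != "Ready" { print $1; bad = 1 }
        END { exit bad }
    '
}
```

To use the sketch, pipe the output of the primary-node command from the earlier step into primary_is_healthy faulty-node-FQDN before removing the node, and run kubectl get nodes --no-headers | not_ready_nodes after deploy.sh completes. A non-zero exit status from either function means the procedure should not continue without further investigation.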