In a Vertica cluster with three or more nodes, one of the nodes is unable to start, even though the others are up and running.
The Vertica log for the problem node is located at:
<CATALOG-PATH>/drdata/v_drdata_node<NUMBER>_catalog/vertica.log
The following is seen at the end of the log:
HINT: Check that all file systems are properly mounted. Also, the --force option can be used to delete corrupted data and recover from the cluster
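The end of the log can be inspected with standard tools. The following is a minimal sketch, not part of the original procedure: the function name show_log_tail_hints and the 200-line default are illustrative assumptions, and the ERROR and FATAL patterns are only a guess at what else may be worth reading besides the HINT line.

```shell
#!/bin/sh
# show_log_tail_hints LOGFILE [LINES]
# Prints HINT, ERROR, and FATAL lines found in the last LINES lines
# (default 200, an arbitrary choice) of a vertica.log file.
# The function name and defaults are illustrative; this is not a Vertica tool.
show_log_tail_hints() {
    logfile=$1
    lines=${2:-200}
    if [ ! -r "$logfile" ]; then
        echo "Cannot read $logfile" >&2
        return 1
    fi
    # grep exits non-zero when nothing matches; that is not an error here.
    tail -n "$lines" "$logfile" | grep -E 'HINT:|ERROR|FATAL'
}

# Usage (replace the placeholders with the real catalog path and node number):
#   show_log_tail_hints "<CATALOG-PATH>/drdata/v_drdata_node<NUMBER>_catalog/vertica.log"
```

Run it on the node that fails to start, as a user with read access to the catalog directory.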
The problem node runs on a VM. The VM team found it unresponsive and performed a vMotion move of the server image to resolve it.
All supported versions of DX NetOps Performance Management
Unexpected server reboots, or activity such as a VM vMotion server move, performed without first properly shutting down the database on the node will result in outages like this.
As the database administrator user (default: dradmin), run the following on the Data Repository (DR):
cd /opt/vertica/bin
./admintools -t restart_node --host=<NODE_IP_ADDRESS> -d <DB_NAME> --force
Example:
./admintools -t restart_node --host=xxx.xxx.xxx.xxx -d drdata --force
Info: no password specified, using none
*** Restarting nodes for database drdata ***
Restarting host [xxx.xxx.xxx.xxx] with catalog [v_drdata_node0003_catalog]
Issuing multi-node restart
Starting nodes:
v_drdata_node0003 (xxx.xxx.xxx.xxx)
Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
Node Status: v_drdata_node0003: (DOWN)
Node Status: v_drdata_node0003: (DOWN)
Node Status: v_drdata_node0003: (DOWN)
Node Status: v_drdata_node0003: (DOWN)
Node Status: v_drdata_node0003: (DOWN)
Node Status: v_drdata_node0003: (DOWN)
Node Status: v_drdata_node0003: (RECOVERING)
Node Status: v_drdata_node0003: (RECOVERING)
Node Status: v_drdata_node0003: (UP)
# ./admintools -t list_allnodes
Node | Host | State | Version | DB
-------------------+-------------------+-------+------------------+--------
v_drdata_node0001 | xxx.xxx.xxx.xxx | UP | vertica-10.1.1.0 | drdata
v_drdata_node0002 | xxx.xxx.xxx.xxx | UP | vertica-10.1.1.0 | drdata
v_drdata_node0003 | xxx.xxx.xxx.xxx | UP | vertica-10.1.1.0 | drdata
If it is still unable to start, there may be disk partition issues or other issues mentioned in the vertica.log that must be resolved before restarting.
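To confirm whether any node is still not UP after the restart, the list_allnodes output shown above can be filtered. This is a rough sketch that assumes the pipe-separated table layout shown in this article; the function name nodes_not_up is hypothetical and not part of admintools.

```shell
#!/bin/sh
# nodes_not_up: reads `admintools -t list_allnodes` output on stdin and
# prints "<node> <state>" for every node whose State column is not UP.
# Assumes the table layout shown in this article (Node | Host | State | ...).
nodes_not_up() {
    awk -F'|' '
        NF >= 3 {
            node = $1; state = $3
            # Trim the padding around each column value.
            gsub(/^[ \t]+|[ \t]+$/, "", node)
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            # Skip the header row; the dashed separator has no "|" fields.
            if (node != "Node" && state != "UP") print node, state
        }'
}

# Usage, as the database administrator user:
#   /opt/vertica/bin/admintools -t list_allnodes | nodes_not_up
```

No output means every listed node reports UP. If a node is still listed, the HINT in the log points at file systems, so standard commands such as df -h and mount on that node are a reasonable first check before retrying the restart.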
What should I do when the database node is down?
https://www.vertica.com/blog/what-should-i-do-when-the-database-node-is-down/