After an ungraceful shutdown, the Data Repository (DR) fails to restart and shows the following error, or one similar to it:
** Starting database: drdata ***
Error: the vertica process for the database is running on the following hosts:
This may be because the process has not completed previous shutdown activities. Please wait and retry again.
Database start up failed. Processes still running.
Trying to stop the Vertica DB through adminTools returns a message that the DB is not running. This indicates that a process failed to start or stop gracefully and is now running out of sync with the rest of the Vertica process stack.
DX NetOps : CAPM 3.7.x and later
On DR node 2 (v_drdata_node0002), shown in the example above, the spread process is still running (no other node is running it):
dradmin 17610 1 0 Mar23 ? 00:21:51 /opt/vertica/spread/sbin/spread -c /mnt/ext4path/catalog/drdata/v_drdata_node0001_catalog/spread.conf -D /opt/vertica/spread/tmp
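To repeat this check on each node, a small filter over the process listing can help. The following is a minimal sketch: `find_vertica_procs` is a hypothetical helper name, the spread path is taken from the output above, and `/opt/vertica/bin/vertica` is assumed to be the server binary path of a default install.

```shell
# Hypothetical helper: reads a `ps -ef` style listing on stdin and keeps
# only lines for the spread daemon or the Vertica server process.
# Assumption: Vertica is installed under /opt/vertica (default location).
find_vertica_procs() {
    grep -E '/opt/vertica/(spread/sbin/spread|bin/vertica)'
}

# Typical use on a DR node (run as the database administrator user):
#   ps -ef | find_vertica_procs
```

A node that prints a spread line while the database is reported as not running is a likely candidate for the errant node.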
If all nodes are rebooted ungracefully (for example, after a sudden shutdown caused by a power outage) and then restarted, the spread process may not start correctly, which prevents the system from shutting down and restarting cleanly.
Stop the errant Vertica node (v_drdata_node0002 in the example above) via:
/opt/vertica/bin/adminTools -> Advanced Menu -> Stop Vertica on Host
This will show the following:
This is a standard warning shown when stopping a host. The only data that can be lost is data currently being processed (i.e., data already written to the DB is not affected). However, it is unlikely that any valid data is being processed in this scenario, because the DB is not functioning well enough to accept incoming data from the Data Aggregator (DA).
Once Vertica stops on the host, you can then restart the DB:
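As a non-interactive sketch of the stop and restart steps, the same actions can usually be driven from the adminTools command line. This assumes the `stop_host` tool (with `-s <host>`) and the `start_db` tool (with `-d <database>`) are available in your Vertica version; confirm with `adminTools -a` before relying on it. The function names and the `DRY_RUN` switch are illustrative only. Note that `-s` expects the host name or IP address of the node, not the Vertica node name.

```shell
# Sketch only: prints the adminTools commands instead of running them
# unless DRY_RUN=0 is set in the environment.
ADMINTOOLS=/opt/vertica/bin/adminTools
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop the Vertica process on one host (assumed tool: stop_host).
stop_vertica_on_host() {
    run "$ADMINTOOLS" -t stop_host -s "$1"
}

# Start the database by name (assumed tool: start_db).
start_database() {
    run "$ADMINTOOLS" -t start_db -d "$1"
}
```

For example, `stop_vertica_on_host <errant-host>` followed by `start_database drdata` prints the two commands for review; setting `DRY_RUN=0` runs them as the database administrator user.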