After an ungraceful shutdown of the Data Repository (DR), the database does not restart, and the following error (or a similar one) is shown:
Trying to stop the Vertica DB via adminTools returns "DB not running". This indicates that a process failed to start or stop gracefully and is now running out of sync with the rest of the Vertica process stack.
On DR node 2 () from the example above, the spread process is still running (no other node is running it):
dradmin 17610 1 0 Mar23 ? 00:21:51 /opt/vertica/spread/sbin/spread -c /mnt/ext4path/catalog/drdata/v_drdata_node0001_catalog/spread.conf -D /opt/vertica/spread/tmp
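To check each DR node for a lingering spread process like the one above, a small filter over `ps -ef` output can help. The following is a minimal sketch, not part of the original procedure: the function name `spread_pids` is hypothetical, and it assumes the standard `ps -ef` column layout (PID in the second column, command in the eighth) and that the spread binary path ends in `/spread/sbin/spread`, as in the listing above.

```shell
# Hypothetical helper: print the PID of every spread daemon found in
# `ps -ef`-style input read from stdin.
# Assumes the usual `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD,
# so the PID is field 2 and the executable path is field 8.
spread_pids() {
    awk '$8 ~ /\/spread\/sbin\/spread$/ { print $2 }'
}

# Typical use on a DR node (prints nothing if spread is not running):
#   ps -ef | spread_pids
```

Running this on every node in the cluster shows which host still has spread running; an empty result means no matching process was found on that host.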
If all nodes go down ungracefully (for example, a sudden shutdown due to a power outage) and are then restarted, the spread process may not start correctly, which prevents the system from shutting down and restarting cleanly.
DX NetOps: CAPM 3.7.x and later
Stop the errant Vertica node (in the above example -) via
This will show the following:
This is a standard warning shown when stopping a host. Only data that is currently being processed would be lost (i.e., data already written to the DB is not affected). However, it is unlikely that any valid data is being processed in this scenario, since the DB is not functioning well enough to accept incoming data from the Data Aggregator (DA).
Once Vertica has stopped on the host, you can restart the DB: