Vertica node crashed in DX NetOps Data Repository

Article ID: 48638

Products

CA Performance Management - Usage and Administration DX NetOps

Issue/Introduction

In a multi-node, K-safe Vertica cluster, one of the nodes has crashed. The vertica.log file on the failed node shows the following exception:

2013-08-19 08:09:05.086 Main:0x1ded8080 <PANIC> @v_repository_node0003: VX001/2973: Data consistency problems found; startup aborted
HINT: Check that all file systems are properly mounted. Also, the --force option can be used to delete corrupted data and recover from the cluster
LOCATION: mainEntryPoint, /scratch_a/release/vbuild/vertica/Basics/vertica.cpp:1055
2013-08-19 08:09:05.142 Main:0x1ded8080 [Main] <PANIC> Wrote backtrace to ErrorReport.txt
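
To confirm that a node is hitting this condition, it can help to pull only the PANIC entries out of vertica.log. The following is a minimal sketch, not an official procedure: the function name find_panics and the example path are hypothetical, and the two lines of trailing context are an assumption chosen so that the HINT and LOCATION lines are printed with each entry.

```shell
# Print every <PANIC> entry in a Vertica log file, followed by the next
# two lines so that the HINT and LOCATION lines are shown as well.
# find_panics is a hypothetical helper name, not part of Vertica.
find_panics() {
    log_file="$1"
    # -F matches the literal string "<PANIC>"; -A 2 adds two lines of context.
    grep -A 2 -F '<PANIC>' "$log_file"
}

# Example call (hypothetical path; use the vertica.log of the failed node):
# find_panics /path/to/catalog/vertica.log
```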

Environment

All supported DX NetOps Performance Management releases

Resolution

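To force recovery of the database on the failed node, run the following command as the database administrator user (default: dradmin) from the /opt/vertica/bin directory (the default path). The -F option forces the restart; as the hint in the log states, forcing recovery deletes the corrupted data on that node and recovers it from the rest of the cluster.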

./admintools -t restart_node -d <DATABASE_NAME> -p <PASSWORD> -F -s <IP_Address_of_Node>

Open admintools and choose option #1, "View Database Cluster State". The status of the node should be "Recovering". Recovery may take a significant amount of time, depending on the size of the database. Once it completes, the database starts successfully on the node.
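
As an alternative to checking the menu repeatedly, the node state can be polled from the command line. The sketch below assumes that vsql is on the PATH, that it is run on a node that is already up, and that the node state can be read from the node_state column of the nodes system table. The function names, the DB_USER and DB_PASSWORD variables, and the default interval and retry count are all hypothetical; the node name in the example is taken from the log excerpt above.

```shell
# Print the current state (for example RECOVERING or UP) of one node.
# Assumes vsql is on PATH and DB_PASSWORD is set in the environment.
get_node_state() {
    node_name="$1"
    # -A and -t print the bare value without alignment or headers.
    vsql -U "${DB_USER:-dradmin}" -w "$DB_PASSWORD" -At \
        -c "SELECT node_state FROM nodes WHERE node_name = '${node_name}';"
}

# Poll until the node reports UP. Returns 0 on success, 1 if the node
# is still not UP after max_tries checks.
wait_for_node_up() {
    node_name="$1"
    interval="${2:-60}"     # seconds between checks (assumed default)
    max_tries="${3:-120}"   # number of checks before giving up (assumed default)
    try=1
    while [ "$try" -le "$max_tries" ]; do
        state="$(get_node_state "$node_name")"
        echo "attempt ${try}: ${node_name} is ${state:-UNKNOWN}"
        if [ "$state" = "UP" ]; then
            return 0
        fi
        # Wait before the next check, but not after the final one.
        if [ "$try" -lt "$max_tries" ]; then
            sleep "$interval"
        fi
        try=$((try + 1))
    done
    return 1
}

# Example call: check every 60 seconds, up to 120 times.
# wait_for_node_up v_repository_node0003 60 120
```

Because recovery time depends on the size of the database, the interval and retry count are worth adjusting to the environment rather than relying on the defaults shown here.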

Note: Run ./admintools -h to get a full list of admintools options.