Data corruption causes Data Repository Vertica Node to crash

Article ID: 17362

Updated On:

Products

CA Performance Management Network Observability

Issue/Introduction

A non-clustered Vertica node may crash because of corrupt SAL files. SAL (Storage Abstract Layer) refers to the data files, which consist of index files and the actual data. The signature of this problem is a sequence like the following in vertica.log:

2014-09-25 08:46:14.665 Main:0x4e60060 [Init] <INFO> SAL initialization complete
2014-09-25 08:46:14.665 Main:0x4e60060 [SAL] <INFO> Storage Location /data/CADR_DB01/v_cadr_db01_node0001_data Exists
2014-09-25 08:46:27.344 Main:0x4e60060 <WARNING> @v_XXX_db01_node0001: 01000/3938: MiniRos 45035997040846225 does not have proper SAL files
2014-09-25 08:46:27.383 Main:0x4e60060 <PANIC> @v_XXX_db01_node0001: VX001/2973: Data consistency problems found; startup aborted
HINT: Check that all file systems are properly mounted. Also, the --force option can be used to delete corrupted data and recover from the cluster
LOCATION: mainEntryPoint, /scratch_a/release/vbuild/vertica/Basics/vertica.cpp:1055
2014-09-25 08:46:28.391 Main:0x4e60060 [Main] <PANIC> Wrote backtrace to ErrorReport.txt
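
To confirm that a node's log shows this signature, searching for the two messages is usually enough. The following is a minimal sketch, not a supported tool: the function name `has_sal_corruption_signature` is hypothetical, and the location of vertica.log depends on where the catalog directory lives in your installation.

```shell
#!/bin/sh
# Hypothetical helper (not part of Vertica or DX NetOps).
# Exits 0 only if the given log file contains both the SAL warning
# and the startup PANIC shown in the excerpt above.
has_sal_corruption_signature() {
    log_file="$1"
    grep -q 'does not have proper SAL files' "$log_file" &&
        grep -q 'Data consistency problems found; startup aborted' "$log_file"
}
```

Usage, with the log path supplied by you: `has_sal_corruption_signature /path/to/vertica.log && echo "SAL corruption signature found"`.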

Environment

All supported DX NetOps Performance Management releases

Cause

Corrupted data files

Resolution

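Starting the database with the force option has been shown to clear the corruption and allow the database to start. The -F flag forces startup; as the HINT in the log above notes, the force option can delete the corrupted data.

Run the following command from the /opt/vertica/bin directory (the default install path) as the dradmin user or an equivalent user.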

./admintools -t start_db -d <DB_Name> -F
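
If the command needs to be scripted, a small wrapper can make the database name explicit and allow a dry run first. This is a sketch under stated assumptions: the function name `force_start_db` and the `VERTICA_BIN` and `DRY_RUN` variables are hypothetical conveniences, and only the admintools invocation itself comes from this article.

```shell
#!/bin/sh
# Assumption: admintools lives in the default path unless VERTICA_BIN is set.
VERTICA_BIN="${VERTICA_BIN:-/opt/vertica/bin}"

# Hypothetical wrapper around the forced start shown above.
# Run it as the dradmin or equivalent user.
force_start_db() {
    db_name="$1"
    if [ -z "$db_name" ]; then
        echo "usage: force_start_db <DB_Name>" >&2
        return 2
    fi
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$VERTICA_BIN/admintools -t start_db -d $db_name -F"
        return 0
    fi
    "$VERTICA_BIN/admintools" -t start_db -d "$db_name" -F
}
```

Running `DRY_RUN=1 force_start_db <DB_Name>` first shows the exact command before anything is changed on the node.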