Error 'KeyError: oid' preventing Vertica database startup after unclean shutdown

Article ID: 429360

Products

CA Performance Management Network Observability

Issue/Introduction

Vertica fails to start after an unclean shutdown. Startup stops while reading the catalog, with a Python "KeyError: 'oid'" error.

SYMPTOMS:

Database fails to start with "Unable to read database catalogs".

AdminTools "Roll Back Database to Last Good Epoch" fails because Epoch.log is missing.

Forced start (admintools -t start_db -F) exits early during catalog initialization.

AdminTools log shows a Traceback in compute_vdatabase.py at self.oid = nodedeets['oid'].

Environment

Database: Vertica 23.4.0.12 

Product: NetOps 24.3.1 

Setup: Disaster Recovery deployment with two identical Data Repository environments, each running a single Vertica node

Resolution

PREREQUISITES:

Administrator access (dradmin).
Access to a redundant or healthy Vertica node if performing a copy cluster.

STEPS:

1. ASSESS CATALOG STATUS:

Check whether the catalog directory is empty and whether the Vertica port (5433 by default) is listening.

Command: ss -atupn

EXPECTED: The Vertica port is not listening, and the catalog directory may be empty if the catalog is corrupted.
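
The two checks in this step can be scripted. The following is a minimal sketch, not part of the product tooling: the function names are hypothetical, port 5433 is the Vertica default and may differ in your environment, and the catalog path must be supplied by you.

```shell
#!/usr/bin/env bash
# Sketch only. Function names are hypothetical helpers, not Vertica commands.

# Returns 0 if a TCP socket is listening on the given port.
port_is_listening() {
    local port="$1"
    # ss -ltn: listening TCP sockets, numeric. Column 4 is "Local Address:Port".
    ss -ltn 2>/dev/null |
        awk -v p=":${port}\$" 'NR > 1 && $4 ~ p { found = 1 } END { exit !found }'
}

# Returns 0 if the directory exists and has no entries, 1 if it has entries,
# 2 if the path is not a directory.
dir_is_empty() {
    local dir="$1"
    [ -d "$dir" ] || return 2
    [ -z "$(ls -A "$dir")" ]
}

# Example usage (the catalog path is an assumption; substitute your own):
#   port_is_listening 5433 || echo "Vertica port is not listening"
#   dir_is_empty /path/to/catalog && echo "Catalog directory is empty"
```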

 

2. ATTEMPT CATALOG RECOVERY (OPTIONAL):

Try a forced start to attempt metadata recovery.

Command: /opt/vertica/bin/admintools -t start_db -d [database_name] -F 

NOTE: If this fails with 'KeyError: oid', the catalog is likely too corrupted for local recovery.
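
To confirm that a failed forced start hit this specific error, the AdminTools log can be searched for the traceback. This is a sketch under assumptions: the helper name is hypothetical, and the log location /opt/vertica/log/adminTools.log is the usual default, which may differ on your system.

```shell
#!/usr/bin/env bash
# Sketch only. log_has_oid_keyerror is a hypothetical helper.

# Returns 0 if the given log file contains the KeyError on 'oid'
# (with or without quotes around oid).
log_has_oid_keyerror() {
    local log="$1"
    grep -Eq "KeyError: '?oid'?" "$log"
}

# Example usage (log path is an assumed default):
#   /opt/vertica/bin/admintools -t start_db -d "$DB_NAME" -F
#   if log_has_oid_keyerror /opt/vertica/log/adminTools.log; then
#       echo "Catalog is likely too corrupted for local recovery; go to step 3"
#   fi
```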

 

3. PERFORM COPY CLUSTER FROM REDUNDANT NODE:

If a redundant node or healthy environment exists, use copycluster to restore the catalog and data.

Command: /opt/vertica/bin/vbr.py --task copycluster --config-file /opt/vertica/config/copycluster.ini 

EXPECTED: Data syncs to the destination cluster and the catalog is reinitialized.
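
The article does not show the contents of copycluster.ini. As a rough illustration, the sketch below writes a minimal single-node configuration. The function name, database name, user, and host are all hypothetical, and the section and key names should be checked against the vbr documentation for the installed Vertica version before use.

```shell
#!/usr/bin/env bash
# Sketch only. write_copycluster_ini is a hypothetical helper; every value
# passed to it is an assumption to be replaced with your own.

# Writes a minimal single-node copycluster configuration to the given file.
# Arguments: output file, database name, database user, destination host.
write_copycluster_ini() {
    local out="$1" db_name="$2" db_user="$3" dest_host="$4"
    cat > "$out" <<EOF
[Misc]
snapshotName = copycluster_${db_name}

[Database]
dbName = ${db_name}
dbUser = ${db_user}
dbPromptForPassword = True

[Mapping]
v_${db_name}_node0001 = ${dest_host}
EOF
}

# Example usage (all values are placeholders):
#   write_copycluster_ini /opt/vertica/config/copycluster.ini drdata dradmin dest-host.example.com
```

The [Mapping] section pairs each source node name with its destination host, so a cluster with more nodes needs one line per node.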

 

4. COMPLETE REBUILD (IF PREVIOUS STEPS FAIL):


If copycluster fails with "Catalog bootstrap failed", a complete rebuild of the database may be required.

Action: Back up and delete existing /data and /catalog directories, then re-run the restore or copycluster process.
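
One cautious way to do the backup and removal in a single step is to move each directory aside under a timestamped name instead of deleting it. This is a sketch only: the helper name is hypothetical, and the data, catalog, and backup paths are placeholders for your own locations.

```shell
#!/usr/bin/env bash
# Sketch only. move_aside is a hypothetical helper, not a Vertica command.

# Moves a directory into backup_root under "<name>.<timestamp>" and prints
# the new location. The original path no longer exists afterwards.
move_aside() {
    local src="$1" backup_root="$2"
    local stamp dest
    [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
    mkdir -p "$backup_root" || return 1
    stamp="$(date +%Y%m%d-%H%M%S)"
    dest="${backup_root}/$(basename "$src").${stamp}"
    mv -- "$src" "$dest" && echo "$dest"
}

# Example usage (paths are placeholders; the database must be stopped first):
#   move_aside /path/to/data    /path/to/backups
#   move_aside /path/to/catalog /path/to/backups
```

Keeping the old directories until the rebuilt database is confirmed UP leaves a way back if the restore itself fails.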

VERIFY SUCCESS:

Run /opt/vertica/bin/admintools -t list_allnodes.
Confirm the Node State is UP.
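
The check can also be scripted. The sketch below assumes list_allnodes prints a pipe-separated table whose third column is the node state. That layout is an assumption to verify against your own output, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only. all_nodes_up is a hypothetical helper. It assumes a
# pipe-separated table on stdin with the node state in the third column.

# Returns 0 if at least one node row is present and every row's state is UP.
all_nodes_up() {
    awk -F'|' '
        NF >= 3 {
            state = $3
            gsub(/[[:space:]]/, "", state)
            if (state == "State") next      # skip the header row
            total++
            if (state != "UP") down++
        }
        END { exit !(total > 0 && down == 0) }
    '
}

# Example usage:
#   /opt/vertica/bin/admintools -t list_allnodes | all_nodes_up \
#       && echo "All nodes are UP" || echo "At least one node is not UP"
```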

Additional Information

Running Data Repository on a single Vertica node is not recommended.

Run a cluster of at least three nodes to provide redundancy.