Vertica DB / DR member does not come up after unplanned outage

Article ID: 194800

Products

CA Infrastructure Management
CA Performance Management - Usage and Administration
DX NetOps

Issue/Introduction

The Unix team applied an antivirus (AV) patch that inadvertently rebooted the Data Repository (DR) servers.

I attempted to restore to the last good epoch. This succeeded on the primary and on one of the secondary servers, but the other secondary server did not start up. See the following excerpt from the SSH log:

"host ['xx.x.xxx.xx] report: @v_drdata_node0003: VX001/3231: Error on opening file for write [/opt/ca/pm/dr/catalog/drdata/v_drdata_node0003_catalog/Catalog/sequencegenerator.cat]: Operation not permitted
        LOCATION:  serializeGlobalSequenceObject, /scratch_a/release/svrtar14870/vbuild/vertica/Basics/GlobalSequence.cpp:45
Do you want to continue waiting? (yes/no) [yes] yes
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
        Node Status: v_drdata_node0001: (UP) v_drdata_node0002: (UP) v_drdata_node0003: (DOWN) 
Nodes UP: v_drdata_node0002, v_drdata_node0001
Nodes DOWN: v_drdata_node0003 (may be still initializing).
Found these errors in startup.logs on hosts:
host ['xx.x.xxx.xx'] report: @v_drdata_node0003: VX001/3231: Error on opening file for write [/opt/ca/pm/dr/catalog/drdata/v_drdata_node0003_catalog/Catalog/sequencegenerator.cat]: Operation not permitted
        LOCATION:  serializeGlobalSequenceObject, /scratch_a/release/svrtar14870/vbuild/vertica/Basics/GlobalSequence.cpp:45
Do you want to continue waiting? (yes/no) [yes] no
        Server startup was successful on some nodes, but not complete"
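
When the startup output is long, it can help to extract which nodes never reached UP. The following is a minimal sketch, assuming the output has been saved as text and that every status line follows the "Node Status: name: (STATE)" layout shown in the excerpt above. The function name `nodes_not_up` is hypothetical and is not part of any Vertica tooling.

```python
import re
from typing import Dict, List

# One "node_name: (STATE)" pair, as printed on the "Node Status:" lines
# in the excerpt above. The layout is assumed from that excerpt only.
NODE_STATE = re.compile(r"(\w+):\s*\(([A-Z_]+)\)")


def nodes_not_up(output: str) -> List[str]:
    """Return the nodes whose last reported state is anything other than UP."""
    latest: Dict[str, str] = {}
    for line in output.splitlines():
        if "Node Status:" not in line:
            continue
        # Keep only the text after the "Node Status:" label.
        _, _, rest = line.partition("Node Status:")
        for name, state in NODE_STATE.findall(rest):
            latest[name] = state  # later lines overwrite earlier ones
    return sorted(name for name, state in latest.items() if state != "UP")
```

Because later status lines overwrite earlier ones, a node that eventually comes up is not reported.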

Cause

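Antivirus software was active on the Data Repository (Vertica) directories. The startup error on v_drdata_node0003 ("Operation not permitted" when opening sequencegenerator.cat for write) is consistent with the antivirus client blocking access to the catalog files.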

Environment

CAPM 3.x

Resolution

Our documentation lists the antivirus exclusions that must be in place for the Data Repository (DR):

https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/it-operations-management/performance-management/3-7/installing/prepare-to-install-the-data-repository.html

To avoid database corruption, exclude the installation directory, and all its subdirectories, from antivirus scans. Prevent scanning by a local instance of an antivirus client and scanning by a remote antivirus instance. Exclude the following directories:

  • /opt/vertica/*
  • /opt/vconsole/*
  • The specified data directory
    • Default: /drdata/data
  • The specified catalog directory
    • Default: /drdata/catalog
  • Vertica temporary files in /tmp
    • /tmp/4803
    • /tmp/vbr/*
  • The directory where you back up the Data Repository
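
The list above depends on where the data, catalog, and backup directories were placed at install time. The following is a minimal sketch, in Python, of assembling the full exclusion list for a given host so it can be handed to the team that manages the antivirus client. The function name `av_exclusions` and its parameters are hypothetical; the fixed paths and defaults are the ones quoted above.

```python
from typing import List, Optional


def av_exclusions(
    data_dir: str = "/drdata/data",        # documented default data directory
    catalog_dir: str = "/drdata/catalog",  # documented default catalog directory
    backup_dir: Optional[str] = None,      # site-specific, no documented default
) -> List[str]:
    """Build the antivirus exclusion list for one Data Repository host."""
    paths = [
        "/opt/vertica/*",
        "/opt/vconsole/*",
        data_dir,
        catalog_dir,
        "/tmp/4803",
        "/tmp/vbr/*",
    ]
    if backup_dir:
        paths.append(backup_dir)
    return paths


if __name__ == "__main__":
    # Assumption: the catalog path in the error message suggests a
    # non-default catalog directory on this host; confirm it locally.
    for path in av_exclusions(catalog_dir="/opt/ca/pm/dr/catalog"):
        print(path)
```

In the environment from the log excerpt, the failing file sits under /opt/ca/pm/dr/catalog, so the default /drdata/catalog entry alone would likely not cover it. Check the actual data and catalog locations on each node before submitting the list.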

After the exclusions are in place, the Vertica database should come up normally.