Data Aggregator fails to stay running

Article ID: 373940

Products

Network Observability CA Performance Management

Issue/Introduction

Data Aggregator fails to stay running. It starts for a moment before failing.

Data Aggregator does not start after a scheduled reboot of the server. The server was rebooted following the recommended procedure, but the issue persists.

The Data Aggregator (DA) is unable to connect to Portal and fails to synchronize with Portal as a Data Source.

The following ERROR message is seen in the karaf.log file (default path shown): /opt/IMDataAggregator/apache-karaf/data/log/karaf.log. The key text is the Vertica error "Duplicate primary/unique key detected in join". The tables named in the error may differ depending on the system.

ERROR | eLoader-thread-2 | 2024-11-13T21:02:32,669 | ExceptionLog | .ca.im.core.util.ExceptionLogger   99 | m.ca.im.common.core.util |       | A NEW application exception occurred (Key=986b7dcef703dec156db51c235cbc1e078873f32) : Failed to load poll_item attribute data. : StatementCallback; SQL [select m.item_id, m.is_filtered from poll_item m inner join item i on i.item_id = m.item_id]; [Vertica][VJDBC](3149) ERROR: Duplicate primary/unique key detected in join [(dauser.item x dauser.poll_item) using item_super and poll_item_order_by_device_item_id_first_seg_b0 (PATH ID: 1)]; value [73519];
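
To check quickly whether a given karaf.log contains this error, a small shell sketch such as the following may help. It assumes the default log path shown above and GNU or POSIX grep and sed. The function name find_duplicate_key_errors and the KARAF_LOG variable are hypothetical names used only for illustration; they are not part of the product.

```shell
#!/bin/sh
# Default karaf.log location from this article; override KARAF_LOG for other installs.
KARAF_LOG="${KARAF_LOG:-/opt/IMDataAggregator/apache-karaf/data/log/karaf.log}"

# Hypothetical helper: print the joined tables and the duplicated value
# for each "Duplicate primary/unique key" error found in the given log file.
find_duplicate_key_errors() {
    grep -F 'Duplicate primary/unique key detected in join' "$1" |
        sed -n 's/.*detected in join \[(\([^)]*\)).*value \[\([0-9]*\)].*/tables=\1 value=\2/p' |
        sort -u
}

# Only scan the log when it exists and is readable on this host.
if [ -r "$KARAF_LOG" ]; then
    find_duplicate_key_errors "$KARAF_LOG"
fi
```

Each output line names the tables involved in the join and the duplicated value, which can then be used in the vsql queries in the Resolution section.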

Environment

All supported Network Observability DX NetOps Performance Management Data Aggregator releases

Cause

An item is duplicated in the Data Repository DB. The error is severe enough to prevent successful Data Aggregator startup.

Resolution

Use the following steps to identify the duplicate Item IDs so that they can be deleted from the Vertica database.

Gather the following details. Provide them to support in a new support case for further assistance with problem resolution.

  1. Run the following vsql queries to identify the duplicates in the poll_item table.
    • Open a vsql prompt on the Data Repository (DR) database as the dauser user.
    • Run the following to determine the number of duplicates and their Item_ID values.
      • select * from poll_item where item_id in (select item_id from dauser.poll_item group by 1 having count(*) > 1) order by 1;
    • Run the following to determine the epoch each duplicate is related to. This sample is for a single Item_ID value.
      • select *, epoch from dauser.poll_item where item_id=8345224;
      • Contact support for assistance with DB queries to determine this for a list of Item_ID values.
  2. Run the etlHealth.sh script from the (default path) /opt/CA/IMDataRepository_vertica23 directory.
    1. Sample command using the documented defaults dauser and dapass.
      • ./etlHealth.sh dauser dapass
      • The command will fail and print CLI output describing what steps to take.
      • The CLI output is also written to a log file, whose path is shown in the on-screen output.
    2. Review the output file. Run the recommended caVerticaUtility.sh command.
  3. Run the CARE re.sh script on the Data Aggregator.
  4. Open a new support case referencing this KB article. Attach to the new case:
    • The vsql command output
    • The etlHealth.sh script output log
    • The caVerticaUtility.sh data package
    • The DA re.sh log package
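
The vsql queries from step 1 can be saved to files so that the output is ready to attach to the support case. The following is a minimal sketch, not a supported tool. It assumes vsql is on the PATH of the DR host and that the database name is supplied through a DR_DB variable. The variable names (DR_USER, DR_PASS, DR_DB, OUT_DIR) and helper function names are hypothetical. Only the two SQL statements come from this article.

```shell
#!/bin/sh
# Sketch only: all variable and function names here are illustrative.
DR_USER="${DR_USER:-dauser}"   # documented default user
DR_PASS="${DR_PASS:-dapass}"   # documented default password
OUT_DIR="${OUT_DIR:-/tmp/da_duplicates}"

# Query from step 1: every poll_item row whose item_id appears more than once.
duplicates_sql() {
    printf '%s\n' 'select * from poll_item where item_id in (select item_id from dauser.poll_item group by 1 having count(*) > 1) order by 1;'
}

# Query from step 1 for a single Item_ID; rejects anything that is not a number.
epoch_sql() {
    case "$1" in
        ''|*[!0-9]*) echo "item_id must be numeric: $1" >&2; return 1 ;;
    esac
    printf 'select *, epoch from dauser.poll_item where item_id=%s;\n' "$1"
}

# Run one statement through vsql and write the result to a file.
run_to_file() {
    vsql -U "$DR_USER" -w "$DR_PASS" -d "$DR_DB" -c "$1" -o "$2"
}

# Nothing runs unless vsql is installed and DR_DB has been set by the operator.
if command -v vsql >/dev/null 2>&1 && [ -n "${DR_DB:-}" ]; then
    mkdir -p "$OUT_DIR"
    run_to_file "$(duplicates_sql)" "$OUT_DIR/poll_item_duplicates.txt"
    # Item_ID values passed as script arguments are queried one at a time.
    for id in "$@"; do
        sql=$(epoch_sql "$id") && run_to_file "$sql" "$OUT_DIR/poll_item_${id}_epoch.txt"
    done
fi
```

Both statements are read-only. The sketch only collects output and does not delete anything from the database. Passing the password with -w makes it visible in the process list, so consider entering it interactively on shared hosts.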