Fault Tolerant Data Aggregator failover to secondary fails

Article ID: 252904

Updated On:

Products

Network Observability CA Performance Management

Issue/Introduction

After stopping the primary Data Aggregator (DA) in a Fault Tolerant (FT) DA pair, the secondary DA never started, and operation failed back to the primary.

The Data Aggregator fails to start on a new CA Performance Management installation with the error: "kahadb is locked by another server"

After installing a new DA, it does not start up correctly. The following is seen for the dadaemon service when using the systemctl command.

[admin@da01 ~]$ systemctl status dadaemon
● dadaemon.service - Data Aggregator
   Loaded: loaded (/etc/systemd/system/dadaemon.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

[admin@da01 ~]$ sudo systemctl start dadaemon
Job for dadaemon.service failed because the control process exited with error code.
See "systemctl status dadaemon.service" and "journalctl -xe" for details.

The following is shown in the /opt/IMDataAggregator/broker/apache-activemq-5.18.6/data/activemq.log file (default path shown).

2022-10-18 09:55:03,683 | INFO  | Database /opt/da-shared/broker/kahadb/lock is locked by another server. This broker is now in slave mode waiting a lock to be acquired | org.apache.activemq.store.SharedFileLocker | main
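The log message indicates that ActiveMQ could not acquire the KahaDB file lock, so the broker dropped into slave mode. A quick sketch to inspect the lock file (the path below comes from the log line above; the availability of lsof is an assumption):

```shell
# Path taken from the activemq.log message above.
LOCK=/opt/da-shared/broker/kahadb/lock

# Check that the lock file exists and which user owns it.
ls -l "$LOCK" 2>/dev/null || echo "lock file not present"

# If lsof is installed, show which process (if any) holds the file open.
command -v lsof >/dev/null && lsof "$LOCK" 2>/dev/null
```

If the lock file is owned by a different user than the DA install user, or the DA user cannot write to its directory, the broker cannot acquire the lock and stays in slave mode.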

Environment

All supported Network Observability DX NetOps Performance Management Fault Tolerant Data Aggregator releases

Cause

The issue is caused by the install user not having full read/write permissions on the shared disk where the DA is installed.

Resolution

Ensure the Data Aggregator install and process owner user can access the shared data directory from both Data Aggregator systems, and can read and write files in that location.
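A minimal sketch of that access check, to be run as the DA install/process owner on both DA hosts. The directory path is a placeholder; substitute the value of da.data.home from your environment:

```shell
# Placeholder path; replace with your actual shared data directory.
SHARED_DIR=/opt/da-shared

# Report whether the current user has read, write, and traverse access.
check_rw() {
  dir="$1"
  if [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
    echo "OK: read/write access to $dir"
  else
    echo "FAIL: missing read/write access to $dir"
  fi
}

check_rw "$SHARED_DIR"
```

If the check fails on either host, correct the ownership or permissions on the shared disk (and its mount/export options, if NFS is used) before restarting the dadaemon service.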

Additional Information

Unsure where the shared data directory used by the DAs is located? It is defined by the da.data.home variable in /etc/DA.cfg.
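That lookup can be done with a one-line grep (the /etc/DA.cfg path and da.data.home variable name are from this article; the exact key/value format in the file may vary by release):

```shell
# Print the shared data directory setting from the DA configuration file.
grep 'da.data.home' /etc/DA.cfg 2>/dev/null || echo "DA.cfg not found"
```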