The Data Aggregator dadaemon service fails to start or stay running

Article ID: 223578

Products

CA Performance Management - Usage and Administration

DX NetOps

Issue/Introduction

After upgrading DX NetOps Performance Management, the Data Aggregator dadaemon service fails to start or stay running.

The Data Aggregator is installed using the root user, but is owned and run by a non-root sudo user named dauser.

The status command for the service (systemctl status dadaemon) shows that it fails to start and suggests running the "journalctl -xe" command.

That command shows permission errors when creating or writing to several required files:

-- Unit dadaemon.service has begun starting up.
...
Sep 09 14:16:01 data-agg_host dadaemon[4537]: .lock file found, backing up data & deploy, clearing karaf cache
Sep 09 14:16:01 data-agg_host dadaemon[4537]: Starting IM Data Aggregator
...
Sep 09 14:16:01 data-agg_host dadaemon[4537]: start: Redirecting Karaf output to /opt/IMDataAggregator/apache-karaf-4.2.6/data/karaf.out
...
Sep 09 14:16:01 data-agg_host dadaemon[4537]: /opt/IMDataAggregator/apache-karaf-4.2.6/bin/start: line 95: /opt/IMDataAggregator/apache-karaf-4.2.6/data/karaf.out: Permission denied
Sep 09 14:16:01 data-agg_host dadaemon[4625]: Stopping IM Data Aggregator.
...
Sep 09 14:16:02 data-agg_host dadaemon[4625]: mkdir: cannot create directory '/opt/IMDataAggregator/apache-karaf-4.2.6/data/log': Permission denied
Sep 09 14:16:02 data-agg_host dadaemon[4625]: mkdir: cannot create directory '/opt/IMDataAggregator/apache-karaf-4.2.6/data/tmp': Permission denied
Sep 09 14:16:02 data-agg_host dadaemon[4625]: OpenJDK 64-Bit Server VM warning: Ignoring option UnsyncloadClass; support was removed in 11.0
Sep 09 14:16:02 data-agg_host dadaemon[4625]: /opt/IMDataAggregator/apache-karaf-4.2.6/data/port shutdown port file doesn't exist. The container is not running.
...
Sep 09 14:16:02 data-agg_host dadaemon[4625]: Error stopping the Data Aggregator, error code=3
Sep 09 14:16:02 data-agg_host systemd[1]: dadaemon.service: control process exited, code=exited status=3
Sep 09 14:16:02 data-agg_host systemd[1]: Failed to start Data Aggregator.
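
The permission errors in the journal excerpt above can be isolated from the surrounding startup messages with a small filter. This is a minimal sketch, not part of the product: the function name permission_errors is hypothetical, and it only assumes that the relevant lines contain the literal text "Permission denied", as they do in the excerpt.

```shell
# Keep only the journal lines that report a permission failure.
# Reads journal text on stdin and prints the matching lines.
permission_errors() {
    grep -F 'Permission denied'
}

# Typical use on the Data Aggregator host (reads the unit's journal):
#   journalctl -u dadaemon --no-pager | permission_errors
```

Each path reported this way is a candidate for the ownership check described next.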

An ownership check shows that the data directory under /opt/IMDataAggregator/apache-karaf-<version> (default path) is owned by root, not dauser.

A review of the dadaemon.service and activemq.service files in /etc/systemd/system shows that neither has the correct User= value identifying dauser as the service owner.
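
Both checks can be run from a shell. The sketch below assumes a Linux host with GNU stat; the helper names owner_of and unit_user are hypothetical, and the karaf version in the usage comment is the one from the log excerpt, so adjust it to match your installation.

```shell
# Print the owning user of a file or directory (GNU stat syntax).
owner_of() {
    stat -c '%U' "$1"
}

# Print the value of the User= line in a systemd unit file.
# Prints nothing when the unit file has no User= line.
unit_user() {
    sed -n 's/^User=//p' "$1"
}

# Typical use on the Data Aggregator host:
#   owner_of /opt/IMDataAggregator/apache-karaf-4.2.6/data
#   unit_user /etc/systemd/system/dadaemon.service
#   unit_user /etc/systemd/system/activemq.service
```

In the failure described here, the first command would report root, and the other two would print nothing or a user other than the DA install owner.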

Cause

The upgrade was run while the root-owned cron job that restarts dadaemon every minute was still enabled. During the upgrade, the cron job tried to start dadaemon and broke the installation.
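
Before an upgrade, it can help to confirm whether such a cron job is still active. The following is a minimal sketch: the function name active_cron_entries is hypothetical, and it assumes the cron entry mentions "dadaemon" somewhere on its line, which may not hold if the job calls a wrapper script with a different name.

```shell
# Print active (not commented out) crontab lines that mention a keyword.
# Reads crontab text on stdin; the keyword defaults to "dadaemon".
active_cron_entries() {
    grep -v '^[[:space:]]*#' | grep -F -- "${1:-dadaemon}"
}

# Typical use as root on the Data Aggregator host:
#   crontab -l -u root | active_cron_entries
```

Any line printed is still enabled and would fire during an upgrade.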

Environment

All supported DX NetOps Performance Management releases

Resolution

The following steps were used to resolve this.

  1. Disable the cron job that restarts the DA. Log in as the user that owns it. Run "crontab -e" to edit the crontab. Add a "#" comment symbol as the first character of the line for that job. Save the changes.
  2. Stop dadaemon and activemq services.
    • systemctl stop dadaemon
    • systemctl stop activemq
  3. Run the following chown command from the (default path shown) /opt/IMDataAggregator directory:
    • chown -R userName .
    • Replace userName with the DA install owner user defined in your environment. For example, if the owner is dauser, run:
      • chown -R dauser .
    • If the owner is broadcom, run:
      • chown -R broadcom .
    • Note the period after userName. It refers to the current directory and must be included.
  4. Run the following commands (default paths shown). They will remove and recreate the service files.
    1. Uninstall the files.
      • /opt/IMDataAggregator/scripts/dadaemon uninstall
      • /opt/IMDataAggregator/scripts/activemq uninstall
    2. Install the files.
      • /opt/IMDataAggregator/scripts/dadaemon install
      • /opt/IMDataAggregator/scripts/activemq install
    3. Review each service file in /etc/systemd/system after the install.
      • Confirm all values present are correct.
      • Confirm the User= variable is present and set with the correct user name.
  5. Start the DA dadaemon service. It should also start activemq.
    • systemctl start dadaemon
  6. Confirm services are running:
    • systemctl status activemq
    • systemctl status dadaemon
  7. If the activemq service didn't start for any reason but dadaemon did, simply start the activemq service.
    • systemctl start activemq
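
Steps 2 through 6 above can be collected into one shell function for review before running them. This is a sketch under stated assumptions, not a supported script: the names run, repair_da, and DRY_RUN are hypothetical, the cron job from step 1 is assumed to be disabled already, and chown is given the install directory as an argument, which is equivalent to running "chown -R userName ." from inside that directory. The grep line only prints the User= values; the manual review in step 4 is still needed.

```shell
# Echo a command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: repair_da <da_user> [install_dir]
repair_da() {
    da_user=$1                            # DA install owner, e.g. dauser
    da_home=${2:-/opt/IMDataAggregator}   # default install path
    units=/etc/systemd/system

    # Step 2: stop both services.
    run systemctl stop dadaemon
    run systemctl stop activemq

    # Step 3: return ownership of the install tree to the DA user.
    run chown -R "$da_user" "$da_home" || return 1

    # Step 4: remove and recreate the service files, then show User=.
    run "$da_home/scripts/dadaemon" uninstall || return 1
    run "$da_home/scripts/activemq" uninstall || return 1
    run "$da_home/scripts/dadaemon" install || return 1
    run "$da_home/scripts/activemq" install || return 1
    run grep '^User=' "$units/dadaemon.service" "$units/activemq.service"

    # Steps 5 and 6: start the DA and report the state of both services.
    run systemctl start dadaemon
    run systemctl status activemq
    run systemctl status dadaemon
}
```

Setting DRY_RUN=1 and calling "repair_da dauser" prints the command sequence without changing anything, which is a reasonable first pass. Running it for real requires root, and step 7 (starting activemq manually) still applies if the final status output shows that activemq did not start.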