Data Collector dcmd service fails to remain started



Article ID: 226186



Products

CA Performance Management - Usage and Administration
DX NetOps

Issue/Introduction

The dcmd service cannot be started on a Data Collector after an ungraceful termination when the DX NetOps Performance Management Data Collector runs as a non-root user.

After the restart, the data directory and karaf.out are owned by root and the only process that starts is the IcmpDaemon.

Status checks of the dcmd service using systemctl show an active, running service, but only the IcmpDaemon process remains running. The Data Collector apache-karaf server process should be running as well.

Environment

All DX NetOps Performance Management releases 21.2.7 and older

Cause

When a Data Aggregator or Data Collector running as a non-root sudo user is shut down ungracefully (default paths shown):

  • The /opt/IMData*/apache-karaf-<version>/data and /opt/IMData*/apache-karaf-<version>/deploy directories are moved to data.bak and deploy.bak directories.
  • New directories to replace them are created as the root user.

As a result, the sudo install owner cannot write to those directories, which causes the dcmd or dadaemon service to fail during startup.
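To see whether an installation is in this state, the owner of the two directories can be compared with the install owner. The following is a minimal sketch, not a supported tool: check_owner is a hypothetical helper name, and it assumes GNU stat as found on typical Linux hosts.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): report the data and deploy
# directories under a Karaf home that are missing or not owned by the
# expected install owner.
# Usage: check_owner <karaf_home> <expected_user>
check_owner() {
    karaf_home=$1
    expected=$2
    status=0
    for dir in "$karaf_home/data" "$karaf_home/deploy"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            status=1
            continue
        fi
        # %U prints the owner's user name (GNU coreutils stat).
        owner=$(stat -c '%U' "$dir")
        if [ "$owner" != "$expected" ]; then
            echo "wrong owner ($owner): $dir"
            status=1
        fi
    done
    return $status
}
```

For example, check_owner /opt/IMDataCollector/apache-karaf-4.2.6 dcuser prints nothing and returns 0 when both directories exist and belong to dcuser. The path and user name are the defaults used elsewhere in this article and need to be adjusted to the local install.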

Resolution

This was resolved by engineering via defect DE517827. The fix is included in DX NetOps Performance Management releases 21.2.8 and newer.

The fix is referenced in the DX NetOps Performance Management 21.2.8 Fixed Issues List.

The entry for the problem states:

  • Symptom: When the data aggregator or the data collector run as a non-root user, if they shut down ungracefully, they move the data and the deploy directories to the .bak file. They then create new data and deploy as root. Subsequently, they cannot write to these directories, and fail to start.
  • Resolution: With this fix, the startup script creates the data and the deploy directories, and then changes the ownership of the directories (chown) to the user running the data aggregator or the data collector. The data aggregator or data collector can now write to these directories on startup.
  • (21.2.8, DE517827, 32881977, 32911183)

Additional Information

  1. NOTE: Always stop dcmd before activemq when stopping the services. When running as non-root, start activemq before dcmd.
  2. For versions 21.2.7 and earlier, use the following workaround to get around the issue (default paths shown):
    1. Confirm the /opt/IMDataCollector/apache-karaf-4.2.6/data directory exists.
      1. If there is no data directory, create it first using the command:
        1. mkdir /opt/IMDataCollector/apache-karaf-4.2.6/data
    2. Change ownership for the data directory by running the following command:
      1. chown -R dcuser:dcuser /opt/IMDataCollector/apache-karaf-4.2.6/data
      2. Replace dcuser with your DC install owner user.
    3. Is the activemq service running?
      1. Check using "systemctl status activemq"
      2. If it is down, start it using "systemctl start activemq"
    4. Is there an active dcmd service?
      1. Check using "systemctl status dcmd"
      2. If it is active, stop it using "systemctl stop dcmd"
      3. If it is inactive, or once it has stopped, start it using "systemctl start dcmd"
    5. Confirm via the "systemctl status dcmd" command that both the IcmpDaemon and apache-karaf server processes start and remain running.
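The workaround steps above can be collected into a small script. This is a sketch under stated assumptions rather than a supported procedure: fix_data_dir and restart_dcmd are hypothetical function names, the path and owner are passed in by the caller, and the commands need the same privileges as the manual steps.

```shell
#!/bin/sh
# Hypothetical helpers that mirror the manual workaround steps.

# Steps 1 and 2: make sure the data directory exists, then hand it to the
# install owner. Usage: fix_data_dir <karaf_home> <user:group>
fix_data_dir() {
    data_dir=$1/data
    owner=$2
    if [ ! -d "$data_dir" ]; then
        mkdir "$data_dir" || return 1
    fi
    chown -R "$owner" "$data_dir" || return 1
}

# Steps 3 to 5: activemq must be up before dcmd is started.
restart_dcmd() {
    # Start activemq only if it is not already running.
    systemctl is-active --quiet activemq || systemctl start activemq || return 1
    # Stop dcmd first if systemd still reports it as active.
    if systemctl is-active --quiet dcmd; then
        systemctl stop dcmd || return 1
    fi
    systemctl start dcmd || return 1
    # Show the result so both processes can be confirmed by eye.
    systemctl status dcmd
}
```

With the default paths from this article, a run would look like fix_data_dir /opt/IMDataCollector/apache-karaf-4.2.6 dcuser:dcuser followed by restart_dcmd, with dcuser replaced by the DC install owner. The final status output still needs to be checked manually to confirm that both the IcmpDaemon and apache-karaf server processes remain running.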