After upgrade of CA Performance Management (CAPM) - Data Aggregator - System Status shows Failed for Fault-Tolerant (FT) Data Aggregator

Article ID: 215566

Products

CA Performance Management - Usage and Administration, DX NetOps

Issue/Introduction

After upgrading, System Status in the CAPC console shows only one Data Aggregator (DA) of the Fault-Tolerant pair, or neither of them.

Environment

DX NetOps Performance Management 20.2 or later

Cause

Release 20.2.10 introduced a Consul ACL token. This token must be shared between the proxy and both DAs so that the Consul service on the proxy can communicate with them.

Resolution

  1. Stop the DA and Consul services on both DAs. Try the maintenance command first:


    /opt/IMDataAggregator/scripts/dadaemon maintenance

    If the DA has not stopped after 5 minutes, use systemctl:

    systemctl stop dadaemon
    systemctl stop activemq


    Then stop the Consul services:

    systemctl stop consul
    systemctl stop consul-ext

  2. On the proxy host, stop the Consul and daproxy services:

    systemctl stop consul

    systemctl stop daproxy

    Then start the daproxy service again:

    systemctl start daproxy

  3. Rename the Consul data directory on all three hosts (both DAs and the proxy). The default location on a DA is:

    /opt/IMDataAggregator/consul/data


    So run:

    mv /opt/IMDataAggregator/consul/data /opt/IMDataAggregator/consul/data.old


    The default location on the proxy is:

    /opt/CA/daproxy/data

    So run:

    mv /opt/CA/daproxy/data /opt/CA/daproxy/data.old

  4. Start the Consul service on all three hosts:

    systemctl start consul

  5. Confirm that Consul has elected a leader by running the following on both DAs and the proxy:

    curl http://127.0.0.1:8500/v1/status/leader


    The command should return the proxy host. If it returns "", there is no leader. In that case, run "systemctl status consul" on the proxy and on both DAs to see whether a Consul service is having issues.

    Resolve the issue or error, restart Consul, and check the leader URL again.

  6. If ACLs were enabled and bootstrapped, the old ACL master token becomes invalid once the data directories on all hosts have been renamed, so you need to bootstrap ACLs again. To do so, send an HTTP PUT request to one of the DA Consul servers, replacing DA-HOST with the host name of that DA:

    curl -v -X PUT -H 'Content-Type: application/xml' http://DA-HOST:8500/v1/acl/bootstrap


    The response contains a SecretID value, which is the new Consul ACL master token.

  7. Update the acl-token.properties file in the DA shared repo with this new token. Change directory to your shared repo, which has whatever name you chose during installation. Then run:

    cp acl-token.properties acl-token.properties.original

    vi acl-token.properties

    Replace the old token with the new SecretID value returned by the PUT request.

  8. Start the consul-ext service on both DAs:

    systemctl start consul-ext

  9. Verify that you can now see all members on all nodes, replacing the example token with your own:

    /opt/IMDataAggregator/consul/bin/consul kv get -recurse -token=cb37db60-0088-70bd-092c-36f2e507c406
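
The checks in steps 5 to 7 can be error-prone when done by eye, so the following is a minimal sketch of helper functions for them. It is not part of the product. The function names (has_leader, extract_secret_id, replace_token) are hypothetical. The sketch also assumes that the token consists only of hexadecimal characters and hyphens, and that the old token appears literally in acl-token.properties, because the layout of that file is not described here.

```shell
#!/bin/sh
# Hypothetical helpers for steps 5 to 7; names are illustrative only.

# Step 5: succeed only if the /v1/status/leader response names a leader.
# Consul returns a JSON string such as "10.1.2.3:8300", or "" when no
# leader has been elected.
has_leader() {
    response=$(printf '%s' "$1" | tr -d '[:space:]')
    [ -n "$response" ] && [ "$response" != '""' ]
}

# Step 6: print the SecretID field from the ACL bootstrap response.
# A plain sed match is used so that no JSON tool is required.
extract_secret_id() {
    printf '%s' "$1" |
        sed -n 's/.*"SecretID"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Step 7: keep a copy of the properties file as <file>.original, then
# replace every occurrence of the old token with the new one.
# Assumption: both tokens are safe to use inside a sed expression
# (hexadecimal characters and hyphens only).
replace_token() {
    file=$1
    old=$2
    new=$3
    cp "$file" "$file.original" &&
        sed "s/$old/$new/g" "$file.original" > "$file"
}

# Possible usage (commented out because it contacts live services):
#   has_leader "$(curl -s http://127.0.0.1:8500/v1/status/leader)" \
#       || systemctl status consul
#   new=$(extract_secret_id "$(curl -s -X PUT http://DA-HOST:8500/v1/acl/bootstrap)")
#   replace_token acl-token.properties "$old" "$new"
```

Because the bootstrap request succeeds only once per Consul data set, it may be safer to save the raw response to a file before extracting the token from it.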