DX APM - ng-acc-configserver pods in CrashLoopBackOff, Agent Download Dialog not listing all the agent packages


Article ID: 144154


Products

DX Application Performance Management

Issue/Introduction

The ACC PostgreSQL database is sometimes only partially initialized, so ACC does not respond or work as expected.


Symptoms:

a) The Agent Download dialog does not list all the agent packages.


b) ACC pods are in CrashLoopBackOff or Init status:

kubectl get pods -n<namespace> | grep acc

ng-acc-configserver-db-deployment - CrashLoopBackOff
ng-acc-configserver-deployment - CrashLoopBackOff
ng-acc-repository-deployment - Init:0/1

Scaling the deployments down and up does not help.


c) The logs of the ng-acc-configserver-db-deployment pod contain messages indicating that the PostgreSQL database cannot be started:

kubectl logs <ng-acc-configserver-db-deployment-pod> -n<namespace>
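
Symptoms (b) and (c) can be checked from a script. The following is a minimal sketch, not part of the product: the helper name unhealthy_acc_pods is hypothetical, and it assumes the default kubectl column layout (NAME READY STATUS RESTARTS AGE).

```shell
# Print ACC pods that are not fully ready, reading `kubectl get pods` output on stdin.
# Assumes the default column layout: NAME READY STATUS RESTARTS AGE.
unhealthy_acc_pods() {
  awk '$1 ~ /^ng-acc-/ {
         split($2, r, "/")                      # READY column, e.g. "0/1"
         if (r[1] != r[2] || $3 != "Running") print $1, $2, $3
       }'
}

# Usage ("dxi" is the example namespace used in this article):
#   kubectl get pods -n dxi --no-headers | unhealthy_acc_pods
```

Any pod name printed by the helper is a candidate for the kubectl logs check above.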

Cause

Possible root causes:

- The NFS service is down

- The ACC PostgreSQL database is corrupted

- The NFS share ran out of disk space, damaging some data files beyond recovery
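
To rule out the disk-space cause, the usage of the NFS mount can be checked before attempting recovery. This is a sketch only: the function names are hypothetical, and the 90% threshold is an arbitrary illustration value, not a product limit.

```shell
# Print the used-space percentage (0-100) of the filesystem holding a directory.
fs_usage_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when the filesystem is at or above a threshold (default 90, an assumption).
check_nfs_space() {
  local dir=$1 limit=${2:-90} used
  used=$(fs_usage_pct "$dir")
  [ -n "$used" ] || { echo "cannot read usage of $dir" >&2; return 2; }
  if [ "$used" -ge "$limit" ]; then
    echo "WARNING: $dir is ${used}% full"
    return 1
  fi
  echo "OK: $dir is ${used}% full"
}

# Usage (path taken from the example in this article):
#   check_nfs_space /nfs/ca/dxi
```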

Environment

DX Application Performance Management 11.x, 20.x and onward releases

Resolution

NOTE: In the examples below, <nfs-dir> is /nfs/ca/dxi and <namespace> is dxi.

ng-acc-configserver-db-deployment

1. Scale down ng-acc-configserver-db-deployment

kubectl scale --replicas=0 deployment ng-acc-configserver-db-deployment -n<namespace>

Example:

kubectl scale --replicas=0 deployment ng-acc-configserver-db-deployment -ndxi


2. Back up the existing corrupted ACC database (<nfs-dir>/acc/cs/db)

Example:

mkdir -p /backups/db-bkp
cp -rpf /nfs/ca/dxi/acc/cs/db /backups/db-bkp
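
Because the backup is the only copy of the data once recovery starts, it can be worth verifying it. The sketch below wraps the same mkdir and cp commands and compares file counts; the function name backup_dir and the file-count check are illustrative additions, not part of the documented procedure.

```shell
# Copy a directory into a backup location and check that the file counts match.
backup_dir() {
  local src=$1 dest=$2 copied n_src n_dst
  [ -d "$src" ] || { echo "source $src not found" >&2; return 1; }
  mkdir -p "$dest" || return 1
  cp -rp "$src" "$dest"/ || return 1
  copied="$dest/$(basename "$src")"
  n_src=$(find "$src" -type f | wc -l | tr -d ' ')
  n_dst=$(find "$copied" -type f | wc -l | tr -d ' ')
  [ "$n_src" -eq "$n_dst" ] || { echo "file count mismatch" >&2; return 1; }
  echo "backed up $n_src files to $copied"
}

# Usage (paths taken from the example above):
#   backup_dir /nfs/ca/dxi/acc/cs/db /backups/db-bkp
```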

3. Scale up ng-acc-configserver-db-deployment

kubectl scale --replicas=1 deployment ng-acc-configserver-db-deployment -ndxi

 

4. Verify that the pod is up

kubectl get pods -ndxi | grep ng-acc-configserver-db-deployment
ng-acc-configserver-db-deployment-5df89ccbb5-vcvp5    0/1       Running     0          15s

5. Verify that the database started successfully:

kubectl logs <ng-acc-configserver-db-deployment-pod> -n<namespace>
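
A PostgreSQL server normally logs "database system is ready to accept connections" once startup completes. Assuming the container writes the standard PostgreSQL server log to its output, a small filter like the following (db_started is a hypothetical helper name) can confirm that the message is present:

```shell
# Succeeds when the log text on stdin contains PostgreSQL's normal startup message.
db_started() {
  grep -q 'database system is ready to accept connections'
}

# Usage:
#   kubectl logs <ng-acc-configserver-db-deployment-pod> -n <namespace> | db_started \
#     && echo "database started" || echo "database not ready, inspect the log"
```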

 

 

ng-acc-configserver-deployment

1. Scale down and up ng-acc-configserver-deployment

kubectl scale --replicas=0 deployment ng-acc-configserver-deployment -n<namespace>

kubectl scale --replicas=1 deployment ng-acc-configserver-deployment -n<namespace>

Example:

kubectl scale --replicas=0 deployment ng-acc-configserver-deployment -ndxi

kubectl scale --replicas=1 deployment ng-acc-configserver-deployment -ndxi

 

2. Verify that the pod is up

kubectl get pods -ndxi | grep ng-acc-configserver-deployment
ng-acc-configserver-deployment-7dc9c5b879-wtlrd       0/1       Running     0          51s
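
The scale down, scale up, and verify steps can be combined into one helper. This sketch uses kubectl rollout status to wait for readiness instead of polling kubectl get pods; the function name restart_deployment and the 180s default timeout are assumptions made for illustration.

```shell
# Scale a deployment to 0, back to 1, then wait until the rollout reports ready.
restart_deployment() {
  local name=$1 ns=$2 timeout=${3:-180s}   # timeout default is an assumption
  kubectl scale --replicas=0 deployment "$name" -n "$ns" || return 1
  kubectl scale --replicas=1 deployment "$name" -n "$ns" || return 1
  kubectl rollout status "deployment/$name" -n "$ns" --timeout="$timeout"
}

# Usage:
#   restart_deployment ng-acc-configserver-deployment dxi
```

A non-zero exit status means that a scale command failed or that the deployment did not become ready within the timeout.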

 

ng-acc-repository-deployment

1. Scale down ng-acc-repository-deployment

kubectl scale --replicas=0 deployment ng-acc-repository-deployment -n<namespace>

Example:

kubectl scale --replicas=0 deployment ng-acc-repository-deployment -ndxi

 

2. Back up the corrupted ACC repository and create an empty one:

cd <nfs-dir>/acc/cs

mv repository repository.bck

mkdir repository

chown -R 1010:1010 repository
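
The repository reset can be scripted so that it refuses to overwrite an earlier backup. This is a sketch under assumptions: reset_repository is a hypothetical name, the overwrite guard is an addition, and the owner defaults to the 1010:1010 uid:gid used in this step.

```shell
# Move the existing repository aside and create an empty one with the given owner.
# Runs in a subshell (parentheses body) so the caller's working directory is unchanged.
reset_repository() (
  cs_dir=$1
  owner=${2:-1010:1010}                    # uid:gid from this article
  cd "$cs_dir" || exit 1
  [ -d repository ] || { echo "no repository directory in $cs_dir" >&2; exit 1; }
  [ ! -e repository.bck ] || { echo "repository.bck already exists" >&2; exit 1; }
  mv repository repository.bck || exit 1
  mkdir repository || exit 1
  chown -R "$owner" repository
)

# Usage (run with privileges that allow chown to 1010:1010):
#   reset_repository /nfs/ca/dxi/acc/cs
```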


3. Scale up ng-acc-repository-deployment

kubectl scale --replicas=1 deployment ng-acc-repository-deployment -n<namespace>

Example:

kubectl scale --replicas=1 deployment ng-acc-repository-deployment -ndxi

 

4. Verify that the pod is up

kubectl get pods -ndxi | grep ng-acc-repository-deployment
ng-acc-repository-deployment-ff6457c7f-mxjpx          2/2       Running     0          22s

 

5. You can now create a new tenant; the ACC UI should be accessible.

 

IMPORTANT: If you have already created tenants, also apply the steps below:


Recover a specific tenant after ACC DB deletion

The first 4 steps describe how to obtain the values needed for the REST API call in step 5.

1. Find value of ACC management token

- In Kubernetes, DXI namespace, click on the Config & Storage / Secrets menu item.
- Click on the item ng-acc-configserver-secret.
- Click on the eye icon next to the "token" secret.
- The ACC management token is displayed like this:
    token: 81bacd65-9874-49ee-98b3-7a312ebfd792

Remember this value as ACC_MANAGEMENT_TOKEN for use in step 5.
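
If command-line access is more convenient than the dashboard, the same secret can be read with kubectl. The sketch below assumes the secret name and the "token" key shown above; the function name is hypothetical. Kubernetes stores secret data base64-encoded, hence the decode step.

```shell
# Read the ACC management token from the secret named in step 1 and decode it.
get_acc_management_token() {
  local ns=$1
  kubectl get secret ng-acc-configserver-secret -n "$ns" \
    -o jsonpath='{.data.token}' | base64 --decode
}

# Usage:
#   ACC_MANAGEMENT_TOKEN=$(get_acc_management_token dxi)
```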


2. Find hostname of EM container of the tenant

- In Kubernetes, DXI namespace, click on the Discovery and Load Balancing / Services menu item.
- Use the filter icon to search for "apm-em-10", where 10 is the tenant service id of the tenant.
- The item to look for looks like "apm-em-10-958963", where 10 is the tenant service id and the 6-digit number is randomly assigned during tenant creation. Click on it.
- Copy the value in the name field. This is the hostname of the EM container of the tenant.

Remember this value as EM_HOSTNAME for use in step 5.
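
The service lookup can also be done from the command line by filtering the service list. This sketch assumes the "apm-em-<id>-<6 digits>" naming shown in the example above; find_em_service is a hypothetical helper name.

```shell
# Filter `kubectl get svc -o name` output (stdin) for the EM service of one tenant.
# The exact-match pattern keeps tenant id 10 from matching apm-em-100-... and similar.
find_em_service() {
  local tenant_id=$1
  sed 's|^service/||' | grep -E "^apm-em-${tenant_id}-[0-9]{6}$"
}

# Usage:
#   EM_HOSTNAME=$(kubectl get svc -n dxi -o name | find_em_service 10)
```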


3. Find EM-ACC integration token

- Continuing from step 2, click on the Pod listed in the Pods section of the service.
- The detail view of the Pod shows, in the Containers / Environment variables section, an environment variable ACC_TOKEN with a value like this:
    ACC_TOKEN: 9403e0e2-8ea7-40b9-8f22-dd230706bbc8
This is the EM-ACC integration token.

Remember this value as EM_ACC_INTEGRATION_TOKEN for use in step 5.


4. Find TENANT_ID and TENANT_NAME for use in step 5

The easiest way to get the tenant id and tenant name is the dximanager UI, which appears when logging in with the "masteradmin" tenant and the "masteradmin" account.
Click on the Tenant icon on the left side to list all tenants. The text in the "Tenant ID" column, here "test001", is TENANT_NAME. Hovering over that element shows a tooltip with TENANT_ID, here 10.

5. Register a tenant in ACC

- In Kubernetes, DXI namespace, click on Workloads / Deployments.
- Use the filter icon to search for "ng-acc-configserver-deployment", then click on the displayed item.
- Click on the item in the New Replica Set section, then click on the item in the Pods section.
- Click on the Exec text/icon in the header of the displayed Pod. The prompt should show that the shell is in the “APMCommandCenterServer” directory.
- Prepare the following command in a plain text editor:
   
curl -v -X POST  -H 'Authorization: Bearer ACC_MANAGEMENT_TOKEN' -H 'Content-Type: application/json' -d '
{
  "internalId" : TENANT_ID,
  "externalId" : "TENANT_NAME",
  "emUrl" : "http://EM_HOSTNAME:8081/",
  "integrationUserToken" : "EM_ACC_INTEGRATION_TOKEN"
}
' http://localhost:8088/apm/appmap/acc/apm/acc/tenant 

Replace the uppercase placeholders with the values noted in steps 1 to 4.

- Copy the command from the editor and paste it into the shell using Shift-Insert.
- Verify that the returned status code is HTTP/1.1 201 Created.
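
Filling in placeholders by hand is error-prone, so the request body can be generated from variables instead. The following is a sketch only: both function names are hypothetical, the numeric check on TENANT_ID is an assumption based on the example value 10, and the URL is the one used in the command above.

```shell
# Build the JSON body for the tenant registration call from step 5.
build_tenant_payload() {
  local tenant_id=$1 tenant_name=$2 em_hostname=$3 em_token=$4
  case $tenant_id in
    ''|*[!0-9]*) echo "TENANT_ID must be numeric" >&2; return 1 ;;
  esac
  printf '{\n  "internalId" : %s,\n  "externalId" : "%s",\n  "emUrl" : "http://%s:8081/",\n  "integrationUserToken" : "%s"\n}\n' \
    "$tenant_id" "$tenant_name" "$em_hostname" "$em_token"
}

# Send the registration request and print only the HTTP status code.
register_tenant() {
  local mgmt_token=$1 payload=$2
  curl -s -o /dev/null -w '%{http_code}' -X POST \
    -H "Authorization: Bearer $mgmt_token" \
    -H 'Content-Type: application/json' \
    -d "$payload" \
    http://localhost:8088/apm/appmap/acc/apm/acc/tenant
}

# Usage (inside the ng-acc-configserver pod shell):
#   payload=$(build_tenant_payload "$TENANT_ID" "$TENANT_NAME" "$EM_HOSTNAME" "$EM_ACC_INTEGRATION_TOKEN")
#   register_tenant "$ACC_MANAGEMENT_TOKEN" "$payload"
```

A printed status of 201 corresponds to the HTTP/1.1 201 Created response expected in this step.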

6. Validate

- Log in to the tenant. The ATC UI should appear.
- Click on the APM Command Center link in the dropdown menu next to “ALL MY UNIVERSES” at the top right.
- The ACC UI should be displayed without the red error ribbon at the top.
- If ACC bundles have not been re-imported, the Bundles menu shows 0 bundles. To re-import bundles, open a shell in the pod as in step 5, then prepare and execute the following commands:

rm -rf repository/com temp/*
curl -v -X POST  -H 'Authorization: Bearer ACC_MANAGEMENT_TOKEN' -H 'Content-Type: application/json' -d '{}' http://localhost:8088/apm/appmap/acc/apm/acc/bundle/refresh

The curl command may take a few minutes to complete and returns status code HTTP/1.1 204 No Content.
