The ACC PostgreSQL database is sometimes only partially initialized, so ACC does not respond or work as expected.
Symptoms:
a) The Agent Download dialog does not list all the agent packages.
b) ACC pods are in CrashLoopBackOff or Init status:
kubectl get pods -n<namespace> | grep acc
ng-acc-configserver-db-deployment - CrashLoopBackOff
ng-acc-configserver-deployment - CrashLoopBackOff
ng-acc-repository-deployment - Init:0/1
Scaling the deployments down and up does not help.
c) The logs of the ng-acc-configserver-db-deployment pod contain messages indicating that the PostgreSQL database cannot be started:
kubectl logs <ng-acc-configserver-db-deployment-pod> -n<namespace>
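Symptom b) can be checked quickly by filtering the pod list. The following is a minimal sketch, not part of the product: the function name unhealthy_acc_pods is hypothetical, and it assumes the default kubectl column layout (NAME READY STATUS RESTARTS AGE) and that the ACC pod names start with "ng-acc-" as in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print ACC pods that are crash-looping or stuck in an init phase.
# Reads `kubectl get pods` output on stdin.
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
unhealthy_acc_pods() {
  awk '$1 ~ /^ng-acc-/ && ($3 == "CrashLoopBackOff" || $3 ~ /^Init:/) { print $1, $3 }'
}

# Usage ("dxi" is the example namespace used in this article):
#   kubectl get pods -ndxi | unhealthy_acc_pods
```

An empty result means that no ACC pod is currently in either state.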
Possible root causes:
- NFS service down
- ACC postgres database is corrupted
- NFS ran out of disk space, which damaged some data files beyond recovery.
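To rule out the disk space cause, the usage of the NFS share can be read from df. This is a small sketch under stated assumptions: the helper name nfs_usage_pct is hypothetical, it expects the POSIX output format of `df -P`, and the 95 percent threshold in the usage comment is an arbitrary illustration, not a documented limit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the capacity percentage from `df -P` output.
# `df -P` prints a header line, then one line per filesystem with the columns
# Filesystem, 1024-blocks, Used, Available, Capacity, Mounted on.
nfs_usage_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (/nfs/ca/dxi is the example NFS directory used in this article;
# the threshold of 95 is an assumption chosen for illustration):
#   pct=$(df -P /nfs/ca/dxi | nfs_usage_pct)
#   [ "$pct" -ge 95 ] && echo "NFS share is ${pct}% full"
```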
DX Application Performance Management 11.x, 20.x and onward releases
NOTE: In the examples below, <nfs-dir> is /nfs/ca/dxi and <namespace> is dxi.
ng-acc-configserver-db-deployment
1. Scale down ng-acc-configserver-db-deployment
kubectl scale --replicas=0 deployment ng-acc-configserver-db-deployment -n<namespace>
Example:
kubectl scale --replicas=0 deployment ng-acc-configserver-db-deployment -ndxi
2. Back up the existing corrupted ACC database (<nfs-dir>/acc/cs/db)
Example:
mkdir -p /backups/db-bkp
cp -rpf /nfs/ca/dxi/acc/cs/db /backups/db-bkp
3. Scale up ng-acc-configserver-db-deployment
kubectl scale --replicas=1 deployment ng-acc-configserver-db-deployment -ndxi
4. Verify that the pod is up
kubectl get pods -ndxi | grep ng-acc-configserver-db-deployment
ng-acc-configserver-db-deployment-5df89ccbb5-vcvp5 0/1 Running 0 15s
5. Verify that database started successfully:
kubectl logs <ng-acc-configserver-db-deployment-pod> -n<namespace>
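The log check in step 5 can be automated by searching for the message that a PostgreSQL server writes once startup completes. This is a sketch that assumes the database container logs standard PostgreSQL server output; the article does not show the actual log, and the function name db_ready is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the log text on stdin contains the standard
# PostgreSQL startup message. Assumes the pod logs plain PostgreSQL server output.
db_ready() {
  grep -q 'database system is ready to accept connections'
}

# Usage:
#   kubectl logs <ng-acc-configserver-db-deployment-pod> -n<namespace> | db_ready \
#     && echo "database started" || echo "database not ready yet"
```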
ng-acc-configserver-deployment
1. Scale ng-acc-configserver-deployment down and back up
kubectl scale --replicas=0 deployment ng-acc-configserver-deployment -n<namespace>
kubectl scale --replicas=1 deployment ng-acc-configserver-deployment -n<namespace>
Example:
kubectl scale --replicas=0 deployment ng-acc-configserver-deployment -ndxi
kubectl scale --replicas=1 deployment ng-acc-configserver-deployment -ndxi
2. Verify that the pod is up
kubectl get pods -ndxi | grep ng-acc-configserver-deployment
ng-acc-configserver-deployment-7dc9c5b879-wtlrd 0/1 Running 0 51s
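Note that a pod listed as Running with READY 0/1, as in the sample output above, has started but its container is not ready yet. The following sketch shows one way to test the READY column. The function name pods_ready is hypothetical, and the default kubectl column layout is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if at least one pod line is read from stdin
# and every pod line reports READY as n/n with n > 0.
# Assumes the default kubectl columns: NAME READY STATUS RESTARTS AGE.
pods_ready() {
  awk '
    $2 ~ /^[0-9]+\/[0-9]+$/ {
      split($2, r, "/")
      seen = 1
      if ((r[1] + 0) != (r[2] + 0) || (r[2] + 0) == 0) bad = 1
    }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}

# Usage:
#   kubectl get pods -ndxi | grep ng-acc-configserver-deployment | pods_ready \
#     && echo "ready" || echo "not ready yet, check again"
```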
ng-acc-repository-deployment
1. Scale down ng-acc-repository-deployment
kubectl scale --replicas=0 deployment ng-acc-repository-deployment -n<namespace>
Example:
kubectl scale --replicas=0 deployment ng-acc-repository-deployment -ndxi
2. Back up the corrupted ACC repository and create a new, empty one:
cd <nfs-dir>/acc/cs
mv repository repository.bck
mkdir repository
chown -R 1010:1010 repository
3. Scale up ng-acc-repository-deployment
kubectl scale --replicas=1 deployment ng-acc-repository-deployment -n<namespace>
Example:
kubectl scale --replicas=1 deployment ng-acc-repository-deployment -ndxi
4. Verify that the pod is up
kubectl get pods -ndxi | grep ng-acc-repository-deployment
ng-acc-repository-deployment-ff6457c7f-mxjpx 2/2 Running 0 22s
5. You can now proceed to create a new tenant. You should be able to access the ACC UI.
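Because the backup steps in the procedures above are the only safety net before data is replaced, it can be worth verifying a copy before continuing. The following is a minimal sketch, assuming a copy-based backup like the one used for the database directory. The function name backup_dir is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a directory into a backup location and verify the copy.
# Usage: backup_dir <source-dir> <backup-parent-dir>
backup_dir() {
  src=$1
  dest_parent=$2
  [ -d "$src" ] || { echo "source not found: $src" >&2; return 1; }
  mkdir -p "$dest_parent" || return 1
  # -r copies recursively, -p preserves mode, ownership and timestamps
  cp -rp "$src" "$dest_parent"/ || return 1
  # diff -r exits non-zero if any file differs or is missing in the copy
  diff -r "$src" "$dest_parent/$(basename "$src")" >/dev/null
}

# Usage (paths are the example values used in this article):
#   backup_dir /nfs/ca/dxi/acc/cs/db /backups/db-bkp && echo "backup verified"
```

Run the helper only while the corresponding deployment is scaled down, so that the files do not change during the copy.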
IMPORTANT: If you have already created tenants, also apply the steps below:
Recover a specific tenant after ACC DB deletion
The first 4 steps describe how to obtain the values needed for the REST API call in step 5.
1. Find the value of the ACC management token
- In Kubernetes, DXI namespace, click on the Config & Storage / Secrets menu item.
- Click on the item ng-acc-configserver-secret.
- Click on the eye icon next to the "token" secret.
- The ACC management token is displayed like this:
token: 81bacd65-9874-49ee-98b3-7a312ebfd792
Remember this value as ACC_MANAGEMENT_TOKEN for use in step 5.
2. Find hostname of EM container of the tenant
- In Kubernetes, DXI namespace, click on Discovery and Load Balancing / Services menu item.
- Use the filter icon to search for "apm-em-10", where 10 is the tenant service ID of the tenant.
- The item to look for has a name like "apm-em-10-958963", where 10 is the tenant service ID and the 6-digit number is assigned randomly during tenant creation. Click on it.
- Copy the value in the name field.
This is the hostname of the EM container of the tenant.
Remember this value as EM_HOSTNAME for use in step 5.
3. Find EM-ACC integration token
- Continue from step 2, click on the Pod that is in the Pods section of the service.
- Detail view of the Pod shows in section Containers / Environment variables an environment variable ACC_TOKEN with a value like this:
ACC_TOKEN: 9403e0e2-8ea7-40b9-8f22-dd230706bbc8
This is an EM-ACC integration token.
Remember this value as EM_ACC_INTEGRATION_TOKEN for use in step 5.
4. Find TENANT_ID and TENANT_NAME for use in step 5
The easiest way to get the tenant ID and tenant name is the dximanager UI, which is shown when you log in to the "masteradmin" tenant with the "masteradmin" account.
Click on the Tenant icon on the left side to list all tenants. The text in the "Tenant ID" column, here "test001", is TENANT_NAME. Hovering over that element shows a tooltip with TENANT_ID, here 10.
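As an alternative to clicking through the UI, some of these values can likely be read with kubectl. The sketch below is an assumption-based illustration, not a documented procedure: the helper names decode_secret_value and em_service_name are hypothetical, the secret name and its "token" key are taken from step 1, and the service name pattern is taken from step 2.

```shell
#!/usr/bin/env bash
# Hypothetical helper: Kubernetes stores Secret values base64-encoded; decode stdin.
decode_secret_value() {
  base64 --decode
}

# Hypothetical helper: pick the EM service of a tenant from `kubectl get svc` output.
# $1 = tenant service id (for example 10); matches names like apm-em-10-958963.
em_service_name() {
  awk -v id="$1" '$1 ~ ("^apm-em-" id "-[0-9]+$") { print $1 }'
}

# Usage ("dxi" is the example namespace used in this article):
#   ACC_MANAGEMENT_TOKEN=$(kubectl get secret ng-acc-configserver-secret -ndxi \
#     -o jsonpath='{.data.token}' | decode_secret_value)
#   EM_HOSTNAME=$(kubectl get svc -ndxi | em_service_name 10)
```

The anchored pattern in em_service_name avoids matching other tenants whose service ID merely starts with the same digits, such as apm-em-100-....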
5. Register a tenant in ACC
- In Kubernetes, DXI namespace, click on Workloads / Deployments
- Use filter icon and search for "ng-acc-configserver-deployment", click on the displayed item.
- Click on the item in the New Replica Set section, then click on the item in the Pods section.
- Click on the Exec text/icon in the header of the displayed Pod. The prompt should now show that the shell is in the "APMCommandCenterServer" directory.
- Prepare a command for execution in a plain text editor:
curl -v -X POST -H 'Authorization: Bearer ACC_MANAGEMENT_TOKEN' -H 'Content-Type: application/json' -d '
{
"internalId" : TENANT_ID,
"externalId" : "TENANT_NAME",
"emUrl" : "http://EM_HOSTNAME:8081/",
"integrationUserToken" : "EM_ACC_INTEGRATION_TOKEN"
}
' http://localhost:8088/apm/appmap/acc/apm/acc/tenant
Replace the placeholders ACC_MANAGEMENT_TOKEN, TENANT_ID, TENANT_NAME, EM_HOSTNAME and EM_ACC_INTEGRATION_TOKEN with the values noted in steps 1 to 4.
- Copy the command from the editor and paste it into the shell using Shift+Insert.
- Verify that the returned status code is HTTP/1.1 201 Created.
6. Validate
- Log in to the tenant. The ATC UI should appear.
- Click on the APM Command Center link in the dropdown menu next to "ALL MY UNIVERSES" at the top right.
- The ACC UI should be displayed without the red ribbon containing an error message at the top.
- If the ACC bundles have not been re-imported, the Bundles menu shows 0 bundles. To re-import the bundles, open a shell in the pod as described in step 5, then prepare and execute the following commands:
rm -rf repository/com temp/*
curl -v -X POST -H 'Authorization: Bearer ACC_MANAGEMENT_TOKEN' -H 'Content-Type: application/json' -d '{}' http://localhost:8088/apm/appmap/acc/apm/acc/bundle/refresh
The curl command may take a few minutes to complete and returns status code HTTP/1.1 204 No Content.
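The registration request from step 5 can also be assembled from shell variables instead of being edited by hand in a text editor. This is a minimal sketch: the function name tenant_payload is hypothetical, the JSON fields are the ones shown in step 5, and the values are assumed to contain no double quotes or backslashes, since no JSON escaping is done.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the JSON body for the tenant registration call in step 5.
# Usage: tenant_payload <TENANT_ID> <TENANT_NAME> <EM_HOSTNAME> <EM_ACC_INTEGRATION_TOKEN>
# Assumes the values need no JSON escaping (no quotes or backslashes).
tenant_payload() {
  printf '{"internalId": %s, "externalId": "%s", "emUrl": "http://%s:8081/", "integrationUserToken": "%s"}\n' \
    "$1" "$2" "$3" "$4"
}

# Usage inside the ng-acc-configserver pod shell, with the variables set to the
# values collected in steps 1 to 4:
#   curl -v -X POST -H "Authorization: Bearer $ACC_MANAGEMENT_TOKEN" \
#     -H 'Content-Type: application/json' \
#     -d "$(tenant_payload "$TENANT_ID" "$TENANT_NAME" "$EM_HOSTNAME" "$EM_ACC_INTEGRATION_TOKEN")" \
#     http://localhost:8088/apm/appmap/acc/apm/acc/tenant
```

TENANT_ID is printed without quotes so that it is sent as a JSON number, matching the command template in step 5.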