Symptoms:
a) ACC pods are in CrashLoopBackOff or Init status:
kubectl get pods -ndxi | grep acc
ng-acc-configserver-db-deployment - CrashLoopBackOff
ng-acc-configserver-deployment - CrashLoopBackOff
ng-acc-repository-deployment (Init:0/1)
Scaling them down and up does not help.
b) The log of the ng-acc-configserver-db-deployment pod contains messages indicating that the postgres database cannot be started:
kubectl logs <ng-acc-configserver-db-deployment-pod> -ndxi
Cause:
ACC pods are unable to initialize and start because the ACC postgres database is corrupted.
In this case, the file system of the NFS server ran out of disk space, which damaged some data files.
It is not possible to recover the data files.
Environment:
APM 11.x, 19.x
Resolution:
In this example, NFS-DXI-Folder is located in /data/nfs/ca/dxi.
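Because the corruption in this case was caused by a full file system, it may be worth confirming that space has been freed on the NFS server before recreating any data. The following is a minimal sketch, not part of the original procedure; the function name fs_usage_percent is hypothetical and the path default is the example location used in this article.

```shell
# Assumption: NFS_DXI_DIR is the NFS-DXI-Folder (example location from this article).
NFS_DXI_DIR="${NFS_DXI_DIR:-/data/nfs/ca/dxi}"

# Print the usage percentage (without the % sign) of the file system
# that holds the given path. df -P forces one output line per file system.
fs_usage_percent() {
  df -P "$1" | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}
```

On the NFS server, running fs_usage_percent "$NFS_DXI_DIR" prints a single number; a value close to 100 suggests the disk-space problem has not been resolved yet.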
ng-acc-configserver-db-deployment
1. Log in to the Kubernetes console, select the DXI namespace, and click on Workloads / Deployments
Scale down ng-acc-configserver-db-deployment
2. Back up the corrupted ACC postgres database by moving it aside:
mv /data/nfs/ca/dxi/acc/cs/db /data/nfs/ca/dxi/acc/cs/db-bkp
3. Click on Workloads / Deployments
Scale up ng-acc-configserver-db-deployment
4. Verify that pod is up
5. Verify that the database started up successfully:
Obtain the pod for ng-acc-configserver-db-deployment
kubectl get pods -ndxi | grep acc
Check the log
kubectl logs <ng-acc-configserver-db-deployment-pod> -ndxi
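If the Kubernetes console is not available, the scale-down and scale-up steps above can probably be done with kubectl as well. This is a sketch under assumptions: the namespace dxi is taken from the -ndxi flag used in this article, the replica count of 1 and the 300-second timeout are guesses, and the helper names scale_deployment and wait_for_deployment are hypothetical.

```shell
NAMESPACE=dxi   # from the -ndxi flag used throughout this article

# Scale a deployment to the given number of replicas.
scale_deployment() {
  kubectl scale deployment "$1" -n "$NAMESPACE" --replicas="$2"
}

# Block until the deployment's rollout has finished (or the timeout expires).
wait_for_deployment() {
  kubectl rollout status deployment "$1" -n "$NAMESPACE" --timeout=300s
}

# Example sequence for the database deployment (run the mv on the NFS server):
#   scale_deployment ng-acc-configserver-db-deployment 0
#   mv /data/nfs/ca/dxi/acc/cs/db /data/nfs/ca/dxi/acc/cs/db-bkp
#   scale_deployment ng-acc-configserver-db-deployment 1   # assumed replica count
#   wait_for_deployment ng-acc-configserver-db-deployment
#   kubectl logs deployment/ng-acc-configserver-db-deployment -n "$NAMESPACE"
```

The same two helpers can be reused for ng-acc-configserver-deployment and ng-acc-repository-deployment in the sections that follow.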
ng-acc-configserver-deployment
1. In Kubernetes, DXI namespace, click on Workloads / Deployments
Scale down and up ng-acc-configserver-deployment
2. Verify that pod is up
ng-acc-repository-deployment
1. In Kubernetes, DXI namespace, click on Workloads / Deployments
Scale down ng-acc-repository-deployment
2. Back up the corrupted ACC repository and create a new, empty one:
mv /data/nfs/ca/dxi/acc/cs/repository /data/nfs/ca/dxi/acc/cs/repository.bck
mkdir /data/nfs/ca/dxi/acc/cs/repository
chown -R 1010:1010 /data/nfs/ca/dxi/acc/cs/repository
3. Click on Workloads / Deployments
Scale up ng-acc-repository-deployment
4. Verify that pod is up
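The file-system part of step 2 can be wrapped so that each command runs only if the previous one succeeded and an existing backup is never overwritten. This is a sketch only: the function name reset_acc_repository is hypothetical, and it assumes that 1010:1010 is the UID:GID that must own the repository directory and that the commands run as root on the NFS server.

```shell
# Move the corrupted repository aside and recreate an empty one.
# $1: directory containing "repository" (in this article /data/nfs/ca/dxi/acc/cs)
reset_acc_repository() {
  base="$1"
  # Refuse to run if a backup already exists, so it is not clobbered.
  if [ -e "$base/repository.bck" ]; then
    echo "backup $base/repository.bck already exists" >&2
    return 1
  fi
  mv "$base/repository" "$base/repository.bck" &&
  mkdir "$base/repository" &&
  chown -R 1010:1010 "$base/repository"   # assumed owner UID:GID
}

# Example (with ng-acc-repository-deployment scaled down):
#   reset_acc_repository /data/nfs/ca/dxi/acc/cs
```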
IMPORTANT: If you have already created tenants, apply the steps below. Otherwise, you can proceed to create a new tenant, and you should be able to access the ACC UI.
Recover a specific tenant after ACC DB deletion
The first four steps describe how to obtain the values needed for the REST API call in step 5.
1. Find the value of the ACC management token
- In Kubernetes, DXI namespace, click on the Config & Storage / Secrets menu item.
- Click on the item ng-acc-configserver-secret.
- Click on the eye icon next to the "token" secret.
- The ACC management token is displayed like this:
token: 81bacd65-9874-49ee-98b3-7a312ebfd792
Remember this value as ACC_MANAGEMENT_TOKEN for use in step 5.
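The same value can likely be read from the command line. The sketch below assumes the secret ng-acc-configserver-secret lives in the dxi namespace and stores the value under the key "token", as the console view suggests; the function name get_acc_management_token is hypothetical. Kubernetes stores secret data base64-encoded, so the output is decoded.

```shell
# Print the decoded "token" entry of the ACC configserver secret.
get_acc_management_token() {
  kubectl get secret ng-acc-configserver-secret -n dxi \
    -o jsonpath='{.data.token}' | base64 --decode
}

# Example:
#   ACC_MANAGEMENT_TOKEN="$(get_acc_management_token)"
```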
2. Find the hostname of the EM container of the tenant
- In Kubernetes, DXI namespace, click on the Discovery and Load Balancing / Services menu item.
- Use the filter icon to search for "apm-em-10", where 10 is the tenant service ID of the tenant.
- The item to look for looks like "apm-em-10-958963", where 10 is the tenant service ID and the other 6-digit number is randomly assigned during tenant creation. Click on it.
- Copy the value in the name field.
This is the hostname of the EM container of the tenant.
Remember this value as EM_HOSTNAME for use in step 5.
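As an alternative to the console filter, the service name can probably be looked up with kubectl. This sketch assumes the service follows the "apm-em-<tenant service id>-<digits>" naming shown above and lives in the dxi namespace; the function name find_em_service is hypothetical.

```shell
# Print the name of the EM service for the given tenant service ID.
# The trailing "-" in the pattern keeps ID 10 from matching apm-em-100-...
find_em_service() {
  kubectl get services -n dxi -o name |
    sed -n "s|^service/\(apm-em-$1-[0-9]*\)\$|\1|p"
}

# Example:
#   EM_HOSTNAME="$(find_em_service 10)"
```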
3. Find EM-ACC integration token
- Continuing from step 2, click on the Pod in the Pods section of the service.
- The detail view of the Pod shows, in the section Containers / Environment variables, an environment variable ACC_TOKEN with a value like this:
ACC_TOKEN: 9403e0e2-8ea7-40b9-8f22-dd230706bbc8
This is the EM-ACC integration token.
Remember this value as EM_ACC_INTEGRATION_TOKEN for use in step 5.
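If the name of the EM pod is known, the environment variable can presumably also be read from the pod specification. This is a sketch under assumptions: it only returns a value when ACC_TOKEN is set literally in the pod spec (not injected from a secret or config map), the pod name must be supplied by the reader, and the function name get_em_acc_token is hypothetical.

```shell
# Print the literal value of the ACC_TOKEN environment variable
# from the spec of the given pod in the dxi namespace.
# $1: name of the EM pod (take it from the Pods section of the service)
get_em_acc_token() {
  kubectl get pod "$1" -n dxi \
    -o jsonpath='{.spec.containers[*].env[?(@.name=="ACC_TOKEN")].value}'
}

# Example:
#   EM_ACC_INTEGRATION_TOKEN="$(get_em_acc_token <em-pod-name>)"
```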
4. Find TENANT_ID and TENANT_NAME for use in step 5
The easiest way to get the tenant ID and tenant name is the dximanager UI, which appears when you log in with the "masteradmin" tenant and the "masteradmin" account.
There is a Tenant icon on the left side. Click on it to show entries for all tenants. The text in the "Tenant ID" column, here "test001", is TENANT_NAME. Hovering over that element shows a tooltip with TENANT_ID, here 10.
5. Register a tenant in ACC
- In Kubernetes, DXI namespace, click on Workloads / Deployments
- Use the filter icon to search for "ng-acc-configserver-deployment", then click on the displayed item.
- Click on the item in the New Replica Set section, then click on the item in the Pods section.
- Click on the Exec text/icon in the header of the displayed Pod. The prompt should show that the shell is in the “APMCommandCenterServer” directory.
- Prepare a command for execution in a plain text editor:
curl -v -X POST -H 'Authorization: Bearer ACC_MANAGEMENT_TOKEN' -H 'Content-Type: application/json' -d '
{
"internalId" : TENANT_ID,
"externalId" : "TENANT_NAME",
"emUrl" : "http://EM_HOSTNAME:8081/",
"integrationUserToken" : "EM_ACC_INTEGRATION_TOKEN"
}
' http://localhost:8088/apm/appmap/acc/apm/acc/tenant
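Fill in the actual values noted in steps 1 to 4 for the placeholders ACC_MANAGEMENT_TOKEN, TENANT_ID, TENANT_NAME, EM_HOSTNAME, and EM_ACC_INTEGRATION_TOKEN.
- Copy the command from the editor. Paste the command into the shell using Shift-Insert.
- Verify that the returned status code is HTTP/1.1 201 Created.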
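Editing the placeholders by hand is easy to get wrong (for example, TENANT_ID is an unquoted number while the other values are quoted strings). The sketch below builds the same JSON body from arguments and prints only the HTTP status code. The function names build_tenant_payload and register_tenant are hypothetical, and the sketch assumes the values contain no characters that need JSON escaping.

```shell
# Build the JSON request body.
# args: TENANT_ID TENANT_NAME EM_HOSTNAME EM_ACC_INTEGRATION_TOKEN
build_tenant_payload() {
  printf '{"internalId": %s, "externalId": "%s", "emUrl": "http://%s:8081/", "integrationUserToken": "%s"}\n' \
    "$1" "$2" "$3" "$4"
}

# POST the tenant registration and print only the HTTP status code.
# args: ACC_MANAGEMENT_TOKEN, followed by the four payload arguments
register_tenant() {
  token="$1"
  shift
  curl -s -o /dev/null -w '%{http_code}\n' -X POST \
    -H "Authorization: Bearer $token" \
    -H 'Content-Type: application/json' \
    -d "$(build_tenant_payload "$@")" \
    http://localhost:8088/apm/appmap/acc/apm/acc/tenant
}

# Example, run in the ng-acc-configserver pod shell with the values from steps 1 to 4:
#   register_tenant "$ACC_MANAGEMENT_TOKEN" 10 test001 "$EM_HOSTNAME" "$EM_ACC_INTEGRATION_TOKEN"
```

A printed status of 201 corresponds to the expected HTTP/1.1 201 Created response.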
6. Validate
- Log in to the tenant. The ATC UI should appear.
- Click on the APM Command Center link in the dropdown menu next to “ALL MY UNIVERSES” at the top right.
- The ACC UI should be displayed without a red ribbon containing an error message at the top.
- If ACC bundles have not been re-imported, the Bundles menu shows 0 bundles. To re-import bundles, open a shell in the pod as in step 5, then prepare and execute the following commands:
rm -rf repository/com temp/*
curl -v -X POST -H 'Authorization: Bearer ACC_MANAGEMENT_TOKEN' -H 'Content-Type: application/json' -d '{}' http://localhost:8088/apm/appmap/acc/apm/acc/bundle/refresh
The second command (curl) may take a few minutes to complete and returns status code HTTP/1.1 204 No Content.