APM ACC pods in a Crash Loop Back-off

Article ID: 250470

Products

CA Application Performance Management (APM / Wily / Introscope)

Issue/Introduction

The ng-acc pods are not starting:

 - ng-acc-repository-deployment is stuck in Pod initializing
 - ng-acc-configserver-db-deployment and ng-acc-configserver-deployment are in Crash Loop Back-off

The following error is reported in the ng-acc-configserver-db-deployment pod log:

"PostgreSQL Database directory appears to contain a database; Skipping initialization

[1]: [1] FATAL:  lock file "postmaster.pid" is empty
 [1]: [2] HINT:  Either another server is starting, or the lock file is the remnant of a previous server startup crash."
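
One way to confirm this symptom from the command line is to search the database pod log for the fatal message. The following is a minimal sketch, not a supported tool: the function name has_empty_lock_error is hypothetical, and the kubectl call shown in the comment assumes kubectl access to the cluster and uses <namespace> as a placeholder, since the article does not name the namespace.

```shell
# Hypothetical helper: succeeds if the log text on stdin contains the
# "lock file is empty" fatal error quoted above.
has_empty_lock_error() {
  grep -qF 'lock file "postmaster.pid" is empty'
}

# Example use (assumes kubectl access; <namespace> is a placeholder):
#   kubectl -n <namespace> logs deploy/ng-acc-configserver-db-deployment \
#     | has_empty_lock_error && echo "postmaster.pid is empty"
```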

Environment

APM 21.3.1

Cause

The postmaster.pid lock file is corrupted.

A disk space issue interrupted the update of <nfs>/acc/cs/db/postmaster.pid, leaving the file empty.

To verify this condition:

1) Go to <nfs>/acc/cs/db/

2) Check the size of postmaster.pid. A size of 0 bytes matches the "lock file is empty" error in the pod log.
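
The size check can be sketched in shell as follows. This is an illustration only: the function name check_pid_file and the DB_DIR variable are hypothetical, and <nfs> must be replaced with the actual NFS mount path.

```shell
# Replace <nfs> with the NFS mount path used by the installation.
DB_DIR="${DB_DIR:-<nfs>/acc/cs/db}"

# Hypothetical helper: reports the state of postmaster.pid in a directory.
check_pid_file() {
  if [ ! -e "$1/postmaster.pid" ]; then
    echo "missing"
  elif [ -s "$1/postmaster.pid" ]; then
    # -s is true only when the file exists and is larger than 0 bytes
    echo "non-empty"
  else
    echo "empty"
  fi
}

check_pid_file "$DB_DIR"
```

A result of "empty" corresponds to the condition described in this article.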

Resolution

1) Scale down ng-acc-configserver-db-deployment.

2) Go to <nfs>/acc/cs/db/ and delete postmaster.pid, then scale ng-acc-configserver-db-deployment back up.

3) Scale ng-acc-configserver-deployment down and back up.

4) Scale ng-acc-repository-deployment down and back up.

5) Verify that the ng-acc pods are up and running.
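
The steps above could be scripted roughly as shown below. This is a hedged sketch under several assumptions that are not stated in the article: kubectl is used to manage the cluster, <namespace> is a placeholder for the namespace, each deployment normally runs with one replica, and the database deployment is scaled back up after the lock file is deleted. The helper names run, scale and restart_steps are hypothetical. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
NAMESPACE="${NAMESPACE:-<namespace>}"   # placeholder: namespace is not given in the article
DB_DIR="${DB_DIR:-<nfs>/acc/cs/db}"     # replace <nfs> with the NFS mount path
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands only, 0 = execute them

# Print a command and, unless in dry-run mode, execute it.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Scale a deployment to the given replica count.
scale() {
  run kubectl -n "$NAMESPACE" scale deployment "$1" --replicas="$2"
}

restart_steps() {
  scale ng-acc-configserver-db-deployment 0   # 1) stop the database pod
  run rm -f "$DB_DIR/postmaster.pid"          # 2) delete the empty lock file
  scale ng-acc-configserver-db-deployment 1   #    assumed: bring the database back up
  scale ng-acc-configserver-deployment 0      # 3) restart the config server
  scale ng-acc-configserver-deployment 1
  scale ng-acc-repository-deployment 0        # 4) restart the repository
  scale ng-acc-repository-deployment 1
  run kubectl -n "$NAMESPACE" get pods        # 5) check that the ng-acc pods are Running
}

restart_steps
```

When running the commands for real, it may be safer to wait until each pod has terminated before scaling its deployment back up.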

 

Additional verification:

Connect to the ng-acc-configserver-db-deployment pod.

Run the following SQL queries to confirm that the tenant information and the agent bundles for each tenant are visible:

psql -U acc
select * from tenant;
select tenant_id, count(*) from bundle group by tenant_id;
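
If an interactive session in the pod is not convenient, the same queries could be run through kubectl exec. The sketch below assumes kubectl access, uses <namespace> as a placeholder, and introduces a hypothetical helper named acc_psql. It prints the commands by default and only executes them when DRY_RUN=0 is set.

```shell
NAMESPACE="${NAMESPACE:-<namespace>}"   # placeholder: namespace is not given in the article
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print commands only, 0 = execute them

# Hypothetical helper: runs one SQL statement as user acc inside the database pod.
acc_psql() {
  set -- kubectl -n "$NAMESPACE" exec deploy/ng-acc-configserver-db-deployment -- \
    psql -U acc -c "$1"
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

acc_psql "select * from tenant;"
acc_psql "select tenant_id, count(*) from bundle group by tenant_id;"
```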

Additional Information

https://knowledge.broadcom.com/external/article/190815/aiops-troubleshooting-common-issues-and.html