After reinstalling DX Platform, everything seems to be working as expected except ACC: the ACC pods remain in CrashLoopBackOff and Init status.
ng-acc-configserver-db-deployment-854d85f7cb-wxq6m 1/1 Running 1 36m
ng-acc-configserver-deployment-654b587dd9-xqmvf 0/1 CrashLoopBackOff 10 29m
ng-acc-repository-deployment-67d4c577db-hqvhn 0/2 Init:0/1 0 27m
We tried to recreate the acc-configserver-db as described in the KB article below, but the problem persists.
DX APM - ng-acc-configserver pods in CrashLoopBackOff due to ACC postgres DB corruption
https://knowledge.broadcom.com/external/article?articleId=144154
The database is up and running, but the ng-acc-configserver-deployment pod log shows the following error message:
org.postgresql.util.PSQLException: FATAL: password authentication failed for user "<user>"
at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:525)
at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:146)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:197)
DX Application Performance Management 20.2
ACC postgres security setting affecting ACC pods communication
How to verify the condition?
1) Try to log in to the ACC database using the service hostname “ng-acc-configserver-db-service” instead of localhost
kubectl exec -ti <ng-acc-configserver-db-deployment-pod> -n<namespace> -- sh
psql -W -h ng-acc-configserver-db-service -U <username> <database>
(enter the password when prompted)
2) Check ACC postgres log
cd <nfs-dir>/acc/cs/db-logs/
ls -ltr
tail -f postgresql-<timestamp>.log
Look for the message: User "user" does not have a valid SCRAM verifier.
3) Check postgres configuration:
kubectl exec -ti <ng-acc-configserver-db-deployment-pod> -n<namespace> -- sh
psql -W -h ng-acc-configserver-db-service -U <username> <database>
cat /var/lib/postgresql/data/pg_hba.conf
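Verification steps 2 and 3 above can also be done with grep instead of reading the files by eye. The following is a minimal sketch, not part of the product: the function names find_scram_errors and active_scram_rules are hypothetical, and it assumes a POSIX shell with grep is available where the files are read.

```shell
# Hypothetical helpers; the function names are illustrative and are not
# part of DX APM or PostgreSQL.

# List the postgres log files in a directory that contain the SCRAM
# verifier message from step 2.
# $1 = log directory, for example <nfs-dir>/acc/cs/db-logs
find_scram_errors() {
  # -l prints only the names of matching files, -F matches a fixed string
  grep -lF 'does not have a valid SCRAM verifier' "$1"/postgresql-*.log 2>/dev/null
}

# Print the active (uncommented) pg_hba.conf rules that use the
# scram-sha-256 method, prefixed with their line numbers.
# $1 = path to pg_hba.conf
active_scram_rules() {
  grep -nE '^[[:space:]]*[^#[:space:]].*[[:space:]]scram-sha-256([[:space:]]|$)' "$1"
}
```

For example, run find_scram_errors <nfs-dir>/acc/cs/db-logs on the NFS host and active_scram_rules /var/lib/postgresql/data/pg_hba.conf inside the DB pod. Both functions return a non-zero exit status when nothing matches, which is standard grep behavior.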
How to resolve the condition?
1) Connect to the postgres DB pod
kubectl get pods -n<namespace> | grep ng-acc-configserver-db-deployment
kubectl exec -ti <ng-acc-configserver-db-deployment-pod> -n<namespace> -- sh
2) Check that you can connect to the postgres DB
psql -U <username> <database>
To exit: \q
3) Edit pg_hba.conf
vi /var/lib/postgresql/data/pg_hba.conf
Comment out the scram-sha-256 line:
#host all all all scram-sha-256
4) Restart the ng-acc-configserver-db and ng-acc-configserver pods
kubectl scale --replicas=0 deployment/ng-acc-configserver-db-deployment -n<namespace>
kubectl scale --replicas=1 deployment/ng-acc-configserver-db-deployment -n<namespace>
kubectl scale --replicas=0 deployment/ng-acc-configserver-deployment -n<namespace>
kubectl scale --replicas=1 deployment/ng-acc-configserver-deployment -n<namespace>
5) Check the apmccsrv and postgres logs
a) postgres logs:
cd <nfs-dir>/acc/cs/db-logs/
ls -ltr (the latest logs appear at the end of the list)
tail -f postgresql-<timestamp>.log
b) apmccsrv log
cd <nfs-dir>/acc/cs/logs/
ls -ltr
tail -f apmccsrv-<pod>.log
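Steps 3 and 4 above (commenting out the scram-sha-256 line and restarting the deployments) could also be scripted. The following is a rough sketch under stated assumptions, not a supported procedure: the function names comment_out_scram and restart_deployment and the KUBECTL override variable are hypothetical, and it assumes the sed available in the DB pod supports the -i and -E options.

```shell
# Hypothetical helpers; the function names and the KUBECTL variable are
# illustrative and are not part of DX APM.

# Comment out every active scram-sha-256 rule in a pg_hba.conf file.
# A backup copy of the original file is kept with a .bak suffix.
# $1 = path to pg_hba.conf
comment_out_scram() {
  sed -i.bak -E \
    's/^([[:space:]]*[^#[:space:]].*[[:space:]]scram-sha-256([[:space:]].*)?)$/#\1/' "$1"
}

# Scale a deployment to 0 replicas and back to 1, then wait for the rollout
# to finish. Set KUBECTL=echo to print the commands instead of running them.
# $1 = namespace, $2 = deployment name
restart_deployment() {
  ${KUBECTL:-kubectl} scale --replicas=0 "deployment/$2" -n "$1" &&
  ${KUBECTL:-kubectl} scale --replicas=1 "deployment/$2" -n "$1" &&
  ${KUBECTL:-kubectl} rollout status "deployment/$2" -n "$1"
}
```

For example, run comment_out_scram /var/lib/postgresql/data/pg_hba.conf inside the DB pod, then run restart_deployment <namespace> ng-acc-configserver-db-deployment followed by restart_deployment <namespace> ng-acc-configserver-deployment from a host with kubectl access. Restarting the DB deployment first mirrors the order of the scale commands in step 4.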
DX AIOPs - Troubleshooting, Common Issues and Best Practices
https://knowledge.broadcom.com/external/article/190815