Postgres Database in some Telco Cloud Automation (TCA) instances is constantly crashing and fails to start
# To check the status of the Postgres pods, log in to TCA using SSH and run the following commands in the shell.
export KUBECONFIG=/home/admin/.kube/config
kubectl get pods -n {namespace}
# namespace is tca-mgr for TCA Manager and tca-system for TCA Control Plane.
----------------------------------------------
For example:
[admin@tca-mgr~]$ kubectl get pods -n tca-mgr
NAME                         READY   STATUS             RESTARTS        AGE
postgresql-ha-postgresql-0   0/1     CrashLoopBackOff   12 (2m9s ago)   24d   # postgres pod is in crashing state
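When a namespace contains many pods, it can help to filter the listing down to the pods that are not healthy. The following is a minimal sketch, not part of the product: the function name `not_running_pods` is hypothetical, and it assumes the default column layout of `kubectl get pods` (NAME in column 1, STATUS in column 3).

```shell
#!/bin/sh
# Sketch: read `kubectl get pods` output on stdin and print the name and
# status of every pod whose STATUS column is not "Running".
# The function name is hypothetical; the column positions assume the
# default `kubectl get pods` layout.
not_running_pods() {
  # NR > 1 skips the header row; $1 is NAME, $3 is STATUS.
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Usage on the TCA appliance (namespace as described above):
#   export KUBECONFIG=/home/admin/.kube/config
#   kubectl get pods -n tca-mgr | not_running_pods
```

A crashing Postgres pod would then appear as a line such as `postgresql-ha-postgresql-0 CrashLoopBackOff`. Pods in other non-running states (for example, completed jobs) are listed as well.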
This causes authentication to fail on TCA Manager if the customer uses Active Directory based authentication. The TCA UI shows the error "Invalid user name or password".
# Error returned by TCA when trying to log in.
/hybridity/api/sessions
{
  "timestamp" : "2023-05-01T20:19:25.210+00:00",
  "status" : 401,
  "error" : "Unauthorized",
  "message" : "com.vmware.tca.keycloak.KeycloakRestClientException: [GET_ACCESS_TOKEN_FAILED]: Failed to login user. realm: orchestrationAD, username: *****, password: *****. Error: Connect to localhost:8180 [localhost/127.0.0.1] failed: Connection refused (Connection refused)",
  "path" : "/hybridity/api/sessions"
}
2.2.0
This happens if the secrets of the Postgres service have been replaced or if older Docker images of Postgres are present.
Resolved in TCA 2.3
For existing TCA 2.2.0 deployments, follow the instructions below:
bash +x forceRestartPostgres.sh
Note: This script applies only to TCA 2.2.0. It can take up to 15 minutes, depending on the environment, because it brings down Postgres, deletes the older Postgres images, deletes the Postgres persistent volume claim, and then force starts Postgres.
# To check the status of the Postgres pods, log in to TCA using SSH and run the following commands in the shell.
export KUBECONFIG=/home/admin/.kube/config
kubectl get pods -n {namespace}
# namespace is tca-mgr for TCA Manager and tca-system for TCA Control Plane.
----------------------------------------------
For example:
[admin@tca-mgr~]$ kubectl get pods -n tca-system
NAME                         READY   STATUS    RESTARTS   AGE
postgresql-ha-postgresql-0   1/1     Running   0          4d16h   # postgres pod is in running state
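Instead of re-running `kubectl get pods` by hand while the script finishes, the check above can be wrapped in a blocking wait. This is a sketch under assumptions: the function name `wait_for_postgres` and the `KUBECTL` override variable are hypothetical, the pod name is taken from the example output above, and the 900-second timeout simply mirrors the "up to 15 minutes" estimate in the note.

```shell
#!/bin/sh
# Sketch: block until the Postgres pod reports the Ready condition.
# wait_for_postgres and the KUBECTL variable are hypothetical helpers;
# the pod name comes from the example output in this article.
wait_for_postgres() {
  ns="$1"   # tca-mgr for TCA Manager, tca-system for TCA Control Plane
  # `kubectl wait` returns non-zero if the condition is not met in time.
  "${KUBECTL:-kubectl}" wait --for=condition=Ready \
    pod/postgresql-ha-postgresql-0 -n "$ns" --timeout=900s
}

# Usage on the TCA appliance:
#   export KUBECONFIG=/home/admin/.kube/config
#   wait_for_postgres tca-system && echo "Postgres pod is ready"
```

If the wait times out, fall back to the manual `kubectl get pods` check to see the current pod state.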