Postgres Database in some Telco Cloud Automation (TCA) instances is constantly crashing and fails to start
# To check the status of the Postgres pods, log in to TCA using SSH and run the following commands in the shell.
export KUBECONFIG=/home/admin/.kube/config
kubectl get pods -n {namespace}
# namespace is tca-mgr for TCA Manager and tca-system for TCA Control Plane.
----------------------------------------------
For example:
[admin@tca-mgr~]$ kubectl get pods -n tca-mgr
NAME READY STATUS RESTARTS AGE
postgresql-ha-postgresql-0 0/1 CrashLoopBackOff 12 (2m9s ago) 24d # postgres pod is in crashing state
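When a namespace contains many pods, the crashing ones can be filtered out of the listing. The following is a minimal sketch, not part of the official procedure. The helper name crashing_pods is hypothetical, and it assumes the default column layout shown in the example above (NAME, READY, STATUS as the first three columns).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with TCA): print the names of pods whose
# STATUS column is CrashLoopBackOff. It reads `kubectl get pods` output on
# stdin, so it also works on output saved to a file.
crashing_pods() {
  # NR > 1 skips the header row; $1 is NAME and $3 is STATUS.
  awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# Usage on the appliance (namespace as described above):
#   export KUBECONFIG=/home/admin/.kube/config
#   kubectl get pods -n tca-mgr | crashing_pods
```

An empty result means that no pod in the namespace currently reports CrashLoopBackOff.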
This causes TCA authentication to fail on TCA Manager if the customer is using Active Directory based authentication. The TCA UI shows an error saying "Invalid user name or password".
# Error returned by TCA when trying to log in.
/hybridity/api/sessions
{
"timestamp" : "2023-05-01T20:19:25.210+00:00",
"status" : 401,
"error" : "Unauthorized",
"message" : "com.vmware.tca.keycloak.KeycloakRestClientException: [GET_ACCESS_TOKEN_FAILED]: Failed to login user. realm: orchestrationAD, username: *****, password: *****. Error: Connect to localhost:8180 [localhost/127.0.0.1] failed: Connection refused (Connection refused)",
"path" : "/hybridity/api/sessions"
}
2.2.0
This happens if the secrets of the Postgres service have been replaced or if older Docker images of Postgres are present.
Resolved in TCA 2.3
For existing TCA 2.2.0 deployments, follow the instructions below:
bash +x forceRestartPostgres.sh
Note: This script applies only to TCA 2.2.0. It can take up to 15 minutes, depending on the environment, because it brings down Postgres, deletes older Postgres images, deletes the persistent volume claim of Postgres, and then force-starts Postgres.
# To check the status of the Postgres pods, log in to TCA using SSH and run the following commands in the shell.
export KUBECONFIG=/home/admin/.kube/config
kubectl get pods -n {namespace}
# namespace is tca-mgr for TCA Manager and tca-system for TCA Control Plane.
----------------------------------------------
For example:
[admin@tca-mgr~]$ kubectl get pods -n tca-system
NAME READY STATUS RESTARTS AGE
postgresql-ha-postgresql-0 1/1 Running 0 4d16h # postgres pod is in running state
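The check above can also be scripted. The following is a minimal sketch under stated assumptions, not an official verification tool. The helper name postgres_ready is hypothetical. It assumes that the Postgres pod names start with postgresql-ha-postgresql- (as in the example output) and that the listing uses the default column layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with TCA): exit 0 only if at least one
# Postgres pod is listed and every listed Postgres pod is Running with all of
# its containers ready (for example 1/1). Reads `kubectl get pods` output on
# stdin.
postgres_ready() {
  awk '
    # Assumption: Postgres pods are named postgresql-ha-postgresql-<n>.
    $1 ~ /^postgresql-ha-postgresql-/ {
      found = 1
      split($2, ready, "/")          # READY column, e.g. "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END {
      if (found && !bad) exit 0
      exit 1
    }'
}

# Usage on the appliance (namespace as described above):
#   export KUBECONFIG=/home/admin/.kube/config
#   if kubectl get pods -n tca-system | postgres_ready; then
#     echo "Postgres is running"
#   else
#     echo "Postgres is not ready yet"
#   fi
```

Because the script can take several minutes to finish, the check may need to be repeated until it succeeds.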