metadata-store-app is in CrashLoopBackOff state with error message "Failed to connect to database - certificate has expired or is not yet valid" in the log

Article ID: 297905

Updated On:

Products

VMware Tanzu Application Platform

Issue/Introduction

- Both the metadata-store and tap packages are in the "Reconcile failed" state. The detailed status of the metadata-store package reports a timeout while waiting for deployment/metadata-store-app.

$ tanzu package installed list -n tap-install | grep -v "Reconcile succeeded"
  NAME               PACKAGE-NAME                             PACKAGE-VERSION  STATUS
  metadata-store     metadata-store.apps.#####.######.com     1.4.4            Reconcile failed: Error (see .status.usefulErrorMessage for details)
  tap                tap.###.######.com                     1.4.2            Reconcile failed: Error (see .status.usefulErrorMessage for details)
  
$ kubectl -n tap-install get pkgi metadata-store -o yaml
......
status:
......
  usefulErrorMessage: 'kapp: Error: Timed out waiting after 15m0s for resources: [deployment/metadata-store-app
    (apps/v1) namespace: metadata-store]'
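
To read only the error message instead of scanning the full YAML, a jsonpath query can be used. This is a minimal sketch: the helper name pkgi_error is hypothetical, and it assumes the packages are installed in the tap-install namespace as in the commands above.

```shell
# Print only the reconcile error of an installed package.
# $1 = package install name, $2 = namespace (assumed default: tap-install).
pkgi_error() {
  kubectl -n "${2:-tap-install}" get pkgi "$1" \
    -o jsonpath='{.status.usefulErrorMessage}'
}

# Example: pkgi_error metadata-store
```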

- The metadata-store-app pod is in CrashLoopBackOff status.

$ kubectl -n metadata-store get pods
NAME                                  READY   STATUS             RESTARTS           AGE
metadata-store-app-5c9d99f965-sg7zd   1/2     CrashLoopBackOff   3396 (3m55s ago)   12d
metadata-store-db-0                   1/1     Running            0                  115d

- The metadata-store-app container log reports the error message "Failed to connect to database - certificate has expired or is not yet valid".

$ kubectl -n metadata-store logs metadata-store-app-5c9d99f965-sg7zd metadata-store-app
{"level":"error","ts":"2023-07-10T12:17:15.063089442Z","logger":"MetadataStore.gorm","msg":"failed to initialize database, got error %v","hostname":"metadata-store-app-5c9d99f965-sg7zd","data0":"failed to connect to `host=metadata-store-db user=metadata-store-user database=metadata-store`: tls error (x509: certificate has expired or is not yet valid: current time 2023-07-10T12:17:15Z is after 2023-06-15T09:25:27Z)","error":"sql error"}
{"level":"error","ts":"2023-07-10T12:17:15.063128637Z","logger":"MetadataStore","msg":"Failed to connect to database: failed to connect to `host=metadata-store-db user=metadata-store-user database=metadata-store`: tls error (x509: certificate has expired or is not yet valid: current time 2023-07-10T12:17:15Z is after 2023-06-15T09:25:27Z)","hostname":"metadata-store-app-5c9d99f965-sg7zd","error":"failed to connect to `host=metadata-store-db user=metadata-store-user database=metadata-store`: tls error (x509: certificate has expired or is not yet valid: current time 2023-07-10T12:17:15Z is after 2023-06-15T09:25:27Z)"}
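
The expiry timestamp can be extracted from the log to confirm which certificate validity date the app is rejecting. The filter below is a sketch: cert_expiry_from_log is a hypothetical helper name, and it assumes the log lines keep the "is after <timestamp>" wording shown above.

```shell
# Read log lines on stdin and print each distinct certificate expiry
# timestamp found in x509 "is after <timestamp>" errors.
cert_expiry_from_log() {
  grep -o 'is after [0-9T:Z-]*' | awk '{print $3}' | sort -u
}

# Example (pod name taken from the output above; yours will differ):
#   kubectl -n metadata-store logs metadata-store-app-5c9d99f965-sg7zd metadata-store-app \
#     | cert_expiry_from_log
```

For the log lines shown above, this prints 2023-06-15T09:25:27Z, which is earlier than the current time reported in the same error.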



Environment

Product Version: 1.5

Resolution

- This is a known issue in current TAP versions (at the time of writing, the latest TAP version is 1.6 and it is still affected): cert-manager rotates the certificates, but metadata-store and its PostgreSQL database are unaware of the change and keep using the old certificates. More details regarding the known issue can be found in this link.
- The current workaround is to delete the metadata-store-app pod, the metadata-store-db pod, or both. The recreated pods should load the current certificates, which restores the database connection.
- If deleting the pods does not fully resolve the issue, collect the logs from both pods and engage the VMware Tanzu Support team for further investigation.
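
The workaround and the log collection can be sketched as the shell functions below. The function names are hypothetical, pods are matched by the name prefixes seen in the output above, and deleting the db pod before the app pod is an assumption rather than a documented requirement. Deleting the db pod briefly interrupts the database, so adjust as needed.

```shell
NS="metadata-store"

# List pods (as pod/<name>) in $NS whose name starts with the given prefix.
pods_with_prefix() {
  kubectl -n "$NS" get pods -o name | grep "^pod/$1"
}

# Delete the db and app pods so their controllers recreate them,
# then wait for the app deployment to become ready again.
restart_metadata_store() {
  for pod in $(pods_with_prefix "metadata-store-db-") \
             $(pods_with_prefix "metadata-store-app-"); do
    kubectl -n "$NS" delete "$pod"
  done
  kubectl -n "$NS" rollout status deployment/metadata-store-app --timeout=10m
}

# Save logs of all containers in both pods to <pod-name>.log files
# in the current directory, for attaching to a support case.
collect_logs() {
  for pod in $(pods_with_prefix "metadata-store-db-") \
             $(pods_with_prefix "metadata-store-app-"); do
    kubectl -n "$NS" logs "$pod" --all-containers > "${pod#pod/}.log"
  done
}
```

Run restart_metadata_store first. If the app pod returns to CrashLoopBackOff, run collect_logs and attach the resulting files to the support request.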