- Both the metadata-store and tap packages are in the "Reconcile failed" state. The detailed status of the metadata-store package reports a timeout while waiting for deployment/metadata-store-app.
$ tanzu package installed list -n tap-install | grep -v "Reconcile succeeded"
NAME            PACKAGE-NAME                          PACKAGE-VERSION  STATUS
metadata-store  metadata-store.apps.#####.######.com  1.4.4            Reconcile failed: Error (see .status.usefulErrorMessage for details)
tap             tap.###.######.com                    1.4.2            Reconcile failed: Error (see .status.usefulErrorMessage for details)

$ kubectl -n tap-install get pkgi metadata-store -o yaml
......
status:
  ......
  usefulErrorMessage: 'kapp: Error: Timed out waiting after 15m0s for resources: [deployment/metadata-store-app (apps/v1) namespace: metadata-store]'
- The metadata-store-app pod is in CrashLoopBackOff status.
$ kubectl -n metadata-store get pods
NAME                                  READY  STATUS            RESTARTS             AGE
metadata-store-app-5c9d99f965-sg7zd   1/2    CrashLoopBackOff  3396 (3m55s ago)     12d
metadata-store-db-0                   1/1    Running           0                    115d
- The metadata-store-app container log reports the error message "Failed to connect to database - certificate has expired or is not yet valid".
$ kubectl -n metadata-store logs metadata-store-app-5c9d99f965-sg7zd metadata-store-app
{"level":"error","ts":"2023-07-10T12:17:15.063089442Z","logger":"MetadataStore.gorm","msg":"failed to initialize database, got error %v","hostname":"metadata-store-app-5c9d99f965-sg7zd","data0":"failed to connect to `host=metadata-store-db user=metadata-store-user database=metadata-store`: tls error (x509: certificate has expired or is not yet valid: current time 2023-07-10T12:17:15Z is after 2023-06-15T09:25:27Z)","error":"sql error"}
{"level":"error","ts":"2023-07-10T12:17:15.063128637Z","logger":"MetadataStore","msg":"Failed to connect to database: failed to connect to `host=metadata-store-db user=metadata-store-user database=metadata-store`: tls error (x509: certificate has expired or is not yet valid: current time 2023-07-10T12:17:15Z is after 2023-06-15T09:25:27Z)","hostname":"metadata-store-app-5c9d99f965-sg7zd","error":"failed to connect to `host=metadata-store-db user=metadata-store-user database=metadata-store`: tls error (x509: certificate has expired or is not yet valid: current time 2023-07-10T12:17:15Z is after 2023-06-15T09:25:27Z)"}
- This is related to a known issue in current TAP versions (at the time of writing, the latest version, TAP 1.6, is still affected): cert-manager rotates the certificates, but metadata-store and the PostgreSQL database are unaware of the change and keep using the old certificates. More details on the known issue can be found in this link.
- The current workaround is to delete the metadata-store-app pod, the metadata-store-db pod, or both. Once the pods are recreated, the packages should return to a healthy state.
- If deleting these pods does not fully resolve the issue, collect the logs from both pods and engage the VMware Tanzu Support team for further investigation.
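
The workaround and the log collection described above can be sketched as two small shell helpers. This is a minimal sketch, not an official procedure: the function names `restart_pods` and `collect_logs` and the default output directory `./metadata-store-logs` are illustrative assumptions, and pods are matched by name prefix because the generated pod suffix differs per cluster.

```shell
#!/usr/bin/env bash
# Sketch: restart the metadata-store pods and, if needed, save their logs.
# Assumptions (not from the article): helper names and the output directory.

NS="metadata-store"

# Delete every pod in $NS whose name starts with the given prefix.
# The pod's controller is expected to recreate it afterwards.
restart_pods() {
  local prefix="$1" pod
  for pod in $(kubectl -n "$NS" get pods -o name | grep "^pod/${prefix}"); do
    kubectl -n "$NS" delete "$pod"
  done
}

# Save the logs of all containers of the matching pods, one file per pod.
collect_logs() {
  local prefix="$1" outdir="${2:-./metadata-store-logs}" pod
  mkdir -p "$outdir"
  for pod in $(kubectl -n "$NS" get pods -o name | grep "^pod/${prefix}"); do
    kubectl -n "$NS" logs "$pod" --all-containers > "${outdir}/${pod#pod/}.log"
  done
}

# Example usage (uncomment to run against a cluster):
#   restart_pods metadata-store-app
#   restart_pods metadata-store-db
#   kubectl -n "$NS" get pods
#   collect_logs metadata-store-app
#   collect_logs metadata-store-db
```

After restarting the pods, re-run `kubectl -n metadata-store get pods` and `tanzu package installed list -n tap-install` to confirm whether the pod and the packages have recovered. If they have not, the files written by `collect_logs` can be attached to the support request.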