TMC Self-Managed authentication-server pods in CrashLoopBackOff due to expired tmcsm-issuer CA cert

Article ID: 379298

Products

VMware Tanzu Mission Control - SM, VMware Tanzu Mission Control Self-Managed, VMware Tanzu Mission Control, Tanzu Mission Control

Issue/Introduction

Logging in to the TMC Self-Managed UI can fail with the following error when the tmcsm-issuer CA certificate has expired:

{"message":"Could not exchange authorization code"}

The authentication-server pods in the tmc-local namespace may be in CrashLoopBackOff state, and their logs may show a TLS error about an expired certificate:

{"component":"gaz-client","error":"tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-09-30T20:45:15Z is after 2024-09-09T01:09:40Z","http.host":"pinniped-supervisor.example.com","http.request.length_bytes":0,"http.url.path":"/provider/pinniped/.well-known/openid-configuration","level":"warning","msg":"request failed to execute, see err","span.kind":"client","system":"http","time":"2024-09-30T20:45:15Z"}
error: could not initialize tokenreview service: could not initialize OIDC provider: Get "https://pinniped-supervisor.example.com/provider/pinniped/.well-known/openid-configuration": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2024-09-30T20:45:15Z is after 2024-09-09T01:09:40Z

Environment

Tanzu Mission Control Self-Managed

Cause

The ClusterIssuer CA certificate used by the TMC pods has expired. To confirm this, check the tls-ca-bundles configmap in the tmc-local namespace:

$ kubectl -n tmc-local get cm tls-ca-bundles -o jsonpath='{.data.letsencrypt\.pem}' > letsencrypt.pem

Then, check the expiration date of the tmcsm CA from that bundle by using the following command:

$ while openssl x509 -noout -dates -subject ; do :; echo "--------------------------"; done < letsencrypt.pem 2>/dev/null

Example output:

ubuntu@jumpbox:~$ while openssl x509 -noout -dates -subject ; do :; echo "--------------------------"; done < letsencrypt.pem 2>/dev/null
notBefore=Jun  8 20:01:17 2024 GMT
notAfter=Sep  8 20:01:17 2024 GMT
subject=CN = tmcsm
--------------------------
notBefore=Apr 19 18:58:26 2024 GMT
notAfter=Apr 19 18:58:26 2025 GMT
subject=C = US, ST = California, O = VMware, OU = MAPBU Support, CN = harbor.example.com
--------------------------
ubuntu@jumpbox:~$

The output lists all the trusted CAs. Check the validity of the "tmcsm" certificate. If it has expired, it needs to be renewed, and the TMC values must also be updated with the new PEM information.
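
As an alternative to reading the dates by eye, the bundle can be split into one file per certificate and each file checked with `openssl x509 -checkend`. The following is a minimal sketch and not part of the official procedure; the function names `split_pem_bundle` and `report_expiry` and the output directory are illustrative assumptions.

```shell
# Split a PEM bundle into one file per certificate (cert-01.pem, cert-02.pem, ...).
# Anything outside BEGIN/END CERTIFICATE markers is ignored.
split_pem_bundle() {
  bundle_file=$1
  out_dir=$2
  mkdir -p "$out_dir"
  awk -v dir="$out_dir" '
    /-----BEGIN CERTIFICATE-----/ { n++; out = sprintf("%s/cert-%02d.pem", dir, n) }
    out != "" { print > out }
    /-----END CERTIFICATE-----/ { close(out); out = "" }
  ' "$bundle_file"
}

# Print VALID or EXPIRED next to the subject of every certificate in a directory.
# "openssl x509 -checkend 0" exits non-zero when the certificate has already expired.
report_expiry() {
  for cert in "$1"/cert-*.pem; do
    [ -e "$cert" ] || continue
    subject=$(openssl x509 -in "$cert" -noout -subject)
    if openssl x509 -in "$cert" -noout -checkend 0 >/dev/null; then
      echo "VALID   $subject"
    else
      echo "EXPIRED $subject"
    fi
  done
}

# Example usage (assumes letsencrypt.pem was extracted as shown above):
#   split_pem_bundle letsencrypt.pem /tmp/tmc-ca-bundle
#   report_expiry /tmp/tmc-ca-bundle
```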

Note that during the initial TMC Self-Managed installation, the tmcsm-issuer ClusterIssuer certificate is created in the cert-manager namespace. This certificate is managed by cert-manager. You can run the following command to check this in the cert-manager namespace.

$ kubectl -n cert-manager get clusterissuer,certificates,secrets
NAME                                              READY   AGE
clusterissuer.cert-manager.io/selfsigned-issuer   True    101d
clusterissuer.cert-manager.io/tmcsm-issuer        True    101d

NAME                                       READY   SECRET         AGE
certificate.cert-manager.io/tmcsm-issuer   True    tmcsm-issuer   101d

NAME                                 TYPE                             DATA   AGE
secret/cert-manager-registry-creds   kubernetes.io/dockerconfigjson   1      102d
secret/cert-manager-webhook-ca       Opaque                           3      102d
secret/tmcsm-issuer                  kubernetes.io/tls                3      101d

Because the "tmcsm-issuer" certificate is managed by cert-manager, this certificate resource is expected to have been renewed already. To confirm that the certificate is valid and not expired, run the following commands:

$ kubectl -n cert-manager get secret tmcsm-issuer -o jsonpath='{.data.ca\.crt}' | base64 -d > /tmp/tmcsm-ca.crt
$ openssl x509 -in /tmp/tmcsm-ca.crt -noout -dates -subject
notBefore=Aug  9 15:21:17 2024 GMT
notAfter=Nov  9 15:21:17 2024 GMT
subject=CN = tmcsm

However, the TMC tls-ca-bundles configmap sources the CA from the TMC values secret (which was sourced from a static values.yaml file during the initial installation), so it still contains the old CA. The TMC values need to be updated with the new CA PEM to resolve the issue.
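
One way to confirm that the configmap bundle is stale is to check whether the CA currently stored in the tmcsm-issuer secret appears in the bundle at all. The sketch below compares the base64 bodies of the PEM blocks as plain text, so it needs no cluster access once the two files from the commands above (letsencrypt.pem and /tmp/tmcsm-ca.crt) exist. The function names `pem_bodies` and `bundle_contains_ca` are illustrative assumptions.

```shell
# Print one line per certificate in a PEM file: the base64 body with
# line breaks and stray whitespace removed.
pem_bodies() {
  awk '
    /-----BEGIN CERTIFICATE-----/ { body = ""; inside = 1; next }
    /-----END CERTIFICATE-----/   { if (inside) print body; inside = 0; next }
    inside { gsub(/[ \t\r]/, ""); body = body $0 }
  ' "$1"
}

# Succeed if the first certificate in the CA file is also present in the bundle.
bundle_contains_ca() {
  bundle_file=$1
  ca_file=$2
  want=$(pem_bodies "$ca_file" | head -n 1)
  [ -n "$want" ] && pem_bodies "$bundle_file" | grep -Fx -- "$want" >/dev/null
}

# Example usage:
#   if bundle_contains_ca letsencrypt.pem /tmp/tmcsm-ca.crt; then
#     echo "bundle already trusts the current tmcsm CA"
#   else
#     echo "bundle is stale: the current tmcsm CA is missing"
#   fi
```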

Resolution

Follow these steps to update the TMC values yaml with the new valid tmcsm CA certificate.


    1. Retrieve the current tmcsm certificate PEM information from the secret.

      $ kubectl -n cert-manager get secret tmcsm-issuer -o jsonpath='{.data.ca\.crt}' | base64 -d > new-tmcsm-issuer-ca.crt


    2. Find the "tmc-values.yaml" file that was used during the initial installation. Edit the file and replace the old CA with the new one retrieved in step 1. The CA is in the "trustedCAs" section. If the file can no longer be found, retrieve the values from the secret named "tanzu-mission-control-tmc-local-values".


    3. Run the "update" command using the tanzu CLI to update the tanzu-mission-control package with the updated tmc-values.yaml file.
      $ tanzu package installed update tanzu-mission-control -p tmc.tanzu.vmware.com --values-file tmc-values.yaml --namespace tmc-local
    4. Force the reconciliation of the TMC packages
      $ tanzu package installed kick tmc-local-stack-secrets -n tmc-local
      $ tanzu package installed kick tmc-local-stack -n tmc-local
      $ tanzu package installed kick tmc-local-support -n tmc-local
      $ tanzu package installed kick tanzu-mission-control -n tmc-local
    5. Restart all the deployments in tmc-local namespace
      $ for deployment in $(kubectl -n tmc-local get deploy --no-headers | awk '{print $1}'); do echo "$deployment"; kubectl -n tmc-local rollout restart deploy "$deployment"; done
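
After the restart, it can help to wait until every deployment has finished rolling out before retrying the UI login. The following is a minimal sketch, assuming a 300-second timeout per deployment is acceptable in your environment; the function name `wait_for_rollouts` and the `KUBECTL` override variable are illustrative and not part of the product tooling.

```shell
# Allow the kubectl binary to be overridden (assumption: plain "kubectl" by default).
KUBECTL=${KUBECTL:-kubectl}

# Wait for every deployment in a namespace to finish rolling out.
# Returns non-zero if any deployment does not complete within the timeout.
wait_for_rollouts() {
  ns=$1
  failed=0
  for d in $("$KUBECTL" -n "$ns" get deploy -o name); do
    if ! "$KUBECTL" -n "$ns" rollout status "$d" --timeout=300s; then
      echo "rollout not complete: $d" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example usage:
#   wait_for_rollouts tmc-local && echo "all deployments rolled out"
```

If a deployment is reported as not complete, its pod logs are the next place to look for the TLS error described in the Issue section.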

Additional Information

NOTE: 

The tmcsm certificate has a default duration of 90 days, and cert-manager renews it automatically. As a result, the resolution steps need to be repeated roughly every 3 months. The Certificate resource can be updated to use a longer duration (e.g., 1 year validity). If this is preferred, follow these steps to change the duration from the default of 90 days before starting the resolution steps.

  1. Run the following command to patch the Certificate resource (tmcsm-issuer) so that it has a duration of 1 year. The temporary 'renewBefore' value also forces the certificate to be renewed.
    $ kubectl -n cert-manager patch certificate tmcsm-issuer --type=merge -p '{"spec":{"duration":"8760h","renewBefore":"2159h00m00s"}}'
  2. Confirm that the certificate has been regenerated and has a duration of 1 year
    $ kubectl -n cert-manager get secret tmcsm-issuer -o jsonpath='{.data.ca\.crt}' | base64 -d > /tmp/tmcsm-ca.crt
    $ openssl x509 -in /tmp/tmcsm-ca.crt -noout -dates -subject
    notBefore=Oct  9 15:21:17 2024 GMT
    notAfter=Oct  9 15:21:17 2025 GMT
    subject=CN = tmcsm


  3. Run the following command to remove the 'renewBefore' setting from the Certificate resource. It was added in step 1 only to force the regeneration of the certificate.
    $ kubectl -n cert-manager patch certificate tmcsm-issuer --type=json -p='[{"op": "remove", "path": "/spec/renewBefore"}]'
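
The values in the patch from step 1 follow from simple hour arithmetic. The sketch below assumes that cert-manager schedules renewal at the certificate's expiry time minus 'renewBefore', and that the existing certificate was issued with the default 90-day duration; the variable names are illustrative.

```shell
# Validity periods expressed in hours, as used in the Certificate spec.
default_validity_h=$((90 * 24))   # lifetime of the existing certificate
one_year_h=$((365 * 24))          # the new spec.duration
renew_before_h=2159               # temporary value used in the patch

# Assumption: renewal is due at notAfter - renewBefore. For the existing
# 90-day certificate that point is only this many hours after issuance,
# which is already in the past, so the certificate is reissued right away.
hours_after_issue=$((default_validity_h - renew_before_h))

echo "duration=${one_year_h}h renewal_due_after=${hours_after_issue}h"
```

Once 'renewBefore' is removed in step 3, cert-manager falls back to its default renewal timing for the new one-year certificate.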