Velero backups are failing with error "tls: failed to verify certificate: x509: certificate has expired or is not yet valid"

Article ID: 396029

Products

VMware Tanzu Mission Control
VMware Tanzu Mission Control - SM
VMware Tanzu Kubernetes Grid Management

Issue/Introduction

  • Velero scheduled backups are failing for clusters managed by TMC Data Protection.

    # velero backup get

    NAME          STATUS   ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
    BackupName1   Failed   0        0          2025-04-19 12:50:58 +0530 MST   45m       default            <none>
    BackupName1   Failed   0        0          2025-04-20 12:50:58 +0530 MST   45m       default            <none>

  • Describing the failed backup shows that the check for whether the backup already exists in the target backup location failed with the reason "tls: failed to verify certificate: x509: certificate has expired or is not yet valid".

    # kubectl describe backup <backup-name> -n velero

    Name:         Failed-backup-name

    Namespace:    velero


    Failure Reason:        error checking if backup already exists in object storage: rpc error: code = Unknown desc = operation error S3: HeadObject, exceeded maximum number of attempts, 3, https response error StatusCode: 0, RequestID: , HostID: , request send failed, Head "https://{TARGET_LOCATION_URL}/{TARGET_BUCKET}/01DFC1T###########7D5P6Y7VJ/backups/Failed-backup-name/velero-backup.json": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-04-21T09:30:10Z is after 2025-04-18T19:59:39Z

  • Describing the delete backup request shows that the same check against the target backup location failed with the same error, "tls: failed to verify certificate: x509: certificate has expired or is not yet valid".

    # kubectl describe deletebackuprequests -n velero deletebackuprequests-Name

    Name:         deletebackuprequests-Name
    Namespace:    velero
    Labels:       velero.io/backup-name=backup-name
                  velero.io/backup-uid=55######-####-###-####-#########1f
    Annotations:  <none>
    API Version:  velero.io/v1
    Kind:         DeleteBackupRequest

    Status:
      Errors:
        error getting backup's volume snapshots: rpc error: code = Unknown desc = operation error S3: HeadObject, exceeded maximum number of attempts, 3, https response error StatusCode: 0, RequestID: , HostID: , request send failed, Head "https://{TARGET_LOCATION_URL}/{TARGET_BUCKET}/01H#############1E7VJ/backups/BACKUP_NAME-full-20250321233025/BACKUP_NAME-full-20250321233025-volumesnapshots.json.gz": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-04-22T04:19:07Z is after 2025-04-18T19:59:39Z
        error to connect backup repo: error to connect to storage: error retrieving storage config from bucket "TARGET_BUCKET": Get "https://{TARGET_LOCATION_URL}/{TARGET_BUCKET}/1H#############1E7VJ/kopia/{NameSpace}/.storageconfig": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-04-22T04:19:15Z is after 2025-04-18T19:59:39Z
  • Checking the certificate of the target location confirms that the storage location's custom root or CA certificate has expired.
  • Decoding the base64-encoded spec.objectStorage.caCert certificate in the backupstoragelocations Kubernetes object shows that the certificate has expired.

    - Get the backupstoragelocations object in YAML format

    # kubectl get backupstoragelocations.velero.io -n velero -o yaml

    "apiVersion": "velero.io/v1",
    "kind": "BackupStorageLocation",


    "objectStorage": {
        "bucket": "{TARGET_BUCKET}",
        "caCert": "LS0tLS1CRUdJTiBDRVJUSUZJQ0F####################################Jd0xraVE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t",

    - Decode the base64-encoded caCert certificate using the following command


    # echo LS0tLS1CRUdJTiBDRVJUSUZJQ0F####################################Jd0xraVE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t | base64 -d | openssl x509 -noout -dates

    notBefore=Apr 18 19:59:39 2024 GMT
    notAfter=Apr 18 19:59:39 2025 GMT

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
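
The two manual steps above (reading caCert from the BackupStorageLocation and decoding it) can be combined into a single check. The following is a minimal sketch, assuming base64 and openssl are available on the workstation. The function name check_cacert_expiry and its optional look-ahead argument are hypothetical helpers, not part of Velero or TMC, and the usage comment assumes the storage location is named "default" as in the example output. If caCert contains a bundle, openssl x509 inspects only the first certificate.

```shell
# check_cacert_expiry is a hypothetical helper, not part of Velero or TMC.
# Arg 1: base64-encoded PEM certificate (the caCert value).
# Arg 2: optional look-ahead window in seconds (default 0, i.e. "expired now?").
# Prints the notAfter date and returns 0 if the certificate stays valid
# for the whole window, non-zero otherwise.
check_cacert_expiry() {
  printf '%s' "$1" | base64 -d | openssl x509 -noout -enddate -checkend "${2:-0}"
}

# Example usage (assumes the storage location is named "default"):
#   cacert=$(kubectl get backupstoragelocations.velero.io default -n velero \
#     -o jsonpath='{.spec.objectStorage.caCert}')
#   check_cacert_expiry "$cacert" || echo "caCert has expired"
```

Passing a window such as 2592000 (30 days) turns the same check into an early warning before backups start failing.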

Environment

VMware Tanzu Mission Control (TMC)
VMware Tanzu Mission Control - Self Managed (TMC-SM)
VMware Tanzu Kubernetes Grid Management (TKGm)

Cause

The root cause is an expired custom CA certificate configured for the backup target location that is used by Velero. When Velero checks backup metadata or connects to the repository, the TLS handshake fails because of the expired certificate.
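
The failure reason itself records both the time of the attempt and the certificate's expiry time ("current time ... is after ..."). As a minimal sketch, the hypothetical helper below (not a Velero command) extracts the expiry timestamp from such a message with sed, which makes it easy to confirm that every failed backup points at the same expired certificate. The usage comment assumes the failure reason is stored in the status.failureReason field of the Backup object.

```shell
# cert_expiry_from_error is a hypothetical helper, not part of Velero.
# Reads error text on stdin and prints the timestamp that follows "is after",
# which is the notAfter time of the certificate that failed verification.
# Prints nothing when the input contains no such x509 expiry error.
cert_expiry_from_error() {
  sed -n 's/.*certificate has expired or is not yet valid: current time [^ ]* is after \([^ ]*\).*/\1/p'
}

# Example usage (assumes the reason is stored in .status.failureReason):
#   kubectl get backup <backup-name> -n velero \
#     -o jsonpath='{.status.failureReason}' | cert_expiry_from_error
```

The printed value should match the notAfter date obtained by decoding the caCert.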

Resolution

  • Renew the CA certificate of your target location.
  • Update the target location CA certificate in the Tanzu Mission Control console.
    1. In the Tanzu Mission Control console, click Administration in the left navigation pane.
    2. On the Administration page, click the Target Locations tab.
    3. Click the three-dot menu next to the name of your target location and select Edit.
    4. Decode the base64 PEM certificate in the "Custom root/CA certificate" box to confirm whether it has expired and whether it is the same old certificate that exists in the backupstoragelocations.velero.io object in your cluster.
    5. If it is the same old certificate and it has expired, replace the contents of the "Custom root/CA certificate" box with the new custom root or CA certificate that your storage location uses.
    6. Click SAVE.
  • Check whether the backupstoragelocations Kubernetes object in the velero namespace was updated with the new certificate.

    - Get the backupstoragelocations object in YAML format

    # kubectl get backupstoragelocations.velero.io -n velero -o yaml

    "apiVersion": "velero.io/v1",
    "kind": "BackupStorageLocation",


    "objectStorage": {
        "bucket": "{TARGET_BUCKET}",
        "caCert": "LS0tLS1CRUdJTiBDRVJUSUZJQ0F####################################Jd0xraVE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t",

    - Decode the base64-encoded caCert certificate using the following command


    # echo LS0tLS1CRUdJTiBDRVJUSUZJQ0F####################################Jd0xraVE9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0t | base64 -d | openssl x509 -noout -dates

    notBefore= <New Date> GMT
    notAfter= <New Date> GMT

  • If it was not updated, you may need to restart the "velero" pod and the TMC "data-protection" pod, then check again.
  • Once you confirm that the caCert was updated, run a new backup; it should complete successfully.
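
Checking the dates confirms that the stored certificate is no longer expired, but not that it is the certificate you renewed. One way to confirm this is to compare SHA-256 fingerprints. The following is a minimal sketch, assuming base64 and openssl are available. The helper names pem_fingerprint and same_cert and the file name new-ca.pem are hypothetical, and the usage comment assumes the storage location is named "default". Only the first certificate of a bundle is compared.

```shell
# pem_fingerprint and same_cert are hypothetical helpers, not part of Velero or TMC.

# pem_fingerprint: reads a PEM certificate on stdin, prints its SHA-256 fingerprint.
pem_fingerprint() {
  openssl x509 -noout -fingerprint -sha256
}

# same_cert: arg 1 is the base64-encoded caCert from the BackupStorageLocation,
# arg 2 is the path to the renewed PEM certificate file.
# Returns 0 when both are the same certificate, 1 when they differ,
# and 2 when either input cannot be parsed.
same_cert() {
  in_cluster=$(printf '%s' "$1" | base64 -d | pem_fingerprint) || return 2
  renewed=$(pem_fingerprint < "$2") || return 2
  [ "$in_cluster" = "$renewed" ]
}

# Example usage (assumes the storage location is named "default"):
#   cacert=$(kubectl get backupstoragelocations.velero.io default -n velero \
#     -o jsonpath='{.spec.objectStorage.caCert}')
#   same_cert "$cacert" new-ca.pem && echo "renewed certificate is in place"
```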


    Note: If the velero pods keep restarting and show OOMKilled events, increase the memory limit as described in the article Velero backups show Failed status due to velero pod "OOMKilled" crash.