Rotate TKG classy cluster’s TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE certificates



Article ID: 313137


Products

VMware

Issue/Introduction

Limitations
This certificate rotation procedure is supported only while the cluster is healthy and the certificate has not yet expired. If the cluster is unhealthy and the certificate has expired, nodes might fail to come up because containerd cannot pull images.

There is currently no documented procedure for rotating expired certificates.
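
Because the procedure depends on the certificate not having expired, it can help to check the remaining validity before starting. The following is a minimal sketch, assuming `openssl` is installed on the workstation; `cert_valid_for` is a hypothetical helper name and is not part of TKG.

```shell
# Sketch only: cert_valid_for is a hypothetical helper, not a TKG command.
# Usage: cert_valid_for <pem-file> [seconds]
# Succeeds if the certificate is still valid <seconds> from now (default 0).
cert_valid_for() {
  local cert_file="$1"
  local seconds="${2:-0}"
  # "openssl x509 -checkend N" exits 0 when the certificate will NOT expire
  # within N seconds, and non-zero when it will (or already has).
  openssl x509 -checkend "$seconds" -noout -in "$cert_file" >/dev/null
}
```

For example, `cert_valid_for old-ca.crt 604800 && echo "valid for at least 7 more days"` (the file name is a placeholder). Note that `openssl x509` reads only the first certificate in a file, so run the check against each certificate separately rather than against a combined bundle.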



Symptoms:

TKG currently supports the following parameters for configuring external certificates in classy clusters:

  • TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE
  • TKG_PROXY_CA_CERT
  • CUSTOM_TDNF_REPOSITORY_CERTIFICATE
  • ADDITIONAL_IMAGE_REGISTRY_1_CA_CERTIFICATE (TKG 2.2)
  • ADDITIONAL_IMAGE_REGISTRY_2_CA_CERTIFICATE (TKG 2.2)
  • ADDITIONAL_IMAGE_REGISTRY_3_CA_CERTIFICATE (TKG 2.2)


This document describes how to rotate TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE.

Since these are external certificates, the most straightforward solution is the traditional three-step certificate rotation method. In general:

  1. The customer updates the cluster variable with the old and new certificates encoded together in a single string. CAPI then rolls out the cluster so that the nodes pick up the update.
  2. The customer installs the new certificate on the external service being rotated (the proxy, the image repository, or the tdnf repository).
  3. The customer updates the cluster variable again with only the new certificate. CAPI rolls out the cluster once more to apply the update.
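
The only difference between steps 1 and 3 is which certificates go into the encoded string. A minimal sketch of a helper that builds that string is shown here; `make_ca_bundle` is a hypothetical name, and `tr -d '\n'` is used instead of `base64 -w0` only so that the sketch also works where the `-w` flag is unavailable.

```shell
# Sketch only: make_ca_bundle is a hypothetical helper, not a TKG command.
# Usage: make_ca_bundle <pem-file> [<pem-file> ...]
# Concatenates the given PEM files in order and prints them as one
# single-line base64 string.
make_ca_bundle() {
  # Strip the line wrapping that base64 adds by default, so the result
  # can be pasted as a single value.
  cat "$@" | base64 | tr -d '\n'
}
```

With placeholder file names, step 1 would use `make_ca_bundle old-ca.crt new-ca.crt` and step 3 would use `make_ca_bundle new-ca.crt`.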


Environment

VMware Tanzu Kubernetes Grid 2.1.1
VMware Tanzu Kubernetes Grid 2.2.0
VMware Tanzu Kubernetes Grid 2.3.0
VMware Tanzu Kubernetes Grid 2.4.0
VMware Tanzu Kubernetes Grid 2.1.0

Resolution

Phase 1

  1. Update the management cluster
    a. Update `cluster.spec.topology.variables.trust.additionalTrustedCAs[0].name['imageRepository'].data` (the `data` field of the `additionalTrustedCAs` entry named `imageRepository`) with the old and new certificates, base64-encoded together.
      i. Put both certificates into a single file, for example:
-----BEGIN CERTIFICATE-----
Old
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
New
-----END CERTIFICATE-----
 
      ii. Encode the file with base64. The same base64 value is reused in later steps.
$ cat <certificate-file-name> | base64 -w0
## The following string stands in for the encoded value in the examples below
AAAAB3NzaC1yc2EAAAADAQABAAAA
      iii. Update the management cluster’s Cluster object
kubectl edit cluster <cluster-name> -n <namespace>
 
      iv. This triggers a cluster rollout, which might take a while.
kapp-controller picks up the correct certificates from the cluster’s annotations, so its configuration does not need to be modified.
Monitor the cluster rollout:
kubectl get nodes
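
Rather than re-running `kubectl get nodes` by hand, the check can be polled. The following is a rough sketch, assuming the current kubeconfig context points at the cluster being rolled out; `all_nodes_ready` and `wait_for_nodes` are hypothetical helper names. It is only a coarse signal: every node reporting `Ready` does not by itself prove that all nodes have been replaced.

```shell
# Sketch only: both helpers are hypothetical, not TKG commands.

# Succeeds only if at least one node exists and every node's STATUS
# column is exactly "Ready" (so "Ready,SchedulingDisabled" or "NotReady"
# counts as not ready).
all_nodes_ready() {
  kubectl get nodes --no-headers 2>/dev/null |
    awk '$2 != "Ready" { bad = 1 } END { exit ((NR == 0 || bad) ? 1 : 0) }'
}

# Usage: wait_for_nodes [attempts] [interval-seconds]
# Polls all_nodes_ready until it succeeds or the attempts run out.
wait_for_nodes() {
  local attempts="${1:-60}"
  local interval="${2:-30}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if all_nodes_ready; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}
```

For example, `wait_for_nodes 60 30 || echo "nodes not ready after 30 minutes"`.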

 
 
    b. Update the TKr secret field tkr-source-controller-values.caCerts with the old and new certificates together
      i. Get the secret content:
kubectl get secret -n tkg-system tkr-source-controller-values  -ojsonpath='{.data.values\.yaml}' | base64 -d > tkr-source-controller-values-content
      ii. In tkr-source-controller-values-content, set the ‘caCerts’ field to the encoded value of the old and new certificates generated in step a.ii, for example:
caCerts: AAAAB3NzaC1yc2EAAAADAQABAAAA
      iii. Encode the content of tkr-source-controller-values-content
cat tkr-source-controller-values-content | base64 -w0
      iv. Put the encoded content back into the tkr-source-controller-values secret
kubectl edit secret tkr-source-controller-values -n tkg-system
## edit .data.values.yaml
      v. Restart tkr-source-controller, for example:
kubectl delete pod -n tkg-system --selector=app=tkr-source-controller
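
The edit, re-encode, and write-back steps for a secret can also be scripted. The sketch below is a non-interactive alternative to `kubectl edit`, not the method this article prescribes. `set_yaml_value` and `patch_secret_key` are hypothetical helper names, and the sketch assumes the key being changed sits on a single line with a plain scalar value, as in the examples in this article.

```shell
# Sketch only: both helpers are hypothetical, not TKG commands.

# Usage: set_yaml_value <file> <key> <value>
# Rewrites every line of the form "<indent><key>: ..." to "<indent><key>: <value>".
# Assumes a single-line scalar and a base64 value (no "|", "&" or "\" characters).
set_yaml_value() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)"
  sed -E "s|^([[:space:]]*${key}:).*|\1 ${value}|" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage: patch_secret_key <namespace> <secret> <data-key> <file>
# Base64-encodes <file> and stores it under .data["<data-key>"] of the secret
# using a JSON merge patch.
patch_secret_key() {
  local ns="$1" secret="$2" data_key="$3" file="$4" encoded
  encoded="$(base64 < "$file" | tr -d '\n')"
  kubectl patch secret "$secret" -n "$ns" --type merge \
    -p "{\"data\":{\"${data_key}\":\"${encoded}\"}}"
}
```

For the secret in this step, that would be `set_yaml_value tkr-source-controller-values-content caCerts "<encoded-bundle>"` followed by `patch_secret_key tkg-system tkr-source-controller-values values.yaml tkr-source-controller-values-content`. Review the modified file before patching, since the `sed` expression matches the key at any indentation level.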
 
    c. Update tkg-pkg with the base64-encoded old and new certificates generated in step a.ii
      i. Get the content and save it to a file, for example tkg-pkg-values
kubectl get secret -n tkg-system tkg-pkg-tkg-system-values -ojsonpath='{.data.tkgpackagevalues\.yaml}' | base64 -d > tkg-pkg-values
      ii. Set tkrSourceControllerPackage.tkrSourceControllerPackageValues.caCerts and configvalues.TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE to the base64-encoded certificates
tkrSourceControllerPackage:
  tkrSourceControllerPackageValues:
    caCerts: AAAAB3NzaC1yc2EAAAADAQABAAAA
configvalues:
  TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: AAAAB3NzaC1yc2EAAAADAQABAAAA
      iii. Encode the file again with base64
cat tkg-pkg-values | base64 -w0
      iv. Update the secret tkg-pkg-tkg-system-values with the value generated above
kubectl edit secret -n tkg-system tkg-pkg-tkg-system-values
## edit .data.tkgpackagevalues.yaml
 
  2. Update the workload cluster
    a. Repeat step 1.a, this time targeting the workload cluster’s Cluster object


Phase 2
Update the certificate on the registry server that the customer manages.


Phase 3
Repeat Phase 1, but this time encode only the new certificate. The old certificate is no longer needed.


Verification
  1. The cluster rolls out successfully in Phase 3
  2. Make sure tkr-source-controller and kapp-controller report no errors
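
One way to look for errors is to scan recent pod logs. The following is a rough sketch; `count_log_errors` is a hypothetical helper name. The `app=tkr-source-controller` selector in `tkg-system` comes from the restart command in Phase 1, while the selector and namespace for kapp-controller are not given in this article and must be supplied for your environment.

```shell
# Sketch only: count_log_errors is a hypothetical helper, not a TKG command.
# Usage: count_log_errors <namespace> <label-selector> [lines]
# Prints how many of the last <lines> log lines (default 500) from the
# matching pods contain the word "error", case-insensitively.
count_log_errors() {
  local ns="$1" selector="$2" lines="${3:-500}"
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the
  # helper from failing in that case while still printing "0".
  kubectl logs -n "$ns" --selector="$selector" --tail="$lines" 2>/dev/null |
    grep -ci 'error' || true
}
```

For example, `count_log_errors tkg-system app=tkr-source-controller`. A non-zero count is only a prompt to read the logs, since a plain text match can also hit harmless lines that merely mention the word.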