Rotate TKG classy cluster’s TKG_PROXY_CA_CERT certificates

Article ID: 313135

Updated On: 11-10-2023

Products

VMware Tanzu Kubernetes Grid

Issue/Introduction

Limitations
This certificate rotation procedure is supported only while the cluster is healthy and the certificate has not yet expired. If the cluster is unhealthy and the certificate has expired, nodes might fail to come up because containerd cannot pull images.

There is currently no documented procedure for rotating certificates that have already expired.

 


Symptoms:

TKG currently supports the following parameters for configuring external certificates in classy clusters:

  • TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE
  • TKG_PROXY_CA_CERT
  • CUSTOM_TDNF_REPOSITORY_CERTIFICATE
  • ADDITIONAL_IMAGE_REGISTRY_1_CA_CERTIFICATE (TKG 2.2)
  • ADDITIONAL_IMAGE_REGISTRY_2_CA_CERTIFICATE (TKG 2.2)
  • ADDITIONAL_IMAGE_REGISTRY_3_CA_CERTIFICATE (TKG 2.2)

This document describes how to rotate TKG_PROXY_CA_CERT.

Because these are external certificates, the most straightforward approach is the traditional three-step certificate rotation. In general:

  1. The customer updates the cluster variable with the old and new certificates encoded together in a single string. CAPI then rolls out the cluster so that it picks up the update.
  2. The customer installs the new certificate on the external service being rotated (proxy, image repository, or tdnf repository).
  3. The customer updates the cluster variable with only the new certificate. CAPI then rolls out the cluster so that it picks up the update.
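
The two encoded values used in steps 1 and 3 can be produced along the following lines. This is a minimal sketch, not part of the official procedure: the function names `encode_ca_bundle` and `count_certs` and the file names in the usage comments are hypothetical, and `base64 -w0` assumes GNU coreutils.

```shell
# encode_ca_bundle FILE...
# Concatenates the given PEM files in order and prints the result as a
# single-line base64 string. "awk 1" re-emits every line with a trailing
# newline, so a file that lacks a final newline cannot glue its END line
# to the next file's BEGIN line.
encode_ca_bundle() {
  awk 1 "$@" | base64 -w0
}

# count_certs ENCODED_VALUE
# Decodes a value produced by encode_ca_bundle and prints how many PEM
# certificate blocks it contains (a quick sanity check before applying it).
count_certs() {
  printf '%s' "$1" | base64 -d | grep -c -- '-----BEGIN CERTIFICATE-----' || true
}

# Usage (file names are placeholders):
#   step 1:  encode_ca_bundle old-ca.pem new-ca.pem   # old and new together
#   step 3:  encode_ca_bundle new-ca.pem              # new certificate only
```

Checking the count before each update (two for step 1, one for step 3) is a cheap way to catch a bundle that was assembled from the wrong files.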


Environment

VMware Tanzu Kubernetes Grid 2.2.0
VMware Tanzu Kubernetes Grid 2.1.1
VMware Tanzu Kubernetes Grid 2.4.0
VMware Tanzu Kubernetes Grid 2.3.0
VMware Tanzu Kubernetes Grid 2.1.0

Resolution

Phase 1

  1. Update the management cluster.
    a. Update `cluster.spec.topology.variables.trust.additionalTrustedCAs[0].name[proxy].data` (the additionalTrustedCAs entry named proxy) with the old and new certificates, base64-encoded together.
      i. Put both certificates into a single file, for example:
-----BEGIN CERTIFICATE-----
Old
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
New
-----END CERTIFICATE-----

      ii. Encode the file with base64. This base64 value is reused in later steps.
cat <certificate-file-name> | base64 -w0
## the string below stands in for the encoded value in the following examples
AAAAB3NzaC1yc2EAAAADAQABAAAA

      iii. Edit the management cluster's Cluster object and set the variable above to the encoded value.
kubectl edit cluster <cluster-name> -n <namespace>

      iv. This triggers a cluster rollout, which might take a while. kapp-controller picks up the correct certificates from the cluster's annotations, so its configuration does not need to be modified. Monitor the rollout:
kubectl get nodes

    b. Update the caCerts field of the TKr secret tkr-source-controller-values with the old and new certificates together.
      i. Get the secret content:
kubectl get secret -n tkg-system tkr-source-controller-values -ojsonpath='{.data.values\.yaml}' | base64 -d > tkr-source-controller-values-content

      ii. In tkr-source-controller-values-content, set the caCerts field to the encoded value generated in step 1.a.ii, for example:
caCerts: AAAAB3NzaC1yc2EAAAADAQABAAAA

      iii. Encode the content of tkr-source-controller-values-content:
cat tkr-source-controller-values-content | base64 -w0

      iv. Put the encoded content back into the tkr-source-controller-values secret:
kubectl edit secret tkr-source-controller-values -n tkg-system
## edit .data.values.yaml

      v. Restart tkr-source-controller, for example:
kubectl delete pod -n tkg-system --selector=app=tkr-source-controller

    c. Update tkg-pkg with the base64-encoded old and new certificates generated in step 1.a.ii.
      i. Get the content and save it to a file, for example tkg-pkg-values:
kubectl get secret -n tkg-system tkg-pkg-tkg-system-values -ojsonpath='{.data.tkgpackagevalues\.yaml}' | base64 -d > tkg-pkg-values

      ii. Set tkrSourceControllerPackage.tkrSourceControllerPackageValues.caCerts and configvalues.TKG_PROXY_CA_CERT to the base64-encoded certificates:
tkrSourceControllerPackage:
  tkrSourceControllerPackageValues:
    caCerts: AAAAB3NzaC1yc2EAAAADAQABAAAA
configvalues:
  TKG_PROXY_CA_CERT: AAAAB3NzaC1yc2EAAAADAQABAAAA

      iii. Encode the file again with base64:
cat tkg-pkg-values | base64 -w0

      iv. Update the secret tkg-pkg-tkg-system-values with the value generated above:
kubectl edit secret -n tkg-system tkg-pkg-tkg-system-values
## edit .data.tkgpackagevalues.yaml

    d. If the customer also uses pinniped and passes the proxy certificate to pinniped through the secret field upstream_oidc_tls_ca_data, the pinniped secret must be updated as well.
      i. Save the secret content to a file:
kubectl get secret -n tkg-system <management-cluster-name>-pinniped-package -ojsonpath='{.data.values\.yaml}' | base64 -d > pinniped-package-values

      ii. Set pinniped.upstream_oidc_tls_ca_data to the base64-encoded old and new certificates:
pinniped:
  upstream_oidc_tls_ca_data: AAAAB3NzaC1yc2EAAAADAQABAAAA

      iii. Encode the file with base64, as in the previous steps, and update the secret:
cat pinniped-package-values | base64 -w0
kubectl edit secret -n tkg-system <management-cluster-name>-pinniped-package
## edit .data.values.yaml

  2. Update the workload cluster.
    a. Repeat step 1.a, this time targeting the workload cluster's Cluster object.
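
The re-encode and edit-the-secret steps above can also be done non-interactively. The following is a sketch of one possible alternative to `kubectl edit`, not something the original procedure prescribes: it uses `kubectl patch` with a JSON merge patch, and the helper names `build_secret_patch` and `apply_secret_patch` are hypothetical. It assumes GNU `base64` and that the edited values file is already on disk.

```shell
# build_secret_patch KEY FILE
# Prints a JSON merge patch that sets .data[KEY] of a Secret to the
# single-line base64 encoding of FILE.
build_secret_patch() {
  key="$1"
  file="$2"
  encoded="$(base64 -w0 < "$file")"
  printf '{"data":{"%s":"%s"}}' "$key" "$encoded"
}

# apply_secret_patch NAMESPACE SECRET KEY FILE
# Applies the patch built above to the named Secret.
apply_secret_patch() {
  kubectl patch secret "$2" -n "$1" --type merge -p "$(build_secret_patch "$3" "$4")"
}

# Usage, with the secret names and keys from the steps above:
#   apply_secret_patch tkg-system tkr-source-controller-values values.yaml tkr-source-controller-values-content
#   apply_secret_patch tkg-system tkg-pkg-tkg-system-values tkgpackagevalues.yaml tkg-pkg-values
```

Printing the output of `build_secret_patch` first makes it possible to review the patch before it is applied.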
 

Phase 2

Update the certificate on the external server that the customer manages (the proxy, in the case of TKG_PROXY_CA_CERT) so that it presents the new certificate.

Phase 3

Repeat Phase 1, but this time encode only the new certificate. The old certificate is no longer needed.



Verification
  1. The cluster rolls out successfully in Phase 3.
  2. There are no errors in tkr-source-controller or kapp-controller. Note that kapp-controller is deployed on both the workload cluster and the management cluster.
  3. If pinniped goes through the proxy, make sure there are no errors in the pinniped supervisor pod and that tanzu auth works.
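
One way to look for certificate problems in controller logs is to filter them for the `x509` marker that Go TLS errors carry (for example "x509: certificate signed by unknown authority"). The helper below is a rough sketch under that assumption. The name `find_cert_errors` is hypothetical, and a clean result does not prove that the controllers are healthy.

```shell
# find_cert_errors
# Reads log text on stdin and prints every line that mentions "x509"
# (case-insensitive). Exit status is 0 if at least one line matched,
# non-zero otherwise, following grep's convention.
find_cert_errors() {
  grep -i 'x509'
}

# Usage, with the pod selector from the restart step in Phase 1.
# --tail=-1 requests the full log, since kubectl limits output to the
# last 10 lines per pod when a selector is given:
#   kubectl logs -n tkg-system --selector=app=tkr-source-controller --tail=-1 | find_cert_errors
```

The same filter can be applied to the kapp-controller and pinniped supervisor logs, once the right namespace and selector for those pods have been confirmed in the environment.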