How to renew the TKGI validator secret certificate included in the pks-system namespace

Article ID: 335085

Updated On:

Products

VMware Tanzu Kubernetes Grid

Issue/Introduction

Symptoms:

This procedure describes how to rotate the TKGI certificates stored in the event-controller, fluent-bit, and validator secrets, together with their CA, pks-ca. It applies to TKGI version 1.11.x and later.
 


 


Resolution

The certificates for event-controller, fluent-bit, validator, and pks-ca are rotated automatically when you upgrade the TKGI tile to 1.16.1, 1.15.5, or 1.14.7 and then upgrade the TKGI clusters.
https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid-Integrated-Edition/1.14/tkgi/GUID-release-notes.html#1-14-0-secrets-not-rotated

 

 


Workaround:

1- Back up the secrets that you will rotate (event-controller, fluent-bit, validator, and pks-ca) using the commands below. The -o yaml flag is needed so that the file contains the full secret rather than the one-line table summary.

kubectl get secrets -n pks-system event-controller -o yaml > event-controller.yaml
cp event-controller.yaml event-controller-backup.yaml

kubectl get secrets -n pks-system fluent-bit -o yaml > fluent-bit.yaml
cp fluent-bit.yaml fluent-bit-backup.yaml

kubectl get secrets -n pks-system validator -o yaml > validator.yaml
cp validator.yaml validator-backup.yaml

kubectl get secrets -n pks-system pks-ca -o yaml > pks-ca.yaml
cp pks-ca.yaml pks-ca-backup.yaml


2- To rotate the event-controller, fluent-bit, and validator certificates, delete the event-controller, fluent-bit, and validator secrets. To rotate pks-ca as well, also delete the pks-ca secret.
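
The deletion in step 2 could be scripted roughly as follows. This is a minimal sketch, not a command set taken from the product documentation: the DRY_RUN and INCLUDE_CA variables and the run helper are hypothetical names introduced for this example. By default the script only prints the command it would run, so it can be reviewed before anything is deleted.

```shell
#!/bin/sh
# Sketch: delete the secrets so that the cert-generator job can recreate them.
# Assumption: DRY_RUN and INCLUDE_CA are illustrative knobs, not TKGI settings.
set -eu

NAMESPACE="pks-system"
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the command, 0 = execute it
INCLUDE_CA="${INCLUDE_CA:-0}"  # 1 = also delete the pks-ca secret

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

delete_secrets() {
  secrets="event-controller fluent-bit validator"
  if [ "$INCLUDE_CA" = "1" ]; then
    secrets="$secrets pks-ca"
  fi
  # $secrets is intentionally unquoted so that each name is a separate argument.
  run kubectl delete secret -n "$NAMESPACE" $secrets
}

delete_secrets
```

Running it with DRY_RUN=0 performs the deletion; adding INCLUDE_CA=1 also removes pks-ca. Take the backups from step 1 before doing either.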

3- Re-run the cert-generator job to generate a new CA and new certificates.

To re-run the job, save it as a YAML backup, delete the job, and then apply the backup YAML.

Note: Before applying, edit the cert-generator backup YAML and remove the two controller-uid lines, spec.selector.matchLabels.controller-uid and spec.template.metadata.labels.controller-uid. The job can then be applied successfully with the kubectl command below.

kubectl apply -f cert-generator-backup.yaml
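
The backup, edit, delete, and apply sequence in step 3 might look like the sketch below. It assumes the job is named cert-generator and lives in the pks-system namespace, and it removes every line containing the string controller-uid with sed instead of editing by hand. The DRY_RUN variable, the run helper, and the .raw file name are hypothetical additions for this example. In dry-run mode (the default) nothing on the cluster is touched.

```shell
#!/bin/sh
# Sketch: re-run the cert-generator job from an edited backup.
# Assumptions: job name, namespace, and the DRY_RUN knob are illustrative.
set -eu

NAMESPACE="pks-system"
JOB="cert-generator"
BACKUP="cert-generator-backup.yaml"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Drop every line that mentions controller-uid from a manifest read on stdin.
strip_controller_uid() {
  sed '/controller-uid/d'
}

recreate_job() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ kubectl get job $JOB -n $NAMESPACE -o yaml > $BACKUP.raw"
    echo "+ strip_controller_uid < $BACKUP.raw > $BACKUP"
  else
    # Keep the untouched export next to the edited copy.
    kubectl get job "$JOB" -n "$NAMESPACE" -o yaml > "$BACKUP.raw"
    strip_controller_uid < "$BACKUP.raw" > "$BACKUP"
    # Refuse to delete the job if the edited backup is empty.
    [ -s "$BACKUP" ] || { echo "backup is empty, aborting" >&2; exit 1; }
  fi
  run kubectl delete job "$JOB" -n "$NAMESPACE"
  run kubectl apply -f "$BACKUP"
}

recreate_job
```

Review the edited YAML before running with DRY_RUN=0, since the sed filter removes any line containing controller-uid, not only the two named in the note.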

4- Restart event-controller, fluent-bit, and validator, then verify that their certificates show a new expiration date by running the commands below.


$ kubectl get secrets event-controller  -o json -n pks-system  | jq -r '.data."tls.crt"' | base64 -d | openssl x509 -text | grep 'Before\|After'
            Not Before: Mar 25 09:45:00 2020 GMT
            Not After : Mar 25 09:45:00 2023 GMT
$ kubectl get secrets pks-ca  -o json -n pks-system  | jq -r '.data."tls.crt"' | base64 -d | openssl x509 -text | grep 'Before\|After'
            Not Before: Mar 25 09:45:00 2020 GMT
            Not After : Mar 24 09:45:00 2025 GMT
$ kubectl get secrets fluent-bit  -o json -n pks-system  | jq -r '.data."tls.crt"' | base64 -d | openssl x509 -text | grep 'Before\|After'
            Not Before: Mar 25 09:45:00 2020 GMT
            Not After : Mar 25 09:45:00 2023 GMT
$ kubectl get secrets validator -o json -n pks-system  | jq -r '.data."tls.crt"' | base64 -d | openssl x509 -text | grep 'Before\|After'
            Not Before: Mar 25 09:45:00 2020 GMT
            Not After : Mar 25 09:45:00 2023 GMT
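
The restart in step 4 is not spelled out above. One way to do it is sketched below, on the assumption that event-controller and validator run as Deployments and fluent-bit runs as a DaemonSet in pks-system. Confirm the workload kinds first with kubectl get deployments,daemonsets -n pks-system. The DRY_RUN variable and run helper are hypothetical names for this example, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: restart the workloads so they load the regenerated certificates.
# Assumption: workload kinds below are not confirmed by this article.
set -eu

NAMESPACE="pks-system"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

restart_workloads() {
  run kubectl rollout restart deployment/event-controller -n "$NAMESPACE"
  run kubectl rollout restart deployment/validator -n "$NAMESPACE"
  run kubectl rollout restart daemonset/fluent-bit -n "$NAMESPACE"
  # Wait for the validator webhook to come back before applying sinks.
  run kubectl rollout status deployment/validator -n "$NAMESPACE"
}

restart_workloads
```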
  
 
 


Additional Information

Impact/Risks:

If the validator certificate has expired, any logsink, clusterLogsink, metricsink, or clusterMetricsink applied after the expiration date fails with the error message: x509: certificate has expired or is not yet valid

For example:

ubuntu@opsmanager-2-10:~$ k apply -f logsink.yaml
Error from server (InternalError): error when creating "logsink.yaml": Internal error occurred: failed calling webhook "log.validator.pksapi.io": failed to call webhook: Post "https://validator.pks-system.svc:443/logsink?timeout=10s": x509: certificate has expired or is not yet valid: current time 2023-04-04T06:08:45Z is after 2023-04-04T04:58:00Z
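
To catch this condition before a sink fails, the certificate's remaining lifetime can be checked with openssl x509 -checkend. The sketch below is illustrative only: the cert_status function and its 30-day default window are hypothetical, and the script demonstrates the check on a throwaway self-signed certificate so that it runs without cluster access.

```shell
#!/bin/sh
# Sketch: classify a PEM certificate as expired, expiring, or ok.
# Assumption: cert_status and the 30-day default are illustrative choices.
set -eu

# $1 = path to a PEM certificate, $2 = warning window in days (default 30)
cert_status() {
  pem="$1"
  days="${2:-30}"
  # Fail early if the file is not a readable certificate.
  if ! openssl x509 -noout -in "$pem" >/dev/null 2>&1; then
    echo "unreadable"
    return 1
  fi
  # -checkend N exits non-zero if the certificate expires within N seconds.
  if ! openssl x509 -checkend 0 -noout -in "$pem" >/dev/null 2>&1; then
    echo "expired"
  elif ! openssl x509 -checkend "$((days * 86400))" -noout -in "$pem" >/dev/null 2>&1; then
    echo "expiring"
  else
    echo "ok"
  fi
}

# Demo on a throwaway certificate that is valid for one day.
tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=demo" \
  -keyout "$tmp/key.pem" -out "$tmp/crt.pem" >/dev/null 2>&1

cert_status "$tmp/crt.pem" 30   # inside the 30-day warning window
cert_status "$tmp/crt.pem" 0    # no warning window, only checks for expiry

# Against a cluster, the input file could be produced with the same pipeline
# used in step 4, for example:
#   kubectl get secrets validator -o json -n pks-system \
#     | jq -r '.data."tls.crt"' | base64 -d > validator.crt
#   cert_status validator.crt 30
```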