TKGm Certificate Rotation - Control Plane Nodes

Article ID: 418484

Products

Tanzu Kubernetes Runtime

Issue/Introduction

Overview

If a Kubernetes cluster is operated for more than one year without being upgraded, the internal certificates used by the cluster will expire.
This KB explains the manual certificate rotation procedure for Control Plane (CP) nodes, applicable to both TKGm Management Clusters and Workload Clusters.

  • Starting in TKGm v2.5.x, automatic renewal of Control Plane node certificates is supported
    • Under normal operation, you should not need to follow the procedure in this KB
    • If the target Workload Cluster is Plan-based (Legacy), certificate auto-renewal is not supported, and the manual rotation described in this KB is required
  • This KB has been verified for both ClusterClass-based clusters and Plan-based clusters.

Out of Scope

  • Worker node (kubelet) certificates are automatically renewed by K8s, so they're not covered in this KB
  • TCA (Telco Cloud Automation)
    • Do not apply the procedures in this KB against TCA environments
    • Instead, open a case with the TCA Support Team
  • VKS (vSphere Kubernetes Service)
    • Do not apply the procedures in this KB against VKS environments
    • VKS was previously called "vSphere with Tanzu" and "TKGS"
    • Refer to the KB

Symptoms

If the certificates have already expired, kubectl commands against the kube-apiserver fail with the following error.

kubectl get nodes
#> Unable to connect to the server: x509: certificate has expired or is not yet valid

To verify the certificate expiration dates, run the following on each CP node.

# Log in to all 3 CP nodes, one by one.
ssh capv@${KCP_IPADDR}
sudo -i

kubeadm certs check-expiration
#> CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY
#> admin.conf                 Jan 09, 2025 05:03 UTC   364d            ca
#> apiserver                  Jan 09, 2025 05:03 UTC   364d            ca
#> apiserver-etcd-client      Jan 09, 2025 05:03 UTC   364d            etcd-ca
#> apiserver-kubelet-client   Jan 09, 2025 05:03 UTC   364d            ca
#> controller-manager.conf    Jan 09, 2025 05:03 UTC   364d            ca
#> etcd-healthcheck-client    Jan 09, 2025 05:03 UTC   364d            etcd-ca
#> etcd-peer                  Jan 09, 2025 05:03 UTC   364d            etcd-ca
#> etcd-server                Jan 09, 2025 05:03 UTC   364d            etcd-ca
#> front-proxy-client         Jan 09, 2025 05:03 UTC   364d            front-proxy-ca
#> scheduler.conf             Jan 09, 2025 05:03 UTC   364d            ca
#>
#> CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME
#> ca                      Jan 07, 2034 05:02 UTC   9y
#> etcd-ca                 Jan 07, 2034 05:02 UTC   9y
#> front-proxy-ca          Jan 07, 2034 05:02 UTC   9y
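
The RESIDUAL TIME column can also be screened with a small filter instead of being read by eye. The following is a minimal sketch, not part of the official procedure: the function name `flag_expiring_certs` and the 30-day default threshold are assumptions made for illustration. It relies on the column layout shown above (residual time in the seventh whitespace-separated field) and assumes kubeadm prints `<invalid>` for a certificate that has already expired.

```shell
#!/usr/bin/env bash
# flag_expiring_certs: hypothetical helper, not part of kubeadm.
# Reads `kubeadm certs check-expiration` output on stdin and prints every
# leaf certificate that is expired or has fewer than $1 days left (default 30).
flag_expiring_certs() {
  local threshold="${1:-30}"
  awk -v limit="$threshold" '
    /^CERTIFICATE AUTHORITY/ { exit }     # stop before the CA table
    /^CERTIFICATE/ || NF < 7 { next }     # skip the header and short lines
    {
      r = $7                              # RESIDUAL TIME, e.g. 364d, 9y, <invalid>
      if (r == "<invalid>")            days = -1          # already expired
      else if (r ~ /^[0-9]+y$/)        days = 365 * (r + 0)
      else if (r ~ /^[0-9]+d$/)        days = r + 0
      else if (r ~ /^[0-9]+[hms]$/)    days = 0           # less than one day left
      else next                                           # not a certificate row
      if (days < limit) print $1, r
    }'
}

# Example (run as root on a CP node):
# kubeadm certs check-expiration | flag_expiring_certs 30
```

An empty result means no leaf certificate is below the threshold. The CA table is skipped on purpose, because `kubeadm certs renew all` does not renew the CAs.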


Environment

  • TKGm v2.1.x
  • TKGm v2.2.x
  • TKGm v2.3.x
  • TKGm v2.4.x
  • TKGm v2.5.x

Resolution

1. Switch to the target cluster context

tanzu cluster list -A --include-management-cluster
kubectl config get-contexts
kubectl config use-context <TARGET_CONTEXT>

2. SSH into the target Control Plane node (CP node)

# Check the CP node IP address
kubectl get nodes -owide
KCP=192.168.x.x
ssh capv@${KCP}
sudo -i

# Check the current cert status
kubeadm certs check-expiration
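
When the API server is still reachable, the CP node addresses can be listed directly instead of being picked out of the `-owide` table. This is a hedged sketch: `cp_node_ips` is a hypothetical helper name, and it assumes the nodes carry the kubeadm label `node-role.kubernetes.io/control-plane` and that their InternalIP is the address reachable from the jumpbox.

```shell
#!/usr/bin/env bash
# cp_node_ips: hypothetical helper, not part of kubectl or the Tanzu CLI.
# Prints one InternalIP per control plane node in the current context.
# Assumption: nodes are labeled node-role.kubernetes.io/control-plane.
cp_node_ips() {
  kubectl get nodes \
    -l node-role.kubernetes.io/control-plane \
    -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'
}

# Example:
# cp_node_ips
```

If the certificates have already expired, this call fails with the x509 error shown under Symptoms, and the addresses have to be obtained another way.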

3. Rotate the CP node certificates

kubeadm certs renew all

# Stop the control plane containers; the kubelet restarts them with the renewed certificates
crictl stop $(crictl ps --name kube-apiserver -q)
crictl stop $(crictl ps --name kube-controller-manager -q)
crictl stop $(crictl ps --name kube-scheduler -q)
crictl stop $(crictl ps --name etcd -q)
crictl stop $(crictl ps --name kube-vip -q) # optional when using kube-vip

# Check
kubeadm certs check-expiration
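
Before moving on to the next node, it can help to confirm that the kubelet has brought each stopped container back. The following is a minimal sketch under stated assumptions: `wait_for_running`, the 120-second default timeout, and the 5-second polling interval are illustrative choices, not values from this KB. It relies only on `crictl ps --name ... -q`, which is already used above and lists running containers.

```shell
#!/usr/bin/env bash
# wait_for_running: hypothetical helper, not part of crictl or kubeadm.
# Polls crictl until a running container whose name matches $1 exists,
# giving up after $2 seconds (default 120). Returns 0 on success, 1 on timeout.
wait_for_running() {
  local name="$1" timeout="${2:-120}" waited=0
  # Without -a, `crictl ps` lists only running containers.
  until [ -n "$(crictl ps --name "$name" -q)" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timeout: ${name} is not running after ${timeout}s" >&2
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done
  echo "${name} is running"
}

# Example (run as root on the CP node after the crictl stop commands):
# for c in kube-apiserver kube-controller-manager kube-scheduler etcd; do
#   wait_for_running "$c" || break
# done
```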

4. Repeat steps 2 and 3 on each remaining Control Plane node.
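
Once every node has been processed, the read-only expiration check can be run against all CP nodes from the jumpbox in one pass. This is a sketch, not an official tool: `check_cp_nodes` is a hypothetical helper, the addresses in the example are placeholders, and it assumes the `capv` user can run `sudo` non-interactively over SSH.

```shell
#!/usr/bin/env bash
# check_cp_nodes: hypothetical helper for the jumpbox, not a Tanzu CLI command.
# Runs the read-only expiration check on every CP node IP passed as an argument
# and returns non-zero if any node could not be checked.
# Assumption: capv can run sudo without a password prompt.
check_cp_nodes() {
  local ip failed=0
  for ip in "$@"; do
    echo "=== ${ip} ==="
    # -n keeps ssh from consuming this script's stdin inside the loop
    if ! ssh -n "capv@${ip}" sudo kubeadm certs check-expiration; then
      echo "check failed on ${ip}" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example with placeholder addresses; replace them with the real CP node IPs:
# check_cp_nodes 192.168.x.1 192.168.x.2 192.168.x.3
```

The helper only reads certificate status. The renewal itself is still performed per node as described in step 3.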

5. After rotating the certificates on all 3 CP nodes, update the client certificates used by the Tanzu CLI and kubectl on the jumpbox.

KB - TKGm Certificate Rotation - Tanzu CLI and kubectl

Additional Information