Manually renew cluster certificates

Article ID: 314178

Products

VMware Telco Cloud Automation

Issue/Introduction

This article describes how to renew expired certificates of management and workload clusters that are deployed and managed by Telco Cloud Automation (TCA).

Starting with TCA 2.3, TCA supports automatic certificate renewal for both management clusters and v2 workload clusters. Refer to Create a v2 Workload Cluster Template for more details.



Environment

2.3

Cause

In some cases, it may be required to manually renew the cluster certificates and/or kubeconfig stored in the TCA database.

Resolution

TCA 2.3 automatically handles the regeneration of the Kubernetes cluster certificates.

Workaround:
Checking if Cluster Certificates are valid:
Run the following commands to query the Kubernetes Cluster Control Plane node for the status and expiration date of Kubernetes Cluster certificates:

  1. SSH to the Control Plane node of the CaaS Cluster and switch to the root user:
ssh capv@K8S-CONTROL-PLANE-IP
sudo -i
  2. Check whether the certificates have expired:
kubeadm certs check-expiration
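
As a rough aid for reading the check-expiration output, the following sketch lists only the entries that are already expired. It assumes that kubeadm marks an expired certificate with <invalid> in the RESIDUAL TIME column; the function name list_expired_certs is illustrative and is not part of kubeadm or TCA.

```shell
# Sketch: list certificates that kubeadm reports as already expired.
# Assumption: expired rows show "<invalid>" in the RESIDUAL TIME column.

# Reads `kubeadm certs check-expiration` output on stdin and prints the
# first column (the certificate name) of every row containing "<invalid>".
list_expired_certs() {
  awk '/<invalid>/ { print $1 }'
}

# Example usage on the Control Plane node (as root):
#   kubeadm certs check-expiration | list_expired_certs
```

An empty result suggests that no certificate has expired yet; the full kubeadm output still shows how much time remains for each one.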


Another verification point is to log in to each Telco Cloud Automation-Control Plane (TCA-CP) Appliance Management UI and check the status of the Kubernetes clusters registered within TCA-CP:

A green dot would mean that the communication is fine and there is no certificate related issue.
A red dot would mean that the communication is broken and there could be a possible certificate related issue.


Updating the Cluster certificate within CaaS
TCA 2.3 introduces automatic renewal of the cluster certificates, provided that the CaaS clusters have been upgraded to one of the supported Tanzu Kubernetes Grid (TKG) cluster versions: 1.24.10, 1.23.16, or 1.22.17.

Clusters upgraded to TKG 1.24.10, 1.23.16 or 1.22.17:
TCA 2.3 will automatically handle the regeneration of the Kubernetes cluster certificates. However, these clusters will still need their kubeconfig updated. Proceed to the Synchronize the kubeconfig for the TCA-Manager (TCA-M) and TCA-CP section.

NOTE: The following steps are applicable only for TKG clusters that were deployed via older versions of TCA and have NOT been upgraded in TCA 2.3.

Renewing the Workload Cluster Certificate

  1. SSH as the admin user into the TCA-CP where the corresponding management cluster is deployed, then switch to the root user:
ssh admin@mgmt_cluster_tca_cp_fqdn
su -
Note: Replace mgmt_cluster_tca_cp_fqdn with the actual value.
  2. Download the attached certificate renewal tool file: cluster-cert-renew.tar.gz.
    Note: For air-gapped environments, manually SCP the file over to the TCA-CP.
  3. Extract the cluster-cert-renew.tar.gz tarball:
tar -zxvf cluster-cert-renew.tar.gz
  4. Renew the workload cluster certificate:
cd /home/admin/cluster-cert-renew
bash cert-renew -wc workload-cluster-name -mc mgmt-cluster-name -t workload

Note: Replace workload-cluster-name and mgmt-cluster-name with the actual values.
  5. Verify that the new workload cluster certificate has been stored on the management cluster:
kubectl config use-context mgmt-cluster-name-admin@mgmt-cluster-name
kubectl get secret workload-cluster-name-kubeconfig -n workload-cluster-name -ojsonpath='{.data.value}' | base64 -d

Note: Replace mgmt-cluster-name-admin, mgmt-cluster-name, and workload-cluster-name with the actual values.
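
The verification command above prints the whole kubeconfig. To inspect the renewed certificate itself, a minimal sketch is shown below. It assumes that the kubeconfig embeds the client certificate in a client-certificate-data field; the function name kubeconfig_client_cert is hypothetical and is not part of the renewal tool.

```shell
# Sketch: pull the client certificate out of a kubeconfig.
# Assumption: the certificate is embedded as base64 in a
# "client-certificate-data:" field.

# Reads a kubeconfig on stdin and prints the decoded client certificate.
kubeconfig_client_cert() {
  awk '$1 == "client-certificate-data:" { print $2; exit }' | base64 -d
}

# Example usage: append to the kubectl ... | base64 -d command from the
# verification step, then print the expiry date with openssl:
#   ... | base64 -d | kubeconfig_client_cert | openssl x509 -noout -enddate
```

A notAfter date in the future would indicate that the stored kubeconfig carries the renewed certificate.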


Renew the Management Cluster Certificates

  1. SSH into the TCA-CP and switch to the root user:
ssh admin@tca_cp
su -
Note: Replace tca_cp with the IP address of the TCA-CP where the management cluster is configured.
  2. Download the cluster-cert-renew scripts tarball to the TCA-CP:
curl -kfsSL https://vmwaresaas.jfrog.io/artifactory/generic-registry/kb/20230413/cluster-cert-renew.tar.gz --output cluster-cert-renew.tar.gz
  3. Extract the cluster-cert-renew.tar.gz tarball and change to the cluster-cert-renew directory:
tar -zxvf cluster-cert-renew.tar.gz
cd /home/admin/cluster-cert-renew
  4. Obtain the Control Plane node IP:
Note: This control-plane-node-ip is different from the static cluster kube-vip IP.
    1. SSH to the management cluster:
ssh capv@mgmt-kube-vip
Note: Replace mgmt-kube-vip with the actual value.
    2. Run the kubectl get nodes command:
kubectl get nodes -owide | grep control-plane | awk '{print $6}' | head -n 1
  5. Renew the management cluster certificate:
bash cert-renew -mc mgmt-cluster-name -t management -ip control-plane-node-ip
Note: Replace mgmt-cluster-name with the actual value and control-plane-node-ip with the IP obtained in the previous step.
Note: This can take several minutes to complete.
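
The node IP lookup above takes the sixth column of the wide node listing, which is INTERNAL-IP in the default kubectl output. The sketch below performs the same selection and adds a basic sanity check before the value is passed to cert-renew. Both function names are illustrative only.

```shell
# Sketch: pick the first control plane node IP from `kubectl get nodes -owide`.
# Assumption: INTERNAL-IP is the sixth column, as in the default wide output.

# Reads the node listing on stdin and prints the sixth column of the first
# row that mentions "control-plane".
first_control_plane_ip() {
  awk '/control-plane/ { print $6; exit }'
}

# Succeeds only if the argument looks like a dotted IPv4 address.
looks_like_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Example usage on the management cluster node:
#   ip=$(kubectl get nodes -owide | first_control_plane_ip)
#   looks_like_ipv4 "$ip" && echo "control-plane-node-ip: $ip"
```

The format check only catches obviously wrong values, such as an empty result or a header word; it does not confirm that the node is reachable.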


Synchronize the kubeconfig for the TCA-Manager (TCA-M) and TCA-CP
Note: All clusters, upgraded and non-upgraded, require the kubeconfig to be synchronized. Steps 1 through 4 apply only to TCA-M, not to TCA-CP.

  1. From any machine that has access to the TCA-M web layer, send the following POST request to generate an authentication token:
curl -D - --location --insecure --request POST 'https://tca-m-url/hybridity/api/sessions' --header 'Accept: application/json' --header 'Content-Type: text/plain' --data-raw '{"username": "username","password": "plain_text_password"}'
Note: Replace tca-m-url, username, and plain_text_password with the actual values.
  2. Take note of the x-hm-authorization value in the output of the previous step:
Sample: 95XXXXX4:dXX2:4XX3:bXX2:7XXXXXXXXXX5
  3. Update the TCA-M and TCA-CP database by synchronizing the kubeconfig:
curl --location --insecure --request POST 'https://tca-m-fqdn/telco/api/caas/v2/clusters/cluster_name/syncKubeconfig' --header 'Accept: application/json' --header 'Content-Type: application/json' --header 'x-hm-authorization: auth-token'

Note: Replace tca-m-fqdn, cluster_name, and auth-token with the actual values.
Note: The operation can take several minutes.
  4. To confirm that the operation succeeded, run the following API call:
curl --location --insecure --request GET 'https://tca-m-fqdn/hybridity/api/jobs/job_id_from_above_response' --header 'Accept: application/json' --header 'x-hm-authorization: auth-token'

Note: Replace tca-m-fqdn, auth-token, and job_id_from_above_response with the actual values.

Note: Check the isDone and didFail flags in the JSON response. The isDone flag should be true and the didFail flag should be false.

  5. SSH into the TCA-CP to restart the services:
ssh admin@tca-cp
su -
Note: Replace tca-cp with the address of the TCA-CP where the cluster is configured.
  6. Restart the following TCA-CP services:
systemctl restart app-engine
systemctl restart web-engine

Note: If there are multiple TCA-CPs (i.e., one for the management cluster and one for the workload cluster), the app-engine and web-engine services should be restarted on both.
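
The token and job-status steps above can be scripted with two small helpers, sketched below. The sketch assumes that the token is returned in an x-hm-authorization response header and that the job JSON carries plain boolean isDone and didFail fields, as the notes above describe. The function names extract_auth_token and job_succeeded are hypothetical and are not part of the TCA API.

```shell
# Sketch: helpers for the token and job-status steps.
# Assumptions: the token is returned in an "x-hm-authorization" response
# header, and the job JSON has plain boolean "isDone" and "didFail" fields.

# Reads HTTP response headers on stdin (for example from `curl -D -`)
# and prints the value of the x-hm-authorization header.
extract_auth_token() {
  tr -d '\r' | awk -F': ' 'tolower($1) == "x-hm-authorization" { print $2; exit }'
}

# Reads the job JSON on stdin. Succeeds only if isDone is true and
# didFail is false.
job_succeeded() {
  json=$(cat)
  printf '%s' "$json" | grep -Eq '"isDone"[[:space:]]*:[[:space:]]*true' &&
    printf '%s' "$json" | grep -Eq '"didFail"[[:space:]]*:[[:space:]]*false'
}
```

Piping the output of the session request into extract_auth_token yields the value to use as auth-token, and piping the job query response into job_succeeded returns a zero exit status only when the synchronization finished without failing. The pattern match is deliberately simple; a JSON-aware tool would be more robust if one is available.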


Additional Information

Impact/Risks: Clusters deployed by TCA 2.X and currently managed by TCA 2.3.

The automatic renewal of TKG cluster certificates can fail for various reasons, one of which is described below.

For the "Control Plane Node Certificate Auto-Renewal" feature to function correctly, new nodes must be rolled out. If any node becomes stuck in a provisioning state, the renewal process will be unsuccessful.

During certificate renewal, control plane nodes are replaced sequentially, with each new node joining the cluster only after the preceding one is fully operational, ensuring the etcd quorum remains intact.
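
The quorum remark above follows from general etcd majority arithmetic rather than anything specific to TCA: a cluster of N members needs floor(N/2) + 1 healthy members and tolerates the loss of the remainder. The sketch below works through that arithmetic; the function names are illustrative.

```shell
# Sketch: etcd majority arithmetic for N control plane members.
# quorum = floor(N / 2) + 1; tolerated failures = N - quorum.

# Prints the number of members required for quorum.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Prints how many members can be lost while quorum is still held.
etcd_fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Example usage:
#   etcd_quorum 3            # prints 2
#   etcd_fault_tolerance 3   # prints 1
```

With three control plane nodes, only one member can be unavailable at a time, which is consistent with replacing nodes one after another.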

Certificate renewal may also fail due to issues with etcd or if a control plane node is unable to complete provisioning.



Attachments

cluster-cert-renew.tar.gz