TKG management cluster not able to connect to vCenter after vCenter cert or thumbprint change
Article ID: 337406
Products
VMware Tanzu Kubernetes Grid
Issue/Introduction
Symptoms:
capv-controller-manager pod logs on the management cluster show errors similar to the following:
E1220 17:33:48.452774 1 controller.go:257] controller-runtime/controller "msg"="Reconciler error" "error"="unexpected error while probing vcenter for infrastructure.cluster.x-k8s.io/v1alpha3, Kind=VSphereCluster uploader-prod/uploader-autoscale: Post \"https://VCENTER-FQDN/sdk\": host \"VCENTER-FQDN:443\" thumbprint does not match \"8C:23:7D:32:D7:E5:5B:90:30:54:49:9C:76:EB:1C:37:69:FA:AA:C1\"" "controller"="vspherecluster" "name"="uploader-autoscale" "namespace"="uploader-prod"
You are unable to create or scale any workload clusters
Environment
VMware Tanzu Kubernetes Grid v1.5 to v2.5
Cause
This error occurs because the vCenter certificate thumbprint has changed and has not been updated on the TKG management cluster objects and/or in the workload cluster objects' metadata.
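The thumbprint in the error message is the SHA-1 fingerprint of the vCenter TLS certificate, so the new value needed in the steps below can be read off the current certificate. The sketch below shows the format by fingerprinting a locally generated throwaway certificate; against a live vCenter you would retrieve the certificate with openssl s_client -connect VCENTER-FQDN:443 (VCENTER-FQDN is a placeholder) instead of using a local file.

```shell
# Generate a throwaway self-signed certificate as a stand-in for the
# vCenter certificate (CN is illustrative only).
openssl req -x509 -newkey rsa:2048 -keyout /tmp/vc.key -out /tmp/vc.crt \
  -days 1 -nodes -subj "/CN=vcenter.example.com" 2>/dev/null

# Compute its SHA-1 fingerprint -- the same colon-separated format that
# appears in the capv-controller-manager error and in the cluster objects.
openssl x509 -in /tmp/vc.crt -noout -fingerprint -sha1
```

The last command prints the colon-separated SHA-1 fingerprint; this is the value to substitute for the stale thumbprint in the steps that follow.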
Resolution
This workaround is not valid for TKG stretch clusters deployed by Telco Cloud Automation. Contact VMware Support for assistance with updating TCA stretch cluster deployments.
For TKGm v2.3 and above, use the "tanzu mc credentials update" command to update the thumbprint in the Management Cluster and its Workload Clusters. See the steps in Update Cluster Credentials for more info.
For TKGm v2.1 and v2.2, use the following procedure.
Update the workload cluster and management cluster data with the following steps. This should not impact the currently running nodes, as it only updates the node metadata.
Note: Update the exported yaml file with the new thumbprint value before replacing the secret. As a best practice, verify that the secret contains the new thumbprint after the replace.
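As a minimal local sketch of that edit-and-verify step: the key name tlsThumbprint and the file layout below are illustrative assumptions, so match them to whatever actually appears in your exported yaml file.

```shell
# Sample decoded values file (layout and key name are illustrative).
cat > /tmp/values.yaml <<'EOF'
vsphereCPI:
  server: vcenter.example.com
  tlsThumbprint: "8C:23:7D:32:D7:E5:5B:90:30:54:49:9C:76:EB:1C:37:69:FA:AA:C1"
EOF

# Swap in the new thumbprint (placeholder value here).
NEW_PRINT="AA:BB:CC:DD:EE:FF:00:11:22:33:44:55:66:77:88:99:AA:BB:CC:DD"
sed -i.bak "s/tlsThumbprint: \".*\"/tlsThumbprint: \"${NEW_PRINT}\"/" /tmp/values.yaml

# Verify the new value landed before replacing the secret.
grep tlsThumbprint /tmp/values.yaml
```

Editing the file in a text editor works just as well; the point is to confirm the new thumbprint is in the file before it is fed back into kubectl.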
To update the TLS thumbprint on each workload cluster:
In each of the commands, make sure to replace the string "WC" with your Workload Cluster name.
In the Management Cluster context, update the {clustername}-vsphere-cpi-addon secret:
# Get the actual secret name for your workload cluster
kubectl get secret -A | grep cpi-addon
# Save the data values information of the secret into a yaml file. Make sure
# the secret name matches the one retrieved in the previous command.
kubectl get secret WC-vsphere-cpi-addon -o jsonpath={.data.values\\.yaml} | base64 -d > WC-vsphere-cpi-addon.yml
# Open the yaml file in your favorite editor and change the thumbprint information.
# Update the secret with the modified yaml file.
kubectl create secret generic WC-vsphere-cpi-addon --type=tkg.tanzu.vmware.com/addon --from-file=values.yaml=WC-vsphere-cpi-addon.yml --dry-run=client -o yaml | kubectl replace -f -
# Add labels to the secret
kubectl label secret WC-vsphere-cpi-addon tkg.tanzu.vmware.com/cluster-name=WC
kubectl label secret WC-vsphere-cpi-addon tkg.tanzu.vmware.com/addon-name=vsphere-cpi
In the Workload Cluster context, verify that the secret vsphere-cpi-data-values in the tkg-system namespace has been updated. It should have been reconciled after the secret above was updated.
# The output of this command should show the new thumbprint info
kubectl -n tkg-system get secret vsphere-cpi-data-values -o jsonpath={.data.values\\.yaml} | base64 -d | grep -i thumbprint
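The jsonpath/base64 pipelines above work because the secret stores values.yaml base64-encoded. A quick local illustration of that round trip (file name and contents are arbitrary):

```shell
# Mimic how secret data is stored: base64-encode a small values file...
printf 'tlsThumbprint: "AA:BB:CC"\n' > /tmp/demo-values.yaml
encoded=$(base64 < /tmp/demo-values.yaml)

# ...and decode it back, which is what "| base64 -d" does in the
# kubectl commands above.
echo "$encoded" | base64 -d | grep -i thumbprint
```

This is why the exported file must be re-encoded (which kubectl create secret --from-file does for you) rather than pasted into the secret as plain text.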
Verify the configmap is updated using the below command on the workload cluster context:
# The output of this command should show the new thumbprint info
kubectl -n kube-system get cm vsphere-cloud-config -o yaml
Restart the vsphere-cloud-controller-manager pod so that the new configmap is mounted
Note that the procedures above should be performed in each Workload Cluster.
To update the TLS thumbprint on the management cluster:
In each of the following commands, make sure to replace the string "MC" with your Management Cluster name.
In the Management Cluster context, update the {management-clustername}-vsphere-cpi-data-values secret in the tkg-system namespace
# Get the actual secret name for your cluster
kubectl -n tkg-system get secret | grep vsphere-cpi
# Save the data values information of the secret into a yaml file. Make sure
# the secret name matches the one retrieved in the previous command.
kubectl -n tkg-system get secret MC-vsphere-cpi-data-values -o jsonpath={.data.values\\.yaml} | base64 -d > MC-vsphere-cpi-data-values.yml
# Open the yaml file in your favorite editor and change the thumbprint information.
# Update the secret with the modified yaml file.
kubectl create secret generic MC-vsphere-cpi-data-values -n tkg-system --type=tkg.tanzu.vmware.com/addon --from-file=values.yaml=MC-vsphere-cpi-data-values.yml --dry-run=client -o yaml | kubectl replace -f -
# Add labels to the secret
kubectl label secret MC-vsphere-cpi-data-values -n tkg-system tkg.tanzu.vmware.com/cluster-name=MC
kubectl label secret MC-vsphere-cpi-data-values -n tkg-system tkg.tanzu.vmware.com/addon-name=vsphere-cpi
The vSphere TLS thumbprint also needs to be updated in the vspherecluster or cluster objects, and in the vspherevm objects. These must be updated for all clusters.
In the Management Cluster context, list all the vsphereclusters and clusters, including the management cluster, and note down their names; they will be needed in the next steps.
kubectl get vsphereclusters -A
kubectl get clusters -A
For each of the clusters, edit the vspherecluster OR cluster object and update spec.thumbprint.
If it is a legacy (non-classy) Workload Cluster, edit the vspherecluster object and update spec.thumbprint:
kubectl edit vspherecluster WC
Otherwise, if it is a classy Workload Cluster OR a Management Cluster, edit the cluster object and update spec.thumbprint:
kubectl edit cluster WC
Verify that the update is complete using one of the below commands:
kubectl get vspherecluster WC -o yaml
OR
kubectl get cluster WC -o yaml
Restart the vsphere-cloud-controller-manager pod in the kube-system namespace in the Management Cluster.
Scale down the CAPV deployment in the management cluster using the following command:
kubectl scale deploy -n capv-system capv-controller-manager --replicas=0
Confirm the updates on the VSphereVM objects by checking that the new thumbprint appears in their output.
For earlier TKGm versions, use the following procedure.
Update the workload cluster and management cluster data with the following steps. This should not impact the currently running nodes, as it only updates the node metadata.
Note: Update the exported yaml file with the new thumbprint value before replacing the secret. As a best practice, verify that the secret contains the new thumbprint after the replace.
To update the TLS thumbprint on each workload cluster:
In each of the commands, make sure to replace the string "WC" with your Workload Cluster name.
In the Management Cluster context, update the {clustername}-vsphere-cpi-addon secret:
# Get the actual secret name for your workload cluster
kubectl get secret -A | grep cpi-addon
# Save the data values information of the secret into a yaml file. Make sure
# the secret name matches the one retrieved in the previous command.
kubectl get secret WC-vsphere-cpi-addon -o jsonpath={.data.values\\.yaml} | base64 -d > WC-vsphere-cpi-addon.yml
# Open the yaml file in your favorite editor and change the thumbprint information.
# Update the secret with the modified yaml file.
kubectl create secret generic WC-vsphere-cpi-addon --type=tkg.tanzu.vmware.com/addon --from-file=values.yaml=WC-vsphere-cpi-addon.yml --dry-run=client -o yaml | kubectl replace -f -
# Add labels to the secret
kubectl label secret WC-vsphere-cpi-addon tkg.tanzu.vmware.com/cluster-name=WC
kubectl label secret WC-vsphere-cpi-addon tkg.tanzu.vmware.com/addon-name=vsphere-cpi
In the Workload Cluster context, verify that the secret vsphere-cpi-data-values in the tkg-system namespace has been updated. It should have been reconciled after the secret above was updated.
# The output of this command should show the new thumbprint info
kubectl -n tkg-system get secret vsphere-cpi-data-values -o jsonpath={.data.values\\.yaml} | base64 -d | grep -i thumbprint
Verify the configmap is updated using the below command on the workload cluster context:
# The output of this command should show the new thumbprint info
kubectl -n kube-system get cm vsphere-cloud-config -o yaml
Restart the vsphere-cloud-controller-manager pod so that the new configmap is mounted
Note that the procedures above should be performed in each Workload Cluster.
To update the TLS thumbprint on the management cluster:
In each of the following commands, make sure to replace the string "MC" with your Management Cluster name.
In the Management Cluster context, update the {management-clustername}-vsphere-cpi-addon secret in the tkg-system namespace
# Get the actual secret name for your Management Cluster
kubectl -n tkg-system get secret | grep vsphere-cpi
# Save the data values information of the secret into a yaml file. Make sure
# the secret name matches the one retrieved in the previous command.
kubectl -n tkg-system get secret MC-vsphere-cpi-addon -o jsonpath={.data.values\\.yaml} | base64 -d > MC-vsphere-cpi-addon.yml
# Open the yaml file in your favorite editor and change the thumbprint information.
# Update the secret with the modified yaml file.
kubectl create secret generic MC-vsphere-cpi-addon -n tkg-system --type=tkg.tanzu.vmware.com/addon --from-file=values.yaml=MC-vsphere-cpi-addon.yml --dry-run=client -o yaml | kubectl replace -f -
# Add labels to the secret
kubectl label secret MC-vsphere-cpi-addon -n tkg-system tkg.tanzu.vmware.com/cluster-name=MC
kubectl label secret MC-vsphere-cpi-addon -n tkg-system tkg.tanzu.vmware.com/addon-name=vsphere-cpi
Verify the configmap is updated using the below command:
kubectl -n kube-system get cm vsphere-cloud-config -o yaml
Restart the vsphere-cloud-controller-manager pod so that the new configmap is mounted
The vSphere TLS thumbprint also needs to be updated in the vspherecluster and vspherevm objects. These must be updated for all clusters.
In the Management Cluster context, list all the vsphereclusters, including the management cluster, and note down their names; they will be needed in the next steps.
kubectl get vsphereclusters -A
NAMESPACE    NAME       AGE
default      tkg-test   62d
default      tkg-wld    83d
tkg-system   tkg-mgmt   83d
For each of the clusters, edit the vspherecluster CR and update spec.thumbprint.
kubectl edit vspherecluster WC
Verify that the update is complete using the below command:
kubectl get vspherecluster WC -o yaml
Scale down the CAPV deployment in the management cluster context using the following command:
kubectl scale deploy -n capv-system capv-controller-manager --replicas=0
Update the CAPV validating webhook configuration in the management cluster to allow updates to the VSphereVM objects.
Confirm the updates on the VSphereVM objects by checking that the new thumbprint appears in their output.