When upgrading a TKGm cluster from Kubernetes 1.24.10 to 1.25.7, the control plane and worker node upgrades complete successfully. However, the overall lifecycle task fails during the PostConfig phase with the following error message:
Upgrade failed at postconfig phase. failed to create tca-kubecluster-operator dummy CR on cluster, err:: error: no context exists with the name: "<MC-name-admin@MC-name>". You can retry to upgrade the cluster.
TCA 3.2
The issue is caused by an incorrect kubeconfig string stored in the database layer of the caas-spoke pod.
Update the management cluster kubeconfig in the caas_spoke pod by following the steps below:
Retrieve the Kubeconfig
SSH into the management cluster control plane node as the capv user.
Get the kubeconfig secret content for the management cluster:
kubectl get secret <cluster name>-kubeconfig -n tkg-system -ojsonpath="{.data.value}"
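Because the upgrade error complains about a missing context, it may be worth confirming that the retrieved value decodes cleanly and actually contains that context before storing it. The following is a minimal sketch, not part of the official procedure: kubeconfig_has_context is a hypothetical helper name, and it only performs a loose text match on "name: <context>" lines rather than parsing the YAML.

```shell
# Hypothetical helper (not a TCA or kubectl command).
# $1: base64-encoded kubeconfig, $2: expected context name.
# Returns 0 if a "name: <context>" line is found, 1 if not,
# and 2 if the value cannot be base64-decoded.
kubeconfig_has_context() {
  decoded=$(printf '%s' "$1" | base64 -d 2>/dev/null) || return 2
  # Match lines whose last two fields are "name:" and the context name.
  printf '%s\n' "$decoded" | awk -v ctx="$2" '
    NF >= 2 && $(NF-1) == "name:" && $NF == ctx { found = 1 }
    END { exit !found }'
}
```

For example, after saving the output of the kubectl command above in a variable named KUBECONFIG_B64 (an assumed name), running kubeconfig_has_context "$KUBECONFIG_B64" with the context name reported in the error message as the second argument should return 0.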
Retrieve the Management Cluster ID
SSH into the TCA-CP node as the admin user.
Run the following commands to obtain the management cluster ID:
export MC_NAME={management_cluster_name}
MC_POD_IP=$(kubectl get pod -A -o wide | grep k8s-bootstrapper | awk '{print $9}')
curl http://${MC_POD_IP}:8888/api/v1/managementclusters | jq -r --arg name "$MC_NAME" '.[] | select(.clusterName == $name) | .id'
Note: Replace {management_cluster_name} with your actual management cluster name.
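The awk field index used to extract MC_POD_IP depends on how kubectl lays out its columns (for instance, how the RESTARTS column is rendered), so the variable may occasionally hold something other than an IP address. As a hedged sanity check before calling curl, a small helper such as the one below could be used. The name is_ipv4 is hypothetical and the check is purely syntactic.

```shell
# Hypothetical helper: returns 0 if $1 looks like a dotted-quad IPv4 address.
is_ipv4() {
  # Reject empty strings, non-digit/dot characters, and misplaced dots.
  case "$1" in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac
  # Split on dots into positional parameters, then restore IFS.
  old_ifs=$IFS; IFS=.
  set -- $1
  IFS=$old_ifs
  [ "$#" -eq 4 ] || return 1
  # Each octet must be at most three digits and no greater than 255.
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
}
```

A possible use is: is_ipv4 "$MC_POD_IP" || echo "MC_POD_IP does not look like an IP address: $MC_POD_IP". If the check fails, inspect the output of kubectl get pod -A -o wide directly and set MC_POD_IP by hand.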
Update the TCA-CP Database
Access the PostgreSQL pod:
kubectl exec -it -n tca-cp-cn postgres-0 -- bash
Launch the PostgreSQL shell:
psql -t caas_spoke
Disable the pager (optional but recommended for large outputs):
\pset pager off
Update the base64-encoded kubeconfig in the database:
UPDATE management_cluster_kube_configs
SET val = '{valid_kubeconfig_base64_encoded_string}'
WHERE id = '{management_cluster_id}';
Note:
{valid_kubeconfig_base64_encoded_string}: Replace this with the base64-encoded kubeconfig obtained in the "Retrieve the Kubeconfig" step.
{management_cluster_id}: Replace this with the ID obtained in the "Retrieve the Management Cluster ID" step.
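Since the kubeconfig string is long, a copy-and-paste slip (a stray line break or quote) can easily corrupt the statement. The sketch below shows one way to assemble the UPDATE statement from the two values with basic validation first. build_update_sql is a hypothetical helper, and the assumption that the management cluster ID contains only letters, digits, hyphens, and underscores is mine, not something stated by the product documentation.

```shell
# Hypothetical helper: prints the UPDATE statement for the given values.
# $1: base64-encoded kubeconfig (single line), $2: management cluster ID.
build_update_sql() {
  # The value must be non-empty and contain only base64 characters,
  # which also rules out embedded newlines and quotes.
  case "$1" in
    ''|*[!A-Za-z0-9+/=]*) echo "invalid base64 value" >&2; return 1 ;;
  esac
  # Assumption: IDs consist of letters, digits, hyphens, and underscores.
  case "$2" in
    ''|*[!A-Za-z0-9_-]*) echo "unexpected characters in id" >&2; return 1 ;;
  esac
  printf "UPDATE management_cluster_kube_configs SET val = '%s' WHERE id = '%s';\n" "$1" "$2"
}
```

The printed statement can then be pasted into the psql session opened earlier. If the function reports an error instead, re-copy the value from its source rather than editing it by hand.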