Error: "failed with newly created client is not active" during Node Pool creation/customization

Article ID: 399339


Products

VMware Telco Cloud Automation, VMware Telco Cloud Platform

Issue/Introduction

  • Node Pool creation/customization stuck in processing

  • Node stuck in reconfiguration

  • "Show more" under the node pool’s CR view show:

    Get client for mc <mc_name> failed with newly created client is not active

  • kubectl logs -n tca-cp-cn kbs-###-####-##### shows the following error (see the quick check after this list for locating the pod):

    2025-05-20T07:38:28.769263519Z stderr F 2025-05-20T07:38:28.756715Z     error   nodepolicy/mchelper.go:327      Get client for mc <mc_name> failed with newly created client is not active

  • TKG operator logs show:
    stdout F Apr xx 19:28:52 [1] : [Err-controller] : error getting rest config for management cluster context, err : [context "<mc_name>-admin@<mc_name>" does not exist]
    stdout F Apr xx 19:28:52 [1] : [Err-controller] : unable to get kubeconfig of cluster [######-####-####-####-########:<mc_name>], err: management cluster [<mc_name>] rest config is nil.

  • Under Policies in CaaS Infrastructure for the workload cluster, the node policy shows in "Provisioning" or "Deleting" state.
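
To quickly confirm the symptom from the logs, a check along these lines can help (the kbs- pod name prefix and the error string are taken from the symptoms above; the exact pod suffix varies per deployment):

    # Find the kbs pod in the tca-cp-cn namespace
    kubectl get pods -n tca-cp-cn | grep kbs

    # Search its logs for the client-activation error
    kubectl logs -n tca-cp-cn <kbs_pod_name> | grep "newly created client is not active"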

Environment

TCA: 3.2, 3.3

TCP: 5.0, 5.0.1

Cause

  • The TCA-CP K8s database contains an outdated management cluster kubeconfig. When the application engine attempts to communicate with the management cluster via K8s, it uses this old kubeconfig (a verification sketch follows this list).

  • This issue occurs because not all databases within TCA-CP that store the management cluster's kubeconfig were updated during the certificate auto-renewal process. Ideally, the renewed kubeconfig should be propagated to all relevant databases used by various services.

  • Starting with TCA 3.2, the kubeconfig is stored in databases, whereas in earlier versions it was stored on the TCA-CP file system.
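
To verify that the stored kubeconfig is indeed stale, the validity of its embedded client certificate can be checked. A minimal sketch, assuming the postgres-0 pod, caas_spoke database, and management_cluster_kube_configs table named in the workaround below, and that base64 and openssl are available on the TCA-CP node:

    # Dump the stored (base64-encoded) kubeconfig from the TCA-CP database
    kubectl exec -n tca-cp-cn postgres-0 -- \
      psql -t caas_spoke -c "SELECT val FROM management_cluster_kube_configs WHERE id = '<management_cluster_id>';" \
      | tr -d '[:space:]' | base64 -d > /tmp/stored-kubeconfig.yaml

    # Check the validity window of the embedded client certificate;
    # an expired certificate indicates the stored kubeconfig was never refreshed
    grep client-certificate-data /tmp/stored-kubeconfig.yaml | awk '{print $2}' \
      | base64 -d | openssl x509 -noout -dates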

Resolution

Resolved in TCA 3.3.0.1 and TCA 3.4

Workaround

For 3.2: Apply the Patch Tool for TCA 3.2.0.1 (refer to the corresponding KB article).

For 3.3: Update the Management Cluster Kubeconfig in the TCA-CP Database

  1. Retrieve the Kubeconfig

    1. SSH into the management cluster control plane node as the capv user.

    2. Get the kubeconfig secret content for the management cluster:
      kubectl get secret <cluster name>-kubeconfig -n tkg-system -o jsonpath="{.data.value}"

      Note: The secret value is already base64-encoded; use the output as-is in Step 3.
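
      For convenience, the value can be captured to a file and sanity-checked before it is written to the database. A minimal sketch (the file path is illustrative, and openssl is assumed to be available):

      # Capture the base64-encoded kubeconfig for use in Step 3
      kubectl get secret <cluster name>-kubeconfig -n tkg-system -o jsonpath="{.data.value}" > /tmp/mc-kubeconfig.b64

      # Optional: confirm the embedded client certificate has not expired
      base64 -d /tmp/mc-kubeconfig.b64 | grep client-certificate-data | awk '{print $2}' \
        | base64 -d | openssl x509 -noout -enddate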

  2. Retrieve the Management Cluster ID

    1. SSH into the TCA-CP node as the admin user.

    2. Run the following commands to obtain the management cluster ID:

      export MC_NAME={management_cluster_name}
      MC_POD_IP=$(kubectl get pod -A -o wide | grep k8s-bootstrapper | awk '{print $9}')
      curl http://${MC_POD_IP}:8888/api/v1/managementclusters | jq -r --arg name "$MC_NAME" '.[] | select(.clusterName == $name) | .id'

      Note: Replace {management_cluster_name} with your actual management cluster name.
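
      Putting the commands together, the ID can be captured in a variable for Step 3. A minimal sketch (the MC_ID variable name is just for illustration):

      # Capture the management cluster ID for the UPDATE statement in Step 3
      MC_ID=$(curl -s http://${MC_POD_IP}:8888/api/v1/managementclusters \
        | jq -r --arg name "$MC_NAME" '.[] | select(.clusterName == $name) | .id')
      echo "Management cluster ID: ${MC_ID}"

      # If MC_ID comes back empty, list the cluster names the bootstrapper knows about:
      # curl -s http://${MC_POD_IP}:8888/api/v1/managementclusters | jq -r '.[].clusterName'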

  3. Update the TCA-CP Database

    1. Access the PostgreSQL pod:
      kubectl exec -it -n tca-cp-cn postgres-0 -- bash

    2. Launch the PostgreSQL shell:
      psql -t caas_spoke

    3. Disable the pager (optional but recommended for large outputs):
      \pset pager off

    4. Update the base64-encoded kubeconfig in the database:

      UPDATE management_cluster_kube_configs
      SET val = '{valid_kubeconfig_base64_encoded_string}'
      WHERE id = '{management_cluster_id}';

      Note:

      • {valid_kubeconfig_base64_encoded_string}: Replace this with the base64-encoded kubeconfig obtained in Step 1.

      • {management_cluster_id}: Replace this with the ID retrieved in Step 2.
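
      After the update, confirm the change landed before retrying the node pool creation/customization. A minimal sketch, run from the TCA-CP node after exiting the psql shell (the length check is a sanity test only, not a vendor-documented step):

      # Confirm the row for this management cluster now holds the new value
      kubectl exec -n tca-cp-cn postgres-0 -- \
        psql -t caas_spoke -c "SELECT id, length(val) FROM management_cluster_kube_configs WHERE id = '{management_cluster_id}';"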