vSphere Kubernetes Cluster in TopologyReconcileFailed state - error computing the desired state of the Cluster topology

Article ID: 389259

Products

VMware vSphere 7.0 with Tanzu
vSphere with Tanzu
VMware vSphere Kubernetes Service
Tanzu Kubernetes Runtime

Issue/Introduction

A vSphere Kubernetes cluster is in the TopologyReconcileFailed error state.

When an upgrade is initiated while the cluster is in this state, the upgrade becomes stuck and cannot progress.

 

While connected to the Supervisor cluster context, the following symptoms are present:

  • Describing the cluster object shows a condition similar to the following:
    • kubectl describe cluster <cluster name> -n <cluster namespace>
    • message: 'error computing the desired state of the Cluster topology: failed to apply patches'
      reason: TopologyReconcileFailed
      severity: Error
      status: "False"
      type: TopologyReconciled

  • The full error message resembles one of the following, where values enclosed in <> will vary:
    • 'error computing the desired state of the Cluster topology: failed to apply patches: failed to generate patches for patch "nodeLabels": failed to generate JSON patches for "<Template>": failed to calculate value for template: failed to render template: "run.tanzu.vmware.com/tkr={{ index (index .TKR_DATA .builtin.controlPlane.version).labels \"run.tanzu.vmware.com/tkr\"}},run.tanzu.vmware.com/kubernetesDistributionVersion={{ index (index .TKR_DATA.builtin.controlPlane.version).labels \"run.tanzu.vmware.com/tkr\" }}\n": template: tpl:1:28: executing "tpl" at <index (index .TKR_DATA .builtin.<nodeObject>.version).labels "run.tanzu.vmware.com/tkr">: error calling index: index of untyped nil'

    • error computing the desired state of the Cluster topology: failed to apply patches: failed to generate patches for patch "nodeLabels": [failed to generate JSON patches for item with uid "<uid>": failed to calculate value for template: failed to render template: "run.tanzu.vmware.com/tkr={{ index (index .TKR_DATA .builtin.<nodeComponent>.version).labels \"run.tanzu.vmware.com/tkr\" }}

  • If a vSphere Kubernetes cluster's upgrade was initiated and appears stuck:
    • The cluster shows the desired TKR version:
      • kubectl describe cluster <cluster name> -n <cluster namespace>

      • For non-classy clusters, also confirm the version shown on the TKC object:
        • kubectl get tkc <cluster name> -n <cluster namespace>

    • The kubeadmcontrolplane (kcp), machinedeployment (md), and machine objects for the stuck vSphere Kubernetes cluster still show the previous TKR version:
      • kubectl get kcp,md,machine -n <cluster namespace>
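
The checks above can be narrowed to the relevant fields with kubectl's built-in jsonpath and custom-columns output. The following is a minimal sketch, not an official procedure. The function names are hypothetical, and it assumes that the cluster is topology-managed (so the desired version is at .spec.topology.version) and that the kcp and md objects carry the standard Cluster API label cluster.x-k8s.io/cluster-name. All commands are read-only.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of any VMware tooling.
# Run while connected to the Supervisor cluster context.

# Print status, reason, and message of the TopologyReconciled condition.
# Usage: topology_condition <cluster name> <cluster namespace>
topology_condition() {
  local name="$1" namespace="$2"
  kubectl get cluster "$name" -n "$namespace" \
    -o jsonpath='{range .status.conditions[?(@.type=="TopologyReconciled")]}{.status}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}

# Print the version requested in the cluster topology, followed by the
# versions currently set on the kcp and md objects of the same cluster.
# Assumption: the objects carry the cluster.x-k8s.io/cluster-name label.
# Usage: compare_versions <cluster name> <cluster namespace>
compare_versions() {
  local name="$1" namespace="$2"
  local selector="cluster.x-k8s.io/cluster-name=${name}"
  printf 'desired\t'
  kubectl get cluster "$name" -n "$namespace" \
    -o jsonpath='{.spec.topology.version}{"\n"}'
  kubectl get kcp -n "$namespace" -l "$selector" --no-headers \
    -o custom-columns='KIND:.kind,NAME:.metadata.name,VERSION:.spec.version'
  kubectl get md -n "$namespace" -l "$selector" --no-headers \
    -o custom-columns='KIND:.kind,NAME:.metadata.name,VERSION:.spec.template.spec.version'
}
```

For example, topology_condition <cluster name> <cluster namespace> prints one tab-separated line for the condition, and compare_versions <cluster name> <cluster namespace> lists the desired version next to the versions on the kcp and md objects. A desired version that differs from the kcp and md versions is consistent with the stuck upgrade described above.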

Environment

vSphere 7.0 with Tanzu
 
vSphere 8.0 with Tanzu
 
This issue can occur on a vSphere Supervisor cluster regardless of whether it is managed by Tanzu Mission Control (TMC).

Cause

The TKR_DATA section of the affected vSphere Kubernetes cluster object is misconfigured or missing entries.

The TopologyReconcileFailed state can also occur if one of the node components is incorrectly configured.

Either condition results from a cluster upgrade that did not properly populate TKR_DATA, or from a manual misconfiguration of the affected node component.
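
To see what the cluster object currently holds, TKR_DATA can be printed directly. The sketch below is an illustration under two assumptions: that TKR_DATA is stored as a topology variable under .spec.topology.variables (where Cluster API keeps ClusterClass variables), and that its entries are keyed by version string, as the "index of untyped nil" template error suggests. The function names are hypothetical and the commands are read-only; the key check is a plain text match, not a full validation.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of any VMware tooling.

# Print the raw value of the TKR_DATA topology variable as JSON.
# Assumption: TKR_DATA lives under .spec.topology.variables.
# Usage: show_tkr_data <cluster name> <cluster namespace>
show_tkr_data() {
  local name="$1" namespace="$2"
  kubectl get cluster "$name" -n "$namespace" \
    -o jsonpath='{.spec.topology.variables[?(@.name=="TKR_DATA")].value}'
}

# Report whether TKR_DATA mentions the version requested in the topology.
# Heuristic: looks for the quoted version string anywhere in the JSON.
# Usage: check_tkr_data_entry <cluster name> <cluster namespace>
check_tkr_data_entry() {
  local name="$1" namespace="$2" version data
  version="$(kubectl get cluster "$name" -n "$namespace" \
    -o jsonpath='{.spec.topology.version}')"
  data="$(show_tkr_data "$name" "$namespace")"
  if [ -z "$version" ]; then
    echo "no topology version found on the cluster object"
  elif [ -z "$data" ]; then
    echo "TKR_DATA variable not found"
  elif printf '%s' "$data" | grep -qF -- "\"${version}\""; then
    echo "TKR_DATA has an entry for ${version}"
  else
    echo "TKR_DATA has no entry for ${version}"
  fi
}
```

A "no entry" or "not found" result matches the cause described above. Worker node pools may also carry their own variable overrides, which this sketch does not inspect.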

Resolution

The TKR_DATA section of the affected cluster's cluster YAML or the misconfigured node component will need to be corrected.

Once the cause is corrected, the TopologyReconcileFailed error should clear and the cluster YAML will show the TopologyReconciled condition with status "True".
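
After the correction has been made, the condition can be confirmed from the Supervisor cluster context. A minimal sketch, assuming the standard kubectl wait command is available; the function name and the default timeout of 300s are illustrative choices, not values from this article.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name and default timeout are illustrative.

# Block until the TopologyReconciled condition reports status "True",
# or return non-zero when the timeout expires.
# Usage: wait_topology_reconciled <cluster name> <cluster namespace> [timeout]
wait_topology_reconciled() {
  local name="$1" namespace="$2" timeout="${3:-300s}"
  kubectl wait "cluster/${name}" -n "$namespace" \
    --for=condition=TopologyReconciled --timeout="$timeout"
}
```

If the command times out, the condition is still not "True" and the message shown by kubectl describe cluster should be reviewed again.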

Please reach out to VMware by Broadcom Technical Support, referencing this KB article, to resolve this issue.