Error: "failed to list cluster class: failed to render clusterclass" when creating vSphere Kubernetes Service clusters via Tanzu Mission Control.
Article ID: 436001
Products
VMware vSphere Kubernetes Service
Issue/Introduction
Creating vSphere Kubernetes Service (VKS) clusters fails from both Tanzu Mission Control (TMC) and the command-line interface (CLI).
Within the TMC user interface, the following error is displayed: Error: Failed to get management cluster classes: failed to list cluster class: failed to render clusterclass
When a cluster is created from the CLI, no worker virtual machines are deployed.
Additionally, core VKS components (capi-controller-manager, cert-manager, pinniped) remain stuck in the CrashLoopBackOff state.
To validate the status of the VKS component pods, run the following command: kubectl get pods -A | grep -v Running
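The command above also lists pods in other non-Running states, such as Completed, along with the header row. As a minimal sketch, the output can be narrowed to pods in CrashLoopBackOff only. The function name crashlooping_pods is a hypothetical helper, not part of kubectl or VKS, and it assumes the default column layout of kubectl get pods -A (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper (not part of kubectl or VKS).
# Reads `kubectl get pods -A` output on stdin and prints "namespace/pod"
# for every pod whose STATUS column (4th field) is CrashLoopBackOff.
crashlooping_pods() {
  awk 'NR > 1 && $4 == "CrashLoopBackOff" { print $1 "/" $2 }'
}

# Usage against a live Supervisor (requires kubectl access):
#   kubectl get pods -A | crashlooping_pods
```

If the output includes capi-controller-manager, cert-manager, or pinniped pods, the symptom described in this article is present.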
Attempts to pull logs from these system pods fail because of the API routing breakdown and return the following error: Get "https://<node_IP>:10250/...": dial tcp <node_IP>:10250: connect: no route to host
To confirm the routing failure, attempt to pull logs from one of the affected pods by running the following command: kubectl logs <pod_name> -n <namespace>
Environment
VMware vSphere Kubernetes Service
Cause
This issue can occur when the IP address allocated to a Supervisor VM in the vSphere UI differs from the IP address registered in the internal Kubernetes state. Because the internal Kubernetes state contains a stale IP address, kubelet traffic on port 10250 and DNS resolution for internal webhooks (runtime-extension-webhook-service) fail. This routing breakdown prevents new pods from being scheduled and causes the Cluster API (CAPI) controllers responsible for deploying new clusters to crash.
To verify the IP address registered in the internal Kubernetes state and check for discrepancies against the vSphere UI, execute the following command: kubectl get nodes -o wide
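The comparison against the vSphere UI can be sketched as a small check. The function name check_node_ip is a hypothetical helper, and the node name and expected IP address are values the operator supplies after reading them from the vSphere UI. The sketch assumes the default column layout of kubectl get nodes -o wide, where INTERNAL-IP is the sixth column.

```shell
# Hypothetical helper (not part of kubectl or VKS).
# Reads `kubectl get nodes -o wide` output on stdin and compares the
# INTERNAL-IP column (6th field) of one node with the IP address that the
# vSphere UI shows for the corresponding Supervisor VM.
# Exit status: 0 = match, 1 = mismatch, 2 = node not found.
check_node_ip() {
  node="$1"
  expected_ip="$2"
  awk -v node="$node" -v want="$expected_ip" '
    NR > 1 && $1 == node {
      found = 1
      if ($6 == want) {
        print "match: " node " " $6
      } else {
        print "MISMATCH: " node " registered=" $6 " vsphere=" want
        bad = 1
      }
    }
    END {
      if (!found) { print "node not found: " node; exit 2 }
      exit (bad ? 1 : 0)
    }'
}

# Usage against a live Supervisor (requires kubectl access):
#   kubectl get nodes -o wide | check_node_ip <node_name> <ip_from_vsphere_ui>
```

A MISMATCH line indicates that the internal Kubernetes state holds a stale IP address for that node.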
Resolution
Reboot the affected Supervisor virtual machine to clear the stale IP address registered in the internal Kubernetes state and restore API routing. Before rebooting, ensure that a Supervisor control plane backup is available.