When upgrading a VMware vSphere Kubernetes Service (VKS) cluster to version 3.6 (as part of a VCF 9.1 upgrade), the cluster reconciliation may fail if the environment has been transitioned from Single-Zone to Multi-Zone. The Cluster status reports TopologyReconciled: False.
Symptoms:
Running kubectl describe cluster <Cluster Name> returns the following:
Status:
  Conditions:
    Last Transition Time:  2026-01-08T17:16:34Z
    Message:               error computing the desired state of the Cluster topology: failed to apply patches: failed to generate patches for patch "default": failed to call extension handler "generate-patches.runtime-extension": got failure response
    Reason:                TopologyReconcileFailed
    Severity:              Error
    Status:                False
    Type:                  TopologyReconciled
The runtime-extension-controller-manager logs contain the following denial from the admission webhook:
2026-01-09T06:01:30 handler.go:129] "error during patch generation" err="unable to patch cluster with resolved KR data: admission webhook \"capi.validating.tanzukubernetescluster.run.tanzu.vmware.com\" denied the request: spec.topology.workers.machineDeployments[0].variables.volumes[0].storageClass: Invalid value: \"<StorageClass Name>\": StorageClass \"<StorageClass Name>\" uses \"Immediate\" binding mode; WaitForFirstConsumer (latebinding suffix) binding mode is required in multi-zone environment if no failureDomain is specified for MachineDeployment \"<Node Pool Name>\" at path spec.topology.workers.machineDeployments[0]"
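The TopologyReconciled condition described above can also be checked across many clusters at once. The following is a minimal sketch, assuming the Cluster objects have been exported as JSON (for example with kubectl get clusters -A -o json) and that conditions are listed under status.conditions as in the describe output. The function name and the sample data are hypothetical and used only for illustration.

```python
def failed_topology_clusters(cluster_list):
    """Return (namespace, name, message) for every Cluster whose
    TopologyReconciled condition has status "False".

    `cluster_list` is the parsed JSON of a Kubernetes List object.
    Assumption: conditions are found under status.conditions.
    """
    failed = []
    for item in cluster_list.get("items", []):
        meta = item.get("metadata", {})
        for cond in item.get("status", {}).get("conditions", []):
            if cond.get("type") == "TopologyReconciled" and cond.get("status") == "False":
                failed.append(
                    (meta.get("namespace"), meta.get("name"), cond.get("message", ""))
                )
    return failed


# Hypothetical sample data: one failing cluster and one healthy cluster.
sample = {
    "items": [
        {
            "metadata": {"namespace": "ns-a", "name": "cluster-1"},
            "status": {
                "conditions": [
                    {
                        "type": "TopologyReconciled",
                        "status": "False",
                        "reason": "TopologyReconcileFailed",
                        "message": "error computing the desired state of the Cluster topology",
                    }
                ]
            },
        },
        {
            "metadata": {"namespace": "ns-a", "name": "cluster-2"},
            "status": {
                "conditions": [{"type": "TopologyReconciled", "status": "True"}]
            },
        },
    ]
}

for namespace, name, message in failed_topology_clusters(sample):
    print(f"{namespace}/{name}: {message}")
```

With real data, the parsed output of the kubectl command would be passed in place of sample, for instance after loading it with json.load.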
VMware Cloud Foundation 9.1
VMware vSphere Kubernetes Service (VKS) 3.6
Supervisor configured with Multi-Zone support
Starting with VKS 3.6, enhanced validation is enforced via the Runtime Extension during cluster updates and upgrades. In a Multi-Zone setup, a StorageClass referenced by a node pool must use the WaitForFirstConsumer (WFFC) binding mode, unless the node pool specifies a failureDomain, so that volumes are provisioned in the correct availability zone. If a cluster was originally created in a Single-Zone environment with a StorageClass that uses "Immediate" binding, and the environment was subsequently converted to Multi-Zone, the patch generated during the upgrade fails this validation.
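The rule can be expressed as a short check, which may help when auditing node pools before an upgrade. This is only a sketch of the logic as stated in the webhook message, not the actual webhook implementation. The dictionary shape (name, failureDomain, volumes) is a simplified, hypothetical stand-in for the real Cluster spec, and binding_modes is assumed to be a mapping from StorageClass name to its volumeBindingMode.

```python
def immediate_binding_violations(machine_deployments, binding_modes, multi_zone=True):
    """Return (node_pool, storage_class) pairs that would be rejected.

    A node pool is flagged when the environment is multi-zone, the pool has
    no failureDomain, and one of its volumes references a StorageClass whose
    binding mode is "Immediate".
    """
    violations = []
    if not multi_zone:
        return violations  # the rule only applies to multi-zone environments
    for md in machine_deployments:
        if md.get("failureDomain"):
            continue  # pool is pinned to a zone, so the rule does not apply
        for volume in md.get("volumes", []):
            storage_class = volume["storageClass"]
            if binding_modes.get(storage_class) == "Immediate":
                violations.append((md["name"], storage_class))
    return violations


# Hypothetical StorageClass names and node pools for illustration.
binding_modes = {
    "gold": "Immediate",
    "gold-latebinding": "WaitForFirstConsumer",
}
pools = [
    {"name": "pool-a", "volumes": [{"storageClass": "gold"}]},
    {"name": "pool-b", "failureDomain": "zone-1", "volumes": [{"storageClass": "gold"}]},
    {"name": "pool-c", "volumes": [{"storageClass": "gold-latebinding"}]},
]

print(immediate_binding_violations(pools, binding_modes))
```

In this sample only pool-a is reported: pool-b sets a failureDomain and pool-c already uses a WaitForFirstConsumer StorageClass.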
To resolve this issue, the Cluster specification must be updated to meet Multi-Zone requirements. Note that these changes will trigger a rolling update of the node pools.
Identify affected clusters: Check for clusters with TopologyReconciled: False following the VKS 3.6 upgrade.
Modify Cluster Specification:
Option 1 (Recommended): Update the storageClass for the affected Node Pools to reference the StorageClass variant with the -latebinding suffix (which uses WaitForFirstConsumer binding).
Option 2: Explicitly define a failureDomain within the machineDeployments section of the Cluster spec for each node pool.
Apply Changes: Save the updated Cluster configuration. The Runtime Extension can then generate patches successfully, and the cluster proceeds with a rolling update to reconcile the new topology.
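The two options above can be sketched as small transformations on a node pool definition. This is an illustrative sketch only: the dictionary shape is a simplified, hypothetical stand-in for an entry under spec.topology.workers.machineDeployments, the function names are invented for this example, and the zone name is a placeholder. It also assumes the -latebinding variant is named by appending the suffix to the existing StorageClass name; confirm that such a StorageClass exists (for example with kubectl get storageclass) before editing the Cluster.

```python
import copy

LATEBINDING_SUFFIX = "-latebinding"


def use_latebinding(machine_deployment):
    """Option 1: return a copy whose volumes reference the -latebinding
    variant of each StorageClass. The input is left unmodified."""
    md = copy.deepcopy(machine_deployment)
    for volume in md.get("volumes", []):
        storage_class = volume["storageClass"]
        if not storage_class.endswith(LATEBINDING_SUFFIX):
            volume["storageClass"] = storage_class + LATEBINDING_SUFFIX
    return md


def pin_failure_domain(machine_deployment, zone):
    """Option 2: return a copy with an explicit failureDomain set."""
    md = copy.deepcopy(machine_deployment)
    md["failureDomain"] = zone
    return md


# Hypothetical node pool for illustration.
pool = {"name": "pool-a", "volumes": [{"storageClass": "gold"}]}

print(use_latebinding(pool))
print(pin_failure_domain(pool, "zone-1"))
```

Either result corresponds to a change that would be made in the actual Cluster specification; as noted above, applying it triggers a rolling update of the affected node pools.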
This validation is intended to prevent persistent volume binding failures that occur when an "Immediate" volume is provisioned in a zone that does not match the node where the pod is eventually scheduled.