When you deploy a new vSphere Kubernetes Service (VKS) Guest Cluster, the deployment stalls and never completes. The kubeadmcontrolplane controller logs errors similar to the following:
"Failed to update kube-proxy daemonset" err="failed to determine if kube-proxy daemonset already exists: Get \"https://<VIP_IP>:6443/apis/apps/v1/namespaces/kube-system/daemonsets/kube-proxy?timeout=10s\": call timeout expired - error from a previous attempt: http2: client connection lost" controller="kubeadmcontrolplane" controllerGroup="controlplane.cluster.x-k8s.io" controllerKind="KubeadmControlPlane" KubeadmControlPlane="<NAMESPACE>/<CLUSTER_NAME>" namespace="<NAMESPACE>" name="<CLUSTER_NAME>" reconcileID="<RECONCILE_ID>" Cluster="<NAMESPACE>/<CLUSTER_NAME>"
E0518 08:56:47.583827 1 controller.go:347] "Reconciler error" err="failed to determine if kube-proxy daemonset already exists: Get \"https://<VIP_IP>:6443/apis/apps/v1/namespaces/kube-system/daemonsets/kube-proxy?timeout=10s\": call timeout expired - error from a previous attempt: http2: client connection lost"
This issue is caused by an IP allocation failure at the load balancer layer. The existing Service Engines within the designated Service Engine Group (SEG) are unable to acquire an IP address from the configured subnet.
Because the Service Engine cannot obtain a valid IP, the Virtual Service (which acts as the front-end VIP for the cluster API) fails to place. Consequently, the VKS deployment engine times out waiting for the Kubernetes API to become reachable via the load balancer, halting the cluster creation process.
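To confirm that the failure is at the VIP rather than inside the cluster, it can help to pull the VIP address out of the logged error and test whether it accepts TCP connections on the API port. The following is a minimal Python sketch and not part of any VMware or Avi tooling. The function names extract_vip and vip_is_reachable are hypothetical, and the sketch assumes the error line contains a URL of the form https://<VIP_IP>:6443/ as in the log above. A successful TCP connection only shows that something is listening on the VIP; it does not prove that the Kubernetes API is healthy.

```python
import re
import socket

# Matches the API server URL in the error text, with or without the
# escaped quotes that surround it in the controller log.
VIP_URL = re.compile(r'Get \\?"?https://([^:/\s"\\]+):(\d+)/')


def extract_vip(log_line):
    """Return (host, port) from a kube-proxy daemonset error line, or None."""
    match = VIP_URL.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def vip_is_reachable(host, port=6443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False
```

For example, passing a logged error line to extract_vip and then the resulting host and port to vip_is_reachable should return False while the Virtual Service is unplaced, assuming the machine running the check has a network path to the VIP subnet.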
Workaround:
To bypass the IP acquisition failure on the existing Service Engines and unblock the VKS deployment, create a new Service Engine Group. This forces the Avi Controller to instantiate a fresh Service Engine.
Step-by-Step Implementation:
Once the Virtual Service is successfully placed, the VKS deployment will automatically resume and complete.