VKS Machines are stuck in "Provisioned" state and describing the workload cluster shows that the kube-proxy DaemonSet is missing

Article ID: 431402

Products

VMware vSphere Kubernetes Service

Issue/Introduction

From the Supervisor, the following can be observed:

  • When checking via "kubectl get machine -n <namespace>", one or more machines are in the "Provisioned" state. Over time, these nodes are deleted after 120 minutes and re-created.
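
The check above can be narrowed to just the affected machines. The following is a minimal sketch, not an official procedure: the helper name provisioned_machines is hypothetical, and it assumes the Machine objects expose their phase under .status.phase, as Cluster API Machine objects normally do.

```shell
# Hypothetical helper: print the names of Machines whose phase is "Provisioned".
# Expects "NAME PHASE" pairs on stdin, one pair per line.
provisioned_machines() {
  awk '$2 == "Provisioned" { print $1 }'
}

# Example usage on the Supervisor (replace <namespace>); assumes .status.phase is populated:
#   kubectl get machine -n <namespace> --no-headers \
#     -o custom-columns=NAME:.metadata.name,PHASE:.status.phase | provisioned_machines
```

If the same machine names disappear and new ones show up in this list roughly every 120 minutes, that would match the delete and re-create behavior described above.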

After connecting to a control plane node of the workload cluster via SSH, the following can be observed:

  • Several system pods, such as "antrea-agent", "vsphere-csi-node" or "guest-cluster-cloud-provider", might be in CrashLoopBackOff state.
  • Accessing a ClusterIP might fail with "No route to host" or "Connection refused", which points towards Antrea not being operational:
    root@<workload-cluster-node>:~# kubectl get service kubernetes
    NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
    kubernetes   ClusterIP   ##.##.##.1   <none>        443/TCP    242d
    root@<workload-cluster-node>:~#

    root@<workload-cluster-node>:~# nc -vz ##.##.##.1 443
    nc: connect to ##.##.##.1 port 443 (tcp) failed: No route to host
    root@<workload-cluster-node>:~#
  • Checking the state of the cluster object via "kubectl describe cluster -n <namespace> <cluster-name>" shows the following error messages:
    [...]
    Message:               unable to retrieve kube-proxy daemonset from the guest cluster: daemonsets.apps "kube-proxy" not found
    Reason:                ProvisioningFailed
    [...]
    Message:               Addon KubeProxy is not ready: unable to retrieve kube-proxy daemonset from the guest cluster: daemonsets.apps "kube-proxy" not found,Addon Antrea is not ready: kapp: Error: Timed out waiting after 30s for resources:
    - apiservice/v1alpha1.stats.antrea.io (apiregistration.k8s.io/v1) cluster
    - apiservice/v1beta2.controlplane.antrea.io (apiregistration.k8s.io/v1) cluster
    - apiservice/v1beta1.system.antrea.io (apiregistration.k8s.io/v1) cluster
    - deployment/antrea-controller (apps/v1) namespace: kube-system
    - deployment/interworking (apps/v1) namespace: vmware-system-antrea
    - daemonset/antrea-agent (apps/v1) namespace: kube-system,Addon Secretgen-Controller is not ready: kapp: Error: Getting app: ...
  • Checking for the kube-proxy DaemonSet in the kube-system namespace shows that it does not exist:
    root@<workload-cluster-node>:~# kubectl get -n kube-system daemonset kube-proxy
    Error from server (NotFound): daemonsets.apps "kube-proxy" not found
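
The last two checks above can be scripted for repeated use. The following is a minimal sketch under stated assumptions: the function names check_kube_proxy and not_ready_addons are hypothetical, the first relies on kubectl exiting non-zero when the object cannot be retrieved, and the second assumes the condition messages keep the "Addon <name> is not ready" wording shown in the output above.

```shell
# Hypothetical helper: report whether the kube-proxy DaemonSet can be retrieved.
# A non-zero kubectl exit code is treated as "missing or not retrievable".
check_kube_proxy() {
  if kubectl get -n kube-system daemonset kube-proxy >/dev/null 2>&1; then
    echo "kube-proxy daemonset present"
  else
    echo "kube-proxy daemonset missing or not retrievable"
    return 1
  fi
}

# Hypothetical helper: list the addons reported as not ready.
# Reads "kubectl describe cluster" output on stdin and prints one addon name per line.
not_ready_addons() {
  grep -o 'Addon [^ ]* is not ready' | awk '{ print $2 }' | LC_ALL=C sort -u
}

# Example usage:
#   check_kube_proxy                                   (on a workload cluster node)
#   kubectl describe cluster -n <namespace> <cluster-name> | not_ready_addons   (on the Supervisor)
```

For the messages shown above, not_ready_addons would list Antrea, KubeProxy and Secretgen-Controller. The script only summarizes the symptoms and does not change anything in the cluster.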

Environment

vSphere with Kubernetes

Resolution

Please reach out to Broadcom Support for further assistance.

In addition, please collect a WCP log bundle and a workload cluster log bundle as soon as possible so that older logs are not lost. Instructions for collecting these logs can be found here: Gathering Logs for vSphere with Tanzu.