kubectl "Unable to connect to the server" errors due to kube-vip pods not running in TKG cluster's Control Plane
Article ID: 369747

Products

VMware Tanzu Kubernetes Grid

Issue/Introduction

  • kube-vip runs as a static pod on each of the TKG cluster's Control Plane nodes. The kube-vip pods assign a cluster VIP address to one of the Control Plane nodes. By default, this VIP address is the one used by kubectl and internal TKG components to access cluster resources.
  • Static pod manifests can be found in /etc/kubernetes/manifests/kube-vip.yaml inside all Control Plane nodes.
  • If, for any reason, the kube-vip pods aren't running, the Control Plane will lose its VIP address and default kubectl commands will return "Unable to connect to the server" errors as below:

    $ kubectl get node
    Unable to connect to the server: dial tcp <VIP_ADDRESS>:6443: i/o timeout

    or

    $ kubectl get node
    Unable to connect to the server: dial tcp <VIP_ADDRESS>:6443: connect: no route to host

  • Common kube-vip pod failures arise in air-gapped environments where the image reference is changed from the custom registry back to the default projects.registry.vmware.com location. Image pull failures report the following in the kubelet logs and when describing the pod (commands to help confirm this condition are shown after this list):

    err: "failed to \"StartContainer\" for \"kube-vip\" with ImagePullBackOff: \"Back-off pulling image \\\"projects.registry.vmware.com/tkg/kube-vip:v0.5.12_vmware.1\\\"\"

  • Additionally, the Management Cluster may experience issues managing the impacted Workload Cluster, as ClusterAPI components make use of the Workload Cluster's VIP to perform certain operations.
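
To help confirm this condition from an SSH session on a Control Plane node, checks such as the following can be used (an illustrative sketch; container and image names can vary between TKG versions):

    $ sudo crictl ps -a | grep kube-vip                                    # is the kube-vip container running or exited?
    $ sudo journalctl -u kubelet --no-pager | grep -i kube-vip | tail -20  # kubelet logs for kube-vip image pull errors
    $ sudo grep 'image:' /etc/kubernetes/manifests/kube-vip.yaml           # which registry the static pod manifest points at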

Cause

There can be multiple reasons why kube-vip pods are not running in the Control Plane nodes.

This condition is most commonly encountered in air-gapped environments and appears because the /etc/kubernetes/manifests/kube-vip.yaml manifest is configured to pull its image from the default projects.registry.vmware.com registry instead of the private registry configured during cluster creation.

Resolution

For the most commonly encountered kube-vip pod failure condition noted above, editing the image: line in the /etc/kubernetes/manifests/kube-vip.yaml manifest so that the image is pulled from the correct custom image repository should bring the kube-vip pods back up.
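
As a minimal illustrative sketch (the registry path below is an example matching the placeholders used later in this article; back up the manifest to a location outside /etc/kubernetes/manifests first), the registry portion of the image reference can be rewritten with sed on each Control Plane node:

    $ sudo cp -p /etc/kubernetes/manifests/kube-vip.yaml /root/kube-vip.yaml_bkp
    $ sudo sed -i 's|projects.registry.vmware.com/tkg|<CUSTOM_REGISTRY_URL>.fqdn.com/repository/tkg|' /etc/kubernetes/manifests/kube-vip.yaml

The kubelet watches the static pod manifest directory and recreates the kube-vip pod automatically once the file changes.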

 

If modifying /etc/kubernetes/manifests/kube-vip.yaml doesn't correct the ImagePullBackOff errors, it's possible the image already present on the node (pulled from the custom registry) will need to be tagged with the default image repository name so that the kubelet can find it. Use the following commands on the Control Plane node to tag the existing custom-registry image with the default name:

 

    1. List the existing image:

      $ ctr -n=k8s.io image ls | grep vip
      <CUSTOM_REGISTRY_URL>.fqdn.com/repository/tkg/kube-vip:v0.5.12_vmware.1

    2. Tag the existing image:

      $ ctr -n=k8s.io image tag <CUSTOM_REGISTRY_URL>.fqdn.com/repository/tkg/kube-vip:v0.5.12_vmware.1 projects.registry.vmware.com/tkg/kube-vip:v0.5.12_vmware.1


      This should allow the image pull to succeed and bring the kube-vip pod back up; a quick verification is shown below.
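
To verify the result (illustrative; the names follow the example above), list the images again and confirm that the kube-vip container starts:

    $ ctr -n=k8s.io image ls | grep vip     # both the custom-registry and default names should now appear
    $ crictl ps --name kube-vip             # the kube-vip container should be running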

If access to kubectl commands is required urgently, refer to the Additional Information section below for steps to restore kubectl access while investigating kube-vip functionality.

 

 

Additional Information

The process below focuses on restoring the ability to run kubectl commands, not on troubleshooting the underlying kube-vip issue.

 

In order to be able to run kubectl commands even if the cluster VIP is not assigned, you can update the kubeconfig file's clusters.cluster.server field with one of the existing eth0 vNIC IP addresses assigned to a Control Plane node.
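
Before editing, you can confirm which address the kubeconfig currently points to (a quick local check against the default kubeconfig; it does not require a connection to the cluster, and the output below is illustrative):

$ kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
https://<VIP_ADDRESS>:6443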

For example, in the case below the Control Plane node has two IPv4 addresses assigned to eth0, <>.24 and <>.9.

root@workload-<cluster_name>-control-plane-lmn8b [ ~ ]# ip a s
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc prio state UP group default qlen 1000
    inet <>.24/26 metric 1024 brd <>.63 scope global dynamic eth0
    inet <>.9/32 scope global eth0

<>.24 corresponds to the vSphere VM's vNIC and <>.9 is the cluster VIP address.

 

If the kube-vip pods go down, the <>.9 VIP address disappears:

root@workload-<cluster_name>-control-plane-lmn8b [ ~ ]# ip a s
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc prio state UP group default qlen 1000
    inet <>.24/26 metric 1024 brd <>.63 scope global dynamic eth0

 

At this stage, running kubectl commands will return:

$ kubectl get node
Unable to connect to the server: dial tcp <>.9:6443: i/o timeout

or

$ kubectl get node
Unable to connect to the server: dial tcp <>.9:6443: connect: no route to host

 

Workaround

 

First, check that the etcd cluster is healthy.

Inside a Control Plane node, execute:

$ sudo -i
# alias etcdctl="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/*/fs/usr/local/bin/etcdctl --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key --cacert /etc/kubernetes/pki/etcd/ca.crt"
# etcdctl member list -w table
# etcdctl endpoint health --cluster=true -w table

 

Example output from a healthy cluster with 3 Control Plane nodes:

# etcdctl member list -w table
+------------------+---------+--------------------------------------+------------------------------------------+------------------------------------------+------------+
|        ID        | STATUS  |                 NAME                 |                PEER ADDRS                |               CLIENT ADDRS               | IS LEARNER |
+------------------+---------+--------------------------------------+------------------------------------------+------------------------------------------+------------+
| 17f206fd866fdab2 | started | d5e989cf-2242-44b2-bca1-d922d1627543 | https://master-0.etcd.cfcr.internal:2380 | https://master-0.etcd.cfcr.internal:2379 |      false |
| 1958063b94f7906b | started | e9753a70-ba7f-43f6-b3e1-b0030290a977 | https://master-1.etcd.cfcr.internal:2380 | https://master-1.etcd.cfcr.internal:2379 |      false |
| 96d74f332197fd97 | started | 6e664768-fbf6-424a-b808-b4b7bb3c7a12 | https://master-2.etcd.cfcr.internal:2380 | https://master-2.etcd.cfcr.internal:2379 |      false |
+------------------+---------+--------------------------------------+------------------------------------------+------------------------------------------+------------+ 

# etcdctl endpoint health --cluster=true -w table

+----------------------------+--------+-------------+-------+
|          ENDPOINT          | HEALTH |    TOOK     | ERROR |
+----------------------------+--------+-------------+-------+
| https://10.xxx.xx.xxx:2379 |   true | 15.725849ms |       |
| https://10.xxx.xx.xxx:2379 |   true | 17.235013ms |       |
| https://10.xxx.xx.xxx:2379 |   true | 18.253567ms |       |
+----------------------------+--------+-------------+-------+
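
If the etcdctl path used in the alias above does not resolve on a given node (the overlayfs snapshot ID varies per node), the etcd container status can at least be confirmed with crictl; this is an illustrative fallback, not a replacement for the health check above:

# crictl ps --name etcd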


If using kubectl from an external client/jumpbox, edit $HOME/.kube/config and replace <>.9 (or its corresponding FQDN) with the existing vNIC IP, <>.24.

Example:

$ cp -p $HOME/.kube/config ./config_bkp
$ vim $HOME/.kube/config
- cluster:
    certificate-authority-data: <>
    server: https://<>.24:6443
$ kubectl get node
NAME                                        STATUS     ROLES           AGE   VERSION
workload-<cluster_name>-control-plane-lmn8b   NotReady   control-plane   18d   v1.28.7+vmware.1
workload-<cluster_name>-md-0-c2wkk-cm5j9      NotReady   <none>          18d   v1.28.7+vmware.1

Note: the nodes' status is NotReady because kubelet is not able to communicate with the kube-apiserver, as it's trying to use the cluster VIP.

 

If using kubectl from an SSH session on a Control Plane node, edit /etc/kubernetes/admin.conf and replace <>.9 (or its corresponding FQDN) with the existing vNIC IP, <>.24.

Example:

$ cp -p /etc/kubernetes/admin.conf ./admin.conf_bkp
$ vim /etc/kubernetes/admin.conf
- cluster:
    certificate-authority-data: <>
    server: https://<>.24:6443
$ kubectl get node --kubeconfig /etc/kubernetes/admin.conf
NAME                                        STATUS     ROLES           AGE   VERSION
workload-<cluster_name>-control-plane-lmn8b   NotReady   control-plane   18d   v1.28.7+vmware.1
workload-<cluster_name>-md-0-c2wkk-cm5j9      NotReady   <none>          18d   v1.28.7+vmware.1

Note: the nodes' status is NotReady because kubelet is not able to communicate with the kube-apiserver, as it's trying to use the cluster VIP.
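
Once the kube-vip pods are running again and the cluster VIP is reassigned, revert the kubeconfig changes from the backups taken above, for example:

$ cp -p ./config_bkp $HOME/.kube/config                 # on the external client/jumpbox
$ cp -p ./admin.conf_bkp /etc/kubernetes/admin.conf     # on the Control Plane node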