The environment where the Tanzu Kubernetes Grid (TKG) management cluster is being deployed is air-gapped or internet-restricted.
Management cluster creation fails because components fail to start on the bootstrap kind cluster.
You see messages similar to the following in the kubelet and kube-proxy logs in the kind cluster:
Jul 21 10:34:16 tkg-kind-c3rvc9c2ebhs2blepc6g-control-plane kubelet[585]: E0721 10:34:16.734509 585 pod_workers.go:191] Error syncing pod 51f2dab2-264e-493e-9f91-812b81b761ba ("kube-proxy-zkcnz_kube-system(51f2dab2-264e-493e-9f91-812b81b761ba)"), skipping: failed to "StartContainer" for "kube-proxy" with CrashLoopBackOff: "back-off 5m0s restarting failed container=kube-proxy pod=kube-proxy-zkcnz_kube-system(51f2dab2-264e-493e-9f91-812b81b761ba)"
Jul 21 10:34:16 tkg-kind-c3rvc9c2ebhs2blepc6g-control-plane kubelet[585]: E0721 10:34:16.965954 585 manager.go:1123] Failed to create existing container: /docker/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e: failed to identify the read-write layer ID for container "9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e". - open /var/lib/docker/image/overlay2/layerdb/mounts/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e/mount-id: no such file or directory
Jul 21 10:34:16 tkg-kind-c3rvc9c2ebhs2blepc6g-control-plane kubelet[585]: E0721 10:34:16.967054 585 manager.go:1123] Failed to create existing container: /docker/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e/docker/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e: failed to identify the read-write layer ID for container "9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e". - open /var/lib/docker/image/overlay2/layerdb/mounts/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e/mount-id: no such file or directory
2021-07-21T10:58:39.962429341Z stderr F I0721 10:58:39.961983 1 node.go:172] Successfully retrieved node IP: 1##.##.#.#
2021-07-21T10:58:39.96247252Z stderr F I0721 10:58:39.962062 1 server_others.go:142] kube-proxy node IP is an IPv4 address (1##.##.#.#), assume IPv4 operation
2021-07-21T10:58:39.975832251Z stderr F W0721 10:58:39.975625 1 server_others.go:578] Unknown proxy mode "", assuming iptables proxy
2021-07-21T10:58:39.975853193Z stderr F I0721 10:58:39.975733 1 server_others.go:185] Using iptables Proxier.
2021-07-21T10:58:39.976165859Z stderr F I0721 10:58:39.976040 1 server.go:650] Version: v1.20.5+vmware.1
2021-07-21T10:58:39.976482573Z stderr F I0721 10:58:39.976419 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
2021-07-21T10:58:39.976491689Z stderr F F0721 10:58:39.976441 1 server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied
Note: You can view the logs from the kind cluster by using the kubectl logs command with the kubeconfig file located in the .kube-tkg/tmp folder.
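As a minimal sketch of the note above, the following shell helper wraps kubectl logs with an explicit kubeconfig. The function name kind_pod_logs is hypothetical, the kubeconfig file name under .kube-tkg/tmp varies per run and is left as a placeholder, and the kube-system namespace is assumed from the kube-proxy pod name in the messages above.

```shell
# Hypothetical helper (not part of TKG): fetch logs for a kube-system pod
# in the bootstrap kind cluster.
#   $1 = path to the kubeconfig file found under ~/.kube-tkg/tmp
#   $2 = pod name, for example kube-proxy-zkcnz
kind_pod_logs() {
    kubectl --kubeconfig "$1" logs -n kube-system "$2"
}

# Example call (replace the placeholder with the actual file name):
# kind_pod_logs "$HOME/.kube-tkg/tmp/<kubeconfig-file>" kube-proxy-zkcnz
```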
Creating the kind cluster by following the documented steps fails in air-gapped environments because the CA certificate (encoded in the TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE variable) is not injected into the manually created bootstrap kind cluster.
Create the kind cluster with a configuration that adds the Harbor registry CA certificate to the containerd config file. For example, create a kind.yml file with content similar to the following:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: tkg-kind
nodes:
- role: control-plane
  # This option mounts the host docker registry folder into
  # the control-plane node, allowing containerd to access them.
  extraMounts:
  - containerPath: /etc/containerd/harbor1.corp.tanzu
    hostPath: /etc/docker/certs.d/harbor1.corp.tanzu
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.configs."harbor1.corp.tanzu".tls]
    ca_file = "/etc/containerd/harbor1.corp.tanzu/ca.crt"
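The extraMounts entry in this configuration only helps if the hostPath directory on the bootstrap VM already contains ca.crt. The following is a minimal sketch for populating it, assuming the certificate is available as a base64-encoded string (as in TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE) and that a base64 utility supporting -d is installed. The function name write_registry_ca is hypothetical.

```shell
# Hypothetical helper (not part of TKG): decode a base64-encoded CA
# certificate and write it to <dir>/ca.crt.
#   $1 = base64 string, for example "$TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE"
#   $2 = target directory, for example /etc/docker/certs.d/harbor1.corp.tanzu
write_registry_ca() {
    mkdir -p "$2" || return 1
    printf '%s' "$1" | base64 -d > "$2/ca.crt" || return 1
}

# Example call (writing under /etc/docker usually requires root):
# write_registry_ca "$TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE" /etc/docker/certs.d/harbor1.corp.tanzu
```

If openssl is available, running openssl x509 -noout -subject -in on the resulting ca.crt is one way to confirm that the file decoded to a valid certificate.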
Notes:
ca_file -- CA certificate of the Harbor registry
hostPath -- path of the Harbor CA certificate on the bootstrap VM where the kind cluster is created
containerPath -- path of the Harbor CA certificate on the kind container
Replace "harbor1.corp.tanzu" with your registry name.
Name the cluster "tkg-kind" because, if this kind cluster fails to bootstrap, it is easier to collect a crashd log bundle: by default, crashd looks for the kind cluster with this name.
Create the kind cluster from this configuration file:
kind create cluster --config kind.yml
Note: You can exec into the kind container if required and check whether the file /etc/containerd/config.toml has been populated with the expected registry info and certificate data. Optionally, you can try pulling an image from the registry using a command similar to the following:
crictl pull <imagename>
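The checks in the note above can be sketched as two shell helpers run from the bootstrap VM. This assumes Docker is the container runtime hosting kind and relies on kind naming the control-plane node container <cluster-name>-control-plane. Both function names are hypothetical.

```shell
# Hypothetical helpers (not part of TKG or kind).

# Show the registry section that kind patched into the containerd config.
#   $1 = kind cluster name, for example tkg-kind
#   $2 = registry host name, for example harbor1.corp.tanzu
check_registry_config() {
    docker exec "$1-control-plane" grep -A 2 "$2" /etc/containerd/config.toml
}

# Try pulling an image from inside the kind node using crictl.
#   $1 = kind cluster name
#   $2 = full image reference in your registry
pull_test_image() {
    docker exec "$1-control-plane" crictl pull "$2"
}

# Example calls:
# check_registry_config tkg-kind harbor1.corp.tanzu
# pull_test_image tkg-kind <imagename>
```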
Create the management cluster on the existing kind cluster by running the tanzu management-cluster create command, similar to the following:
tanzu management-cluster create --file vsphere-mc.yaml --use-existing-bootstrap-cluster