How to use an existing kind cluster to deploy management clusters in air-gapped Tanzu Kubernetes Grid environments

Article ID: 316962


Products

Tanzu Kubernetes Grid

Issue/Introduction

Symptoms:
  • The environment where the Tanzu Kubernetes Grid (TKG) management cluster is being deployed is air-gapped or internet restricted.

  • Management cluster creation fails because components fail to start on the kind cluster.

  • You see messages similar to the following in the kubelet logs in the kind cluster:

Jul 21 10:34:16 tkg-kind-c3rvc9c2ebhs2blepc6g-control-plane kubelet[585]: E0721 10:34:16.734509     585 pod_workers.go:191] Error syncing pod 51f2dab2-264e-493e-9f91-812b81b761ba ("kube-proxy-zkcnz_kube-system(51f2dab2-264e-493e-9f91-812b81b761ba)"), skipping: failed to "StartContainer" for "kube-proxy" with CrashLoopBackOff: "back-off 5m0s restarting failed container=kube-proxy pod=kube-proxy-zkcnz_kube-system(51f2dab2-264e-493e-9f91-812b81b761ba)"
Jul 21 10:34:16 tkg-kind-c3rvc9c2ebhs2blepc6g-control-plane kubelet[585]: E0721 10:34:16.965954     585 manager.go:1123] Failed to create existing container: /docker/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e: failed to identify the read-write layer ID for container "9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e". - open /var/lib/docker/image/overlay2/layerdb/mounts/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e/mount-id: no such file or directory
Jul 21 10:34:16 tkg-kind-c3rvc9c2ebhs2blepc6g-control-plane kubelet[585]: E0721 10:34:16.967054     585 manager.go:1123] Failed to create existing container: /docker/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e/docker/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e: failed to identify the read-write layerID for container "9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e". - open /var/lib/docker/image/overlay2/layerdb/mounts/9f6a79717cf271bed7546c36a900c15d77e1c0513b553d2baecfc8d20c9ab17e/mount-id: no such file or directory

  • You see messages similar to the following in the kube-proxy logs in the kind cluster:

2021-07-21T10:58:39.962429341Z stderr F I0721 10:58:39.961983       1 node.go:172] Successfully retrieved node IP: 1##.##.#.#
2021-07-21T10:58:39.96247252Z stderr F I0721 10:58:39.962062       1 server_others.go:142] kube-proxy node IP is an IPv4 address (1##.##.#.#), assume IPv4 operation
2021-07-21T10:58:39.975832251Z stderr F W0721 10:58:39.975625       1 server_others.go:578] Unknown proxy mode "", assuming iptables proxy
2021-07-21T10:58:39.975853193Z stderr F I0721 10:58:39.975733       1 server_others.go:185] Using iptables Proxier.
2021-07-21T10:58:39.976165859Z stderr F I0721 10:58:39.976040       1 server.go:650] Version: v1.20.5+vmware.1
2021-07-21T10:58:39.976482573Z stderr F I0721 10:58:39.976419       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
2021-07-21T10:58:39.976491689Z stderr F F0721 10:58:39.976441       1 server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied

Note: You can view the logs from the kind cluster by using the kubectl logs command with the kubeconfig file located in the .kube-tkg/tmp folder.
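
As a rough sketch of the note above, the helpers below pick the most recently modified file in the .kube-tkg/tmp folder and pass it to kubectl. The function names are hypothetical, the folder is assumed to sit under the home directory, and the newest file in that folder is assumed to be the kubeconfig of the bootstrap kind cluster, which may not hold in every environment.

```shell
# Hypothetical helper: print the newest file in the given directory
# (defaults to ~/.kube-tkg/tmp, assumed to hold the bootstrap kubeconfig).
latest_kubeconfig() {
  dir="${1:-$HOME/.kube-tkg/tmp}"
  newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
  [ -n "$newest" ] && printf '%s/%s\n' "$dir" "$newest"
}

# Hypothetical helper: show the logs of a pod in the kind cluster.
# $1 = pod name, $2 = namespace (defaults to kube-system).
bootstrap_logs() {
  kubeconfig=$(latest_kubeconfig) || { echo "no kubeconfig found" >&2; return 1; }
  kubectl --kubeconfig "$kubeconfig" logs -n "${2:-kube-system}" "$1"
}
```

For example, bootstrap_logs kube-proxy-zkcnz would print the kube-proxy logs for the pod named in the kubelet messages above. Pod names differ per deployment and can be listed with kubectl get pods -n kube-system using the same kubeconfig.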

Environment

VMware Tanzu Kubernetes Grid Plus 1.x

Cause

Creating the kind cluster by following the documented steps fails in air-gapped environments because the CA certificate (encoded in the TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE variable) is not injected into the manually created bootstrap kind cluster.

 

Resolution


Create the kind cluster with a configuration that adds the Harbor registry CA certificate to the containerd configuration file.
 

  1. Create a kind configuration file similar to the following:

    kind: Cluster
    apiVersion: kind.x-k8s.io/v1alpha4
    name: tkg-kind
    nodes:
      - role: control-plane
        # This option mounts the host Docker registry certificate folder into
        # the control-plane node so that containerd can read the CA certificate.
        extraMounts:
          - containerPath: /etc/containerd/harbor1.corp.tanzu
            hostPath: /etc/docker/certs.d/harbor1.corp.tanzu
    containerdConfigPatches:
      - |-
        [plugins."io.containerd.grpc.v1.cri".registry.configs."harbor1.corp.tanzu".tls]
          ca_file = "/etc/containerd/harbor1.corp.tanzu/ca.crt"


Notes:

    • ca_file -- path, inside the kind container, to the CA certificate of the Harbor registry
    • hostPath -- path of the folder that holds the Harbor CA certificate on the bootstrap VM where the kind cluster is created
    • containerPath -- path where that folder is mounted in the kind container
    • Replace the registry name "harbor1.corp.tanzu" with your registry name
    • The kind cluster is named "tkg-kind" because crashd looks for a kind cluster with this name by default, which makes it easy to collect a crashd log bundle if the kind cluster fails to bootstrap.
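
Before creating the cluster, it may help to confirm that the certificate referenced by hostPath is actually present on the bootstrap VM. The following is a minimal sketch only: check_ca_cert is a hypothetical helper, and it assumes the certificate is stored in PEM format under the file name ca.crt, as the ca_file entry in the example configuration suggests.

```shell
# Hypothetical pre-flight check: confirm that the CA certificate under
# hostPath exists, is readable, and looks like a PEM-encoded certificate.
check_ca_cert() {
  cert="$1"
  if [ ! -r "$cert" ]; then
    echo "CA certificate not found or not readable: $cert" >&2
    return 1
  fi
  if ! grep -q -- '-----BEGIN CERTIFICATE-----' "$cert"; then
    echo "File is not PEM encoded: $cert" >&2
    return 1
  fi
  echo "CA certificate looks usable: $cert"
}
```

With the example configuration, the call would be check_ca_cert /etc/docker/certs.d/harbor1.corp.tanzu/ca.crt. The check only looks for the PEM header and does not validate the certificate chain.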

 

  2. Issue a command similar to the following to create the kind cluster:

kind create cluster --config kind.yml

Note: You can exec into the kind container if required and check if the file /etc/containerd/config.toml has been populated with the expected registry info and certificate data. Optionally, you can try pulling an image from the registry using a command similar to the following:

crictl pull <imagename>
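
The checks in the note above can be sketched as small wrappers around docker exec. kind names the control-plane node container after the cluster, so a cluster called tkg-kind yields tkg-kind-control-plane. The helper names are hypothetical, and the sketch assumes that Docker is the container runtime on the bootstrap VM and that grep and crictl are available in the node image.

```shell
# kind names the control-plane node container "<cluster-name>-control-plane".
# $1 = kind cluster name (defaults to tkg-kind).
kind_node_name() {
  printf '%s-control-plane\n' "${1:-tkg-kind}"
}

# Hypothetical helper: print the lines of the node's containerd configuration
# that mention the registry, plus two lines of context. $1 = registry host name.
show_registry_config() {
  docker exec "$(kind_node_name)" grep -F -A 2 "$1" /etc/containerd/config.toml
}

# Hypothetical helper: pull an image from inside the node to confirm that the
# registry certificate is trusted. $1 = image reference.
pull_in_node() {
  docker exec "$(kind_node_name)" crictl pull "$1"
}
```

For example, show_registry_config harbor1.corp.tanzu should list the tls section with the ca_file entry from the kind configuration file, and pull_in_node <imagename> should complete without a certificate error.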

  3. Proceed with creating the management cluster by using the tanzu management-cluster create command, similar to the following:

tanzu management-cluster create --file vsphere-mc.yaml --use-existing-bootstrap-cluster