Error: "unable to install kapp-controller to bootstrap cluster" while deploying management cluster
search cancel

Error: "unable to install kapp-controller to bootstrap cluster" while deploying management cluster

book

Article ID: 426825

calendar_today

Updated On:

Products

VMware Tanzu Kubernetes Grid Management

Issue/Introduction

When attempting to deploy a new Tanzu Management Cluster it fails at the following step:

Writing configuration
Starting control-plane
Installing CNI
Installing StorageClass
Waiting 2m0s for control-plane = Ready
Ready after 26s
Bootstrapper created. Kubeconfig: /root/.kube-tkg/tmp/config_######  -----> Temporary kube config path

Kapp-controller configuration file: /tmp/###########
waiting for resource kapp-controller of type *v1.Deployment to be up and running
pods are not yet running for deployment 'kapp-controller' in namespace 'tkg-system', retrying

 

Environment

2.x

Cause

  • Logging in to the temporary bootstrap cluster and describing the pod shows that there is a certificate error

export /root/.kube-tkg/tmp/config_####### ---> kubeconfig path from the deployment failure error

kubectl describe pod -n tkg-system -l app=kapp-controller showed that the temporary cluster was unable to pull the image from the harbor registry due to certificate error

Warning
###############": failed to pull and unpack image "#############/tkg/packages/core/kapp-controller@sha#################": failed to resolve reference "#############/tkg/packages/core/kapp-controller@sha#################": failed to do request: Head "https://<harbor fqdn>/v2/tkg/packages/core/kapp-controller/manifests/sha256:######################": tls: failed to verify certificate: x509: certificate signed by unknown authority

  • The bootstrap cluster (the temporary Kind cluster) is trying to pull images from the harbor registry.
  • However, the Kind nodes do not trust the security certificate of the harbor.

Resolution

Option A

Update the harbor certificate into the bootstrap cluster configuration using the below steps

  1. Identify the harbor certificate and convert it into base 64 format

    cat /path/to/your/ca.crt | base64 -w 0

  2. Update the the harbor certificate to the TKG configuration (config.yaml) located at .config/tanzu/tkg/config.yaml or the specific configuration file passed with -f during deployment. 

    Paste the Base64 string  copied in Step 1 at the top and save the config file. 

    TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE:"<base 64 format of the certificate>".

  3. Identify and delete the old temporary cluster.

    Kind get clusters

    kind delete cluster --name <Cluster name>
     

Option B 

Update the TKG configuration (config.yaml) to skip the certificate validation.

  1. Open the config.yaml file located at .config/tanzu/tkg/config.yaml and add the below variable at the top and  save  the  config file.

    TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: true

  2. Identify and delete the old temporary cluster.

    Kind get clusters

    kind delete cluster --name <Cluster name>

Once following either Option A or Option B, retry the deployment again. 

 

Additional Information

If the above options do not resolve the issue, then check the environment variable for the machine where the terminal is open, and include the parameter TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE:"<base 64 format of the certificate>"to include the certificate or include the parameter TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: true to omit the certificate validation.

Note: The resolution implemented whether (Option A or option B ) to the config.yaml and to the environment variable must be the same.