Error: "failed get cert manager components: failed to list api resources: action failed after 9 attempts: unable to retrieve the complete list of server APIs: data.packaging.carvel.dev/v1alpha1" occurs during Kubernetes cluster upgrade


Article ID: 420501



Products

VMware Cloud Director

Issue/Introduction

  • Upgrading the Kubernetes cluster components fails when running ./upgrade_cluster_components.sh as documented in Upgrade Kubernetes Components in
  • The following error is observed:

    Error: failed get cert manager components: failed to list api resources: action failed after 9 attempts: unable to retrieve the complete list of server APIs: data.packaging.carvel.dev/v1alpha1: the server is currently unable to handle the request.

  • Attempts to get PackageInstall (pkgi) resources report errors:

    root@<hostname>:~/cluster-upgrade-script# kubectl get pkgi -A
    E1121 12:34:00.433751  248607 memcache.go:287] "Unhandled Error" err="couldn't get resource list for data.packaging.carvel.dev/v1alpha1: the server is currently unable to handle the request"
    E1121 12:34:00.439855  248607 memcache.go:121] "Unhandled Error" err="couldn't get resource list for data.packaging.carvel.dev/v1alpha1: the server is currently unable to handle the request"
    E1121 12:34:00.446123  248607 memcache.go:121] "Unhandled Error" err="couldn't get resource list for data.packaging.carvel.dev/v1alpha1: the server is currently unable to handle the request"
    E1121 12:34:00.452623  248607 memcache.go:121] "Unhandled Error" err="couldn't get resource list for data.packaging.carvel.dev/v1alpha1: the server is currently unable to handle the request"
    NAMESPACE    NAME                   PACKAGE NAME                            PACKAGE VERSION                 DESCRIPTION                                                                           AGE
    tkg-system   antrea                 antrea.tanzu.vmware.com                 1.9.0+vmware.2-tkg.1-advanced   Reconcile failed: the server is currently unable to handle the request (get pack...   400d
    tkg-system   metrics-server         metrics-server.tanzu.vmware.com         0.6.2+vmware.1-tkg.2            Reconcile failed: the server is currently unable to handle the request (get pack...   400d
    tkg-system   secretgen-controller   secretgen-controller.tanzu.vmware.com   0.11.2+vmware.1-tkg.3           Reconcile failed: the server is currently unable to handle the request (get pack...   400d
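
The failing API group can also be spotted from the APIService objects themselves. The following is a minimal sketch, assuming the default table layout of `kubectl get apiservices` (NAME, SERVICE, AVAILABLE, AGE). The function name `unavailable_apiservices` is hypothetical and is not part of the upgrade script.

```shell
# Hypothetical helper: print the name of every APIService whose AVAILABLE
# column is not "True". Reads `kubectl get apiservices` table output on stdin.
unavailable_apiservices() {
  # NR > 1 skips the header row; $3 is the AVAILABLE column.
  awk 'NR > 1 && $3 != "True" { print $1 }'
}

# Usage (run against the affected cluster):
#   kubectl get apiservices | unavailable_apiservices
```

In the situation described here, v1alpha1.data.packaging.carvel.dev would be expected to appear in the output.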


Environment

VMware Container Service Extension 4.2.3
VMware Cloud Director 10.6.1.x

Cause

The certificate of the packaging APIService, "v1alpha1.data.packaging.carvel.dev" in this case, has expired. These certificates are issued by kapp-controller. The kapp-controller pod manages and reconciles the packages installed in a cluster, so if it has a problem, ./upgrade_cluster_components.sh cannot progress.

To confirm this, run the following command and check the validity dates in the output:

     kubectl get apiservice v1alpha1.data.packaging.carvel.dev -o jsonpath='{.spec.caBundle}' | base64 -d | openssl x509 -text -noout
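
Reading the full `-text` dump by eye is easy to get wrong, so the check can be reduced to a single verdict. The sketch below uses the `-enddate` and `-checkend` options of `openssl x509`. The function name `cert_expiry_status` is hypothetical, and only the first certificate in the bundle is examined.

```shell
# Hypothetical helper: read a PEM certificate on stdin, print its notAfter
# date, and report whether it has already expired.
cert_expiry_status() {
  pem=$(cat)
  # Print the expiry date; return 2 if the input is not a parsable certificate.
  printf '%s\n' "$pem" | openssl x509 -noout -enddate || return 2
  # -checkend 0 exits non-zero when the certificate is past its notAfter date.
  if printf '%s\n' "$pem" | openssl x509 -noout -checkend 0 >/dev/null; then
    echo "certificate is still valid"
  else
    echo "certificate has expired"
    return 1
  fi
}

# Usage:
#   kubectl get apiservice v1alpha1.data.packaging.carvel.dev \
#     -o jsonpath='{.spec.caBundle}' | base64 -d | cert_expiry_status
```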

Resolution

Restarting the kapp-controller pod renews the certificate. Proceed as follows:

  1. Connect to the Kubernetes container cluster if you have not already done so.
    1. Open the Cloud Director UI and select "More -> Kubernetes Container Cluster."
    2. Select the specific TKG Cluster and click the "Download Kube Config" action.
    3. Place the config file on a machine with kubectl and access to the TKG Cluster Endpoint (Load Balancer VIP).
    4. Export the kubeconfig environment variable. Example:

      export KUBECONFIG="/root/kubeconfig<cluster-name>.conf"

    5. Validate you are connected to the cluster by listing pods with kubectl.

      kubectl get pods -A

  2. Confirm that the certificate has expired by checking its validity dates:

    kubectl get apiservice v1alpha1.data.packaging.carvel.dev -o jsonpath='{.spec.caBundle}' | base64 -d | openssl x509 -text -noout

  3. Get the name and uptime of the kapp-controller pod.

    kubectl -n kapp-controller get pods

  4. Restart the pod by deleting it:

    kubectl -n kapp-controller delete pod kapp-controller-<pod-id-from-step-3-output>

  5. Once the pod is online again, confirm the certificate is renewed:

    kubectl get apiservice v1alpha1.data.packaging.carvel.dev -o jsonpath='{.spec.caBundle}' | base64 -d | openssl x509 -text -noout

  6. Run ./upgrade_cluster_components.sh again to upgrade from within the cluster.
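
Steps 3 to 5 above can be sketched as one function. Note that this is a variation rather than the documented procedure: it uses `kubectl rollout restart` instead of deleting the pod by name, and it assumes that kapp-controller runs as a Deployment named "kapp-controller" in the "kapp-controller" namespace. Verify both against the output of step 3 before relying on it. The function name and the 300-second timeouts are arbitrary choices.

```shell
# Hypothetical helper covering steps 3-5.
# Assumption: a Deployment named "kapp-controller" exists in the
# "kapp-controller" namespace; adjust the names if your cluster differs.
restart_kapp_controller() {
  ns="kapp-controller"
  apisvc="v1alpha1.data.packaging.carvel.dev"

  # Recreate the pod through a rolling restart instead of deleting it by name.
  kubectl -n "$ns" rollout restart deployment/kapp-controller || return 1
  # Block until the replacement pod is ready.
  kubectl -n "$ns" rollout status deployment/kapp-controller --timeout=300s || return 1
  # Wait for the aggregated API to report the Available condition again.
  kubectl wait --for=condition=Available "apiservice/$apisvc" --timeout=300s || return 1
  # Print the notBefore/notAfter dates of the renewed certificate.
  kubectl get apiservice "$apisvc" -o jsonpath='{.spec.caBundle}' \
    | base64 -d | openssl x509 -noout -dates
}
```

If the function returns successfully and the printed notAfter date lies in the future, continue with step 6.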