Tanzu Application Platform Installing CNRs fails due to crashing webhook


Article ID: 297878



Products

VMware Tanzu Application Service for VMs

Issue/Introduction

Affected versions: TAP v1.0.x/CNR v1.1.x
Fixed Versions: TAP v1.1.x/CNR v1.2.x

 

This problem can occur while installing Tanzu Application Platform (TAP): the cnrs PackageInstall fails to reconcile. The root cause is currently unknown, and the problem is not easily reproducible.

The install fails with this error:
 

Error: resource reconciliation failed: kapp: Error: waiting on reconcile packageinstall/cnrs (packaging.carvel.dev/v1alpha1) namespace: tap-install:
  Finished unsuccessfully (Reconcile failed:  (message: Error (see .status.usefulErrorMessage for details))). Reconcile failed: Error (see .status.usefulErrorMessage for details)
Error: exit status 1

 

If you describe the cnrs PackageInstall, you may see something like:

kapp: Error: waiting on reconcile deployment/webhook (apps/v1) namespace: vmware-sources:
  Finished unsuccessfully (Deployment is not progressing: ProgressDeadlineExceeded (message: ReplicaSet "webhook-67d45746b5" has timed out progressing.))

 


The underlying problem is that the webhook pod in the vmware-sources namespace keeps failing its liveness and readiness probes. If you run kubectl describe on the pod, you will likely see events like these:
Warning  Unhealthy  10m (x3 over 10m)     kubelet            Liveness probe failed: Get "https://192.168.74.56:8443/": remote error: tls: unrecognized name
  Normal   Killing    10m                   kubelet            Container webhook failed liveness probe, will be restarted
  Warning  Unhealthy  10m (x17 over 10m)    kubelet            Readiness probe failed: Get "https://192.168.74.56:8443/": remote error: tls: unrecognized name
  Warning  BackOff    25s (x29 over 7m31s)  kubelet            Back-off restarting failed container

 

If you check the logs for the webhook pod, you will likely see error messages like these:

2022/03/10 09:37:04 http: TLS handshake error from 192.168.107.187:55328: tls: no certificates configured
{"severity":"WARNING","timestamp":"2022-03-10T09:37:05.109940482Z","logger":"webhook","caller":"webhook/webhook.go:156","message":"server key missing","commit":"a3e27bc"}
2022/03/10 09:37:05 http: TLS handshake error from 192.168.107.187:55338: tls: no certificates configured
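
To check quickly whether captured output matches the symptoms quoted above, a small helper like the following can scan it. This is only a sketch: the function name webhook_tls_symptoms is hypothetical, and the three patterns are taken verbatim from the events and log lines in this article.

```shell
# Hypothetical helper: reads log or event text on stdin and prints each
# symptom string from this article that appears in it.
# Returns 0 if at least one symptom was found, 1 otherwise.
webhook_tls_symptoms() {
  input=$(cat)
  status=1
  for pattern in \
    'tls: no certificates configured' \
    'server key missing' \
    'tls: unrecognized name'
  do
    # -F matches the pattern as a fixed string, not a regular expression.
    if printf '%s\n' "$input" | grep -qF -- "$pattern"; then
      printf 'found: %s\n' "$pattern"
      status=0
    fi
  done
  return "$status"
}

# Example usage against a live cluster (not run here):
#   kubectl -n vmware-sources logs deployment/webhook | webhook_tls_symptoms
```

If the helper prints nothing and returns a non-zero status, the failure probably has a different cause than the one described in this article.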


 


Environment

Product Version: 1.0

Resolution

This problem can be addressed by overlaying the webhook Deployment. TAP allows you to apply custom overlays to your install.

This overlay makes the liveness and readiness probes for the webhook less aggressive, giving the webhook processes more time to come up. This stops the webhook pod from cycling endlessly through CrashLoopBackOff.
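
As a rough illustration of how much time the overlay allows, the sketch below estimates how long a container that never passes its liveness probe runs before it is restarted. It uses the failureThreshold and initialDelaySeconds values from the overlay in this article. It assumes that once periodSeconds is removed the Kubernetes default of 10 seconds applies, and it ignores probe timeouts and scheduling jitter, so treat the result as an approximation.

```shell
# Values set by the overlay in this article.
initial_delay_seconds=20
failure_threshold=6

# Assumption: with periodSeconds removed from the probe, the Kubernetes
# default period of 10 seconds is used.
period_seconds=10

# The first probe runs once the initial delay has passed; each further
# failed probe comes one period later.
seconds_until_restart=$(( initial_delay_seconds + (failure_threshold - 1) * period_seconds ))

echo "approximate seconds before a liveness restart: ${seconds_until_restart}"
```

The original probe values are not stated in this article, so no before-and-after comparison is attempted here.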

Steps:

  1. Create a YAML file containing the following overlay:
#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind":"Deployment", "metadata":{"name":"webhook", "namespace":"vmware-sources"}})
---
spec:
  template:
    spec:
      containers:
      #@overlay/match by=overlay.all
      -
        #@overlay/match
        readinessProbe:
          #@overlay/remove
          periodSeconds:
          
          #@overlay/match missing_ok=True
          failureThreshold: 6
          #@overlay/match missing_ok=True
          initialDelaySeconds: 20
        #@overlay/match
        livenessProbe:
          #@overlay/remove
          periodSeconds:
          
          #@overlay/match missing_ok=True
          failureThreshold: 6
          #@overlay/match missing_ok=True
          initialDelaySeconds: 20


 

  2. Create a secret with the overlay as the data:
kubectl -n tap-install create secret generic \
  cnrs-overlay-vmware-sources-webhook \
  -o yaml \
  --dry-run=client \
  --from-file=<YOUR_OVERLAY_FILE_FROM_STEP_1.yaml> \
  | kubectl apply -f -

 

 

  3. Edit your tap-values.yaml to tell TAP to use the overlay secret you created by adding a package_overlays section. The secret name must match the name of the secret created in the previous step:
...
package_overlays:
- name: cnrs
  secrets:
  - name: cnrs-overlay-vmware-sources-webhook
...

 

  4. Update your TAP install:
tanzu package installed update tap -n tap-install -v 1.0.0 -f tap-values.yaml
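
The overlay only takes effect if the secret name listed under package_overlays in tap-values.yaml is the same as the name of the secret created from the overlay file. Before running the update, a check along these lines can catch a mismatch. It is a minimal sketch: the function name overlay_secret_referenced is hypothetical, and it assumes the secret is listed on its own line in the form "- name: <secret-name>".

```shell
# Hypothetical helper: succeeds when the values file contains a list entry
# of the form "- name: <secret-name>" for the given secret name.
overlay_secret_referenced() {
  secret_name=$1
  values_file=$2
  grep -qE -- "^[[:space:]]*-[[:space:]]*name:[[:space:]]*${secret_name}[[:space:]]*$" "$values_file"
}

# Example usage (run from the directory that contains tap-values.yaml):
#   overlay_secret_referenced cnrs-overlay-vmware-sources-webhook tap-values.yaml \
#     && echo "overlay secret is referenced in tap-values.yaml"
```

After the update completes, kubectl -n vmware-sources rollout status deployment/webhook is one way to confirm that the webhook Deployment has finished rolling out.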