Affected versions: TAP v1.0.x / CNR v1.1.x
Fixed versions: TAP v1.1.x / CNR v1.2.x
This problem can occur when installing Tanzu Application Platform (TAP): the install fails to create the cnrs PackageInstall. The cause is currently unknown, and the problem is not easily reproducible.
The install fails with this error:
Error: resource reconciliation failed: kapp: Error: waiting on reconcile packageinstall/cnrs (packaging.carvel.dev/v1alpha1) namespace: tap-install: Finished unsuccessfully (Reconcile failed: (message: Error (see .status.usefulErrorMessage for details))). Reconcile failed: Error (see .status.usefulErrorMessage for details)
Error: exit status 1
If you describe the cnrs PackageInstall, you may see something like:
kapp: Error: waiting on reconcile deployment/webhook (apps/v1) namespace: vmware-sources: Finished unsuccessfully (Deployment is not progressing: ProgressDeadlineExceeded (message: ReplicaSet "webhook-67d45746b5" has timed out progressing.))
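The detailed message is stored on the PackageInstall itself, in the .status.usefulErrorMessage field that the install error points to. A minimal sketch for printing only that field, assuming kubectl is pointed at the affected cluster; the function name show_cnrs_error is illustrative and not part of TAP:

```shell
# Hypothetical helper: print the detailed reconcile error recorded on the
# cnrs PackageInstall in the tap-install namespace.
show_cnrs_error() {
  kubectl -n tap-install get packageinstall cnrs \
    -o jsonpath='{.status.usefulErrorMessage}'
}
```

Running show_cnrs_error should print the same kapp error text that kubectl describe shows, without the surrounding resource details.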
The underlying problem is that the webhook pod in the vmware-sources namespace is failing. If you run kubectl describe on the pod, you will likely see:
Warning  Unhealthy  10m (x3 over 10m)     kubelet  Liveness probe failed: Get "https://192.168.74.56:8443/": remote error: tls: unrecognized name
Normal   Killing    10m                   kubelet  Container webhook failed liveness probe, will be restarted
Warning  Unhealthy  10m (x17 over 10m)    kubelet  Readiness probe failed: Get "https://192.168.74.56:8443/": remote error: tls: unrecognized name
Warning  BackOff    25s (x29 over 7m31s)  kubelet  Back-off restarting failed container
If you check the logs for the webhook pod, you will likely see these error messages:
2022/03/10 09:37:04 http: TLS handshake error from 192.168.107.187:55328: tls: no certificates configured
{"severity":"WARNING","timestamp":"2022-03-10T09:37:05.109940482Z","logger":"webhook","caller":"webhook/webhook.go:156","message":"server key missing","commit":"a3e27bc"}
2022/03/10 09:37:05 http: TLS handshake error from 192.168.107.187:55338: tls: no certificates configured
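To confirm that a failing webhook matches this issue, the logs can be searched for the two signatures shown above. A minimal sketch, assuming a POSIX shell with grep; the function name has_missing_cert_errors is illustrative:

```shell
# Hypothetical helper: read webhook log lines on stdin and exit 0 if they
# contain either signature from this article, non-zero otherwise.
has_missing_cert_errors() {
  grep -q -e 'tls: no certificates configured' -e 'server key missing'
}

# Example (assumes kubectl access to the affected cluster):
#   kubectl -n vmware-sources logs deployment/webhook | has_missing_cert_errors \
#     && echo "webhook logs match this issue"
```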
This problem can be addressed by overlaying the webhook Deployment. TAP allows you to apply custom overlays to your install.
The overlay makes the liveness and readiness probes for the webhook less aggressive, giving the webhook process more time to come up. This stops the endless CrashLoopBackOff.
Steps:
1. Create a YAML file containing the following overlay:

#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind":"Deployment", "metadata":{"name":"webhook", "namespace":"vmware-sources"}})
---
spec:
  template:
    spec:
      containers:
      #@overlay/match by=overlay.all
      -
        #@overlay/match
        readinessProbe:
          #@overlay/remove
          periodSeconds:
          #@overlay/match missing_ok=True
          failureThreshold: 6
          #@overlay/match missing_ok=True
          initialDelaySeconds: 20
        #@overlay/match
        livenessProbe:
          #@overlay/remove
          periodSeconds:
          #@overlay/match missing_ok=True
          failureThreshold: 6
          #@overlay/match missing_ok=True
          initialDelaySeconds: 20
2. Create a secret in the tap-install namespace from the overlay file:

kubectl -n tap-install create secret generic cnrs-overlay-vmware-sources-webhook \
  -o yaml \
  --dry-run=client \
  --from-file=<YOUR_OVERLAY_FILE_FROM_STEP_1.yaml> \
  | kubectl apply -f -
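3. Update tap-values.yaml to add a package_overlays section. The secret name must match the secret created from the overlay file:

...
package_overlays:
- name: cnrs
  secrets:
  - name: cnrs-overlay-vmware-sources-webhook
...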
4. Update the TAP installation, using the version that matches your install:

tanzu package installed update tap -n tap-install -v 1.0.0 -f tap-values.yaml
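Once the update has been applied, it is worth checking that the webhook Deployment rolls out and that the relaxed probe settings are in place. A minimal sketch, assuming kubectl access to the cluster; the function names and the five-minute timeout are illustrative choices, not values from TAP:

```shell
# Hypothetical helpers for checking the result of the overlay.

# Wait for the webhook Deployment rollout to finish, or give up after 5 minutes.
wait_for_webhook() {
  kubectl -n vmware-sources rollout status deployment/webhook --timeout=5m
}

# Print the liveness and readiness probe settings of the webhook containers.
# With the overlay applied, both should report failureThreshold 6 and
# initialDelaySeconds 20.
show_webhook_probes() {
  kubectl -n vmware-sources get deployment webhook \
    -o jsonpath='{.spec.template.spec.containers[*].livenessProbe}'
  echo
  kubectl -n vmware-sources get deployment webhook \
    -o jsonpath='{.spec.template.spec.containers[*].readinessProbe}'
  echo
}
```

If wait_for_webhook times out, show_webhook_probes can help tell whether the overlay was applied at all, for example when the secret name in tap-values.yaml does not match the secret that was created.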