apply-addons job fails with exit code 1 and "Internal error occurred: failed calling webhook" and Gatekeeper service "not found"


Article ID: 298700


Updated On:


VMware Tanzu Kubernetes Grid Integrated Edition


During a BOSH deployment, such as a cluster upgrade or other deployment update, the apply-addons job fails with exit code 1.

To inspect the failure, view the debug output of the BOSH task:

bosh task XXXX --debug

The debug output contains:
Internal error occurred: failed calling webhook
service \"<GATEKEEPER_WEBHOOK_SERVICE_NAME>\" not found"

If you run the apply-addons errand manually and keep the errand VM alive, the error output is more readable, similar to the example below.

Example command to run the apply-addons errand and keep its VM afterwards:
bosh -d CLUSTER_SERVICE_INSTANCE run-errand apply-addons --keep-alive

Instance   apply-addons/<BOSH_INSTANCE_ID>
Exit Code  1
           failed to start all system specs after 1200 with exit code 1

Stderr     Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
           Error from server (InternalError): error when applying patch:
           Resource: "/v1, Resource=namespaces", GroupVersionKind: "/v1, Kind=Namespace"
           Name: "pks-system", Namespace: ""
           for: "/var/vcap/jobs/apply-specs/specs/addon-spec.yml": Internal error occurred: failed calling webhook "": Post https://gatekeeper-webhook-service.gatekeeper-system.svc:443/v1/admitlabel?timeout=3s: service "gatekeeper-webhook-service" not found
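The webhook URL in the error identifies which service the API server is trying to reach. As a minimal sketch, the function below pulls the namespace and service name out of saved errand output. The name webhook_target is hypothetical (it is not part of bosh or kubectl), and the pattern assumes the in-cluster URL form https://<service>.<namespace>.svc shown in the example above.

```shell
# Hypothetical helper: reads errand or debug output on stdin and prints
# "<namespace>/<service>" for each in-cluster webhook URL it finds.
# Assumes URLs of the form https://<service>.<namespace>.svc
webhook_target() {
  grep -oE 'https://[A-Za-z0-9-]+\.[A-Za-z0-9-]+\.svc' \
    | sed -E 's#^https://([^.]+)\.([^.]+)\.svc$#\2/\1#' \
    | sort -u
}
```

For the example output above this prints gatekeeper-system/gatekeeper-webhook-service. Running kubectl get service -n gatekeeper-system gatekeeper-webhook-service should then confirm whether that service is actually missing.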

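This can happen when, for example, Gatekeeper is removed through an internal Concourse pipeline but the pipeline leaves other Gatekeeper objects behind, such as:
A PodSecurityPolicy for Gatekeeper (for example, gatekeeper-admin)
A validatingwebhookconfiguration for Gatekeeper
A clusterrole for Gatekeeper
A clusterrolebinding for Gatekeeper

The orphaned validatingwebhookconfiguration still tells the API server to call the Gatekeeper webhook service. Because that service no longer exists, the call fails and the changes that apply-addons tries to apply are rejected.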


Product Version: 1.13


Identify orphaned objects with commands such as the following. All four resource types are cluster-scoped, so no namespace flag is needed:

kubectl get psp | grep gatekeeper
kubectl get validatingwebhookconfiguration | grep gatekeeper
kubectl get clusterrole | grep gatekeeper
kubectl get clusterrolebinding | grep gatekeeper
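The four lookups above can be combined into one pass. The sketch below is one way to do that, not an official procedure. The function name list_gatekeeper_objects is hypothetical, and it assumes that matching on the string "gatekeeper" is enough to find the leftovers in your cluster. It uses full resource names and -o name so that each result is printed as kind/name.

```shell
# Hypothetical helper: lists cluster-scoped objects whose names match a
# pattern (default: gatekeeper) across the four kinds named in this article.
# Output is one "kind/name" per line, as printed by `kubectl get -o name`.
list_gatekeeper_objects() {
  pattern="${1:-gatekeeper}"
  for kind in podsecuritypolicy validatingwebhookconfiguration \
              clusterrole clusterrolebinding; do
    # grep exits non-zero when nothing matches; that is not an error here.
    kubectl get "$kind" -o name | grep -i -- "$pattern" || true
  done
}
```

Review the list before acting on it. A name that contains "gatekeeper" is not necessarily orphaned if Gatekeeper is still installed in the cluster.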

Remove the orphaned Gatekeeper objects, replacing XXX with the name of each object you identified:

kubectl delete psp XXX
kubectl delete validatingwebhookconfiguration XXX
kubectl delete clusterrole XXX
kubectl delete clusterrolebinding XXX
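Before deleting anything, it can help to preview the deletions with a client-side dry run. The sketch below assumes a kubectl version that supports --dry-run=client on delete. The function name delete_listed_objects and its "apply" argument are hypothetical. It reads object references in kind/name form (as printed by kubectl get -o name) from stdin.

```shell
# Hypothetical helper: reads "kind/name" lines on stdin and deletes each one.
# By default it only performs a client-side dry run; pass "apply" as the
# first argument to actually delete the objects.
delete_listed_objects() {
  mode="${1:-dry-run}"
  while IFS= read -r name; do
    [ -n "$name" ] || continue          # skip blank lines
    if [ "$mode" = "apply" ]; then
      kubectl delete "$name" </dev/null
    else
      kubectl delete "$name" --dry-run=client </dev/null
    fi
  done
}
```

For example, kubectl get clusterrole -o name | grep gatekeeper | delete_listed_objects previews the deletion of matching cluster roles. Add the apply argument once the list looks correct.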