After Rotating Expired bosh-dns Leaf Certificates, Using bosh deploy Fails with 'Stopping Monitored Services: Stopping services [bosh-dns bosh-dns-healthcheck etc] errored'

Article ID: 385640

Updated On:

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

After regenerating expired bosh-dns leaf certificates, running "bosh -d service_instance_xxx deploy manifest.yaml --skip-drain --fix" to push the new certificates may fail with the following error:

Error: Action Failed get_task: Task d4738ucb-50e8-4b8b-6f61-7838j93f36c result: Stopping Monitored Services: Stopping services '[kube-apiserver kube-controller-manager kube-scheduler bosh-dns bosh-dns-healthcheck ]' errored

Cause

The exact cause of this issue is unclear; however, it is possible that some services are in an error state due to expired certificates, preventing them from fully stopping, even though monit indicates that they have been stopped.

Resolution

In this case, SSH into the node that failed and run monit summary. You will see that all services are in a 'not monitored' state. Run monit unmonitor all and, once it completes, deploy the cluster again. This time, the deployment will likely succeed. If necessary, repeat these steps on each failing node until all VMs have been updated with the new certificates.
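
The steps above can be sketched as a small shell script. This is a rough outline, not a verified procedure: the instance name master/0 is a hypothetical placeholder for whichever node failed, the monit path /var/vcap/bosh/bin/monit is assumed to be the standard location on BOSH stemcells, and the deployment and manifest names are taken from the example command in this article. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the recovery steps. All values below are placeholders or assumptions.
DEPLOYMENT="${DEPLOYMENT:-service_instance_xxx}"  # deployment name from the example command
INSTANCE="${INSTANCE:-master/0}"                  # hypothetical: the node that failed
MANIFEST="${MANIFEST:-manifest.yaml}"             # manifest name from the example command
MONIT="${MONIT:-/var/vcap/bosh/bin/monit}"        # assumed monit location on the VM
DRY_RUN="${DRY_RUN:-1}"                           # 1 = print only, 0 = execute

# Print the command in dry-run mode (argument quoting is not preserved
# in the printed form); otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

recover_node() {
  # 1. Inspect service state on the failed node.
  run bosh -d "$DEPLOYMENT" ssh "$INSTANCE" -c "sudo $MONIT summary"
  # 2. Unmonitor all services on that node.
  run bosh -d "$DEPLOYMENT" ssh "$INSTANCE" -c "sudo $MONIT unmonitor all"
  # 3. Retry the deploy to push the new certificates.
  run bosh -d "$DEPLOYMENT" deploy "$MANIFEST" --skip-drain --fix
}

recover_node
```

Set DRY_RUN=0 to execute the commands instead of printing them, and set INSTANCE to the node reported in the failed task. If another node fails on the next deploy, rerun the script against that node.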