When upgrading Tanzu Hub or applying a configuration change, the BOSH task appears to hang for an extended period.
To confirm, get a list of all running BOSH tasks and identify the one in a processing state.
bosh tasks
Example Output:
Using environment '10.###.###.###' as client 'ops_manager'
ID State Started At Finished At User Deployment Description Result
215 processing Tue Sep 30 08:40:43 UTC 2025 - ops_manager hub-#################### create deployment
Inspect the events for the processing task to find which instance group is stuck; the affected instance group will show as stuck in the pre-stop phase.
# Check all events for the stuck task ID
bosh task 215 --event | grep -n -C 3 '"stage":"Updating instance"'
# Or filter by a specific instance group, for example 'control'
bosh task 215 --event | grep -n -C 3 '"stage":"Updating instance"' | grep control
Example Output:
151:{"time":1759222920,"stage":"Updating instance","tags":["control"],"total":3,"task":"control/########-####-####-####-############ (1)","index":2,"state":"started","progress":0}
152:{"time":1759222921,"stage":"Updating instance","tags":["control"],"total":3,"task":"control/########-####-####-####-############ (1)","index":2,"state":"in_progress","progress":5,"data":{"status":"executing pre-stop"}}
SSH into the stuck instance VM to investigate further.
bosh ssh control/########-####-####-####-############ -d hub-####################
Check the drain.stderr.log, which will show that a pod cannot be evicted due to its PodDisruptionBudget (PDB).
less /var/vcap/sys/log/kubelet/drain.stderr.log
Example Log Output:
error when evicting pods/"contour-envoy-############-#####" -n "tanzusm" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
During an upgrade or the deployment of a configuration change, BOSH attempts to safely drain each Kubernetes node before updating it. This process can get stuck if a pod on the node cannot be evicted. The eviction fails when removing the pod would violate a PodDisruptionBudget (PDB), which enforces a minimum availability for a set of pods.
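To see which budget is blocking the drain, you can list the PDBs in the tanzusm namespace from a VM with kubectl access (for example, the registry instance used in the workaround below); a budget reporting 0 allowed disruptions is the one preventing eviction. The PDB name passed to describe is a placeholder.
# List PodDisruptionBudgets; a budget with ALLOWED DISRUPTIONS of 0 is blocking the drain
kubectl get pdb -n tanzusm
# Inspect the budget covering the blocked pod (replace <pdb-name> with the name from the previous command)
kubectl describe pdb <pdb-name> -n tanzusm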
Workaround
To allow the BOSH drain process to complete, temporarily delete the PodDisruptionBudgets from the tanzusm namespace. The PDBs are recreated automatically after the installation completes.
List the instances in the deployment, then SSH into the registry VM:
bosh instances -d hub-####################
bosh ssh registry/########-####-####-####-############ -d hub-####################
Back up the existing PodDisruptionBudgets to a file (for example, poddisruptionbudgets.yaml):
kubectl get pdb -n tanzusm -o yaml > poddisruptionbudgets.yaml
Delete all PodDisruptionBudgets in the tanzusm namespace:
kubectl delete pdb --all -n tanzusm
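Optionally, verify that the PDBs were removed; kubectl should report that no resources were found in the namespace.
# Confirm that no PodDisruptionBudgets remain in the tanzusm namespace
kubectl get pdb -n tanzusm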
After the PDBs are deleted, the BOSH task should proceed and complete successfully.
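To confirm, you can re-attach to the previously stuck task using its ID (215 in this example) and watch it run to completion.
# Follow the task output until the deployment finishes
bosh task 215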
Solution
This issue is resolved in Tanzu Hub version 10.3 or later. Upgrading to Tanzu Hub 10.3 will prevent the problem from occurring in subsequent configuration updates and upgrades.