BOSH Tasks stuck in queued state after AWS Automated RDS MySQL upgrade

Article ID: 394373

Products

VMware Tanzu Application Service

Issue/Introduction

The output of the following command shows BOSH tasks stuck in the queued state. If you have HealthWatch installed, you may see a new queued task every 10 minutes, each time the BOSH deployment check runs.

bosh tasks -r=100

On the director, database connection errors appear in the log file /var/vcap/sys/log/director/worker_1.stdout.log:

ERROR -- Director: Sequel::DatabaseConnectionError - Mysql2::Error::ConnectionError: Can't connect to server on 'xxxxx-1.rds.AMAZON.RDS.DOMAIN' (115):
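
To check whether more than one worker is affected, the error can be counted per worker log. The following is a minimal sketch, not an official procedure: the function name count_db_connection_errors is hypothetical, and the worker_*.stdout.log glob assumes the other workers log next to worker_1.stdout.log under the same naming pattern.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<count> <file>" for each worker log found.
# The default directory comes from the article; the worker_* glob is an
# assumption about how the other worker logs are named.
count_db_connection_errors() {
  local log_dir="${1:-/var/vcap/sys/log/director}"
  local f
  for f in "$log_dir"/worker_*.stdout.log; do
    # Skip the unexpanded pattern when no file matches.
    [ -e "$f" ] || continue
    # grep -c prints 0 (and exits non-zero) when nothing matches.
    printf '%s %s\n' "$(grep -c 'Sequel::DatabaseConnectionError' "$f")" "$f"
  done
}
```

On the director VM this would be run as root, for example `count_db_connection_errors /var/vcap/sys/log/director`. A non-zero count suggests that worker has hit the connection error.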

Cause

The BOSH director can enter this state when the external MySQL instance goes down, for example during an automated AWS RDS upgrade. It does not happen every time, but when it does, new BOSH tasks remain queued and cannot progress.

Resolution

This is a known issue where the worker gets stalled; it is fixed in Operations Manager 3.0.38.

As a workaround, restart the director services. First, log in to the BOSH Director VM over SSH:

https://techdocs.broadcom.com/us/en/vmware-tanzu/platform/tanzu-operations-manager/3-0/tanzu-ops-manager/install-ssh-login.html#bosh-director-ssh

After logging in, restart all the monit services on the director VM:

bosh/0:~$ sudo su -
bosh/0:~# monit restart all
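
After the restart, it can help to confirm that every service has come back before retrying BOSH tasks. The sketch below is an illustration under assumptions: the helper name not_running is hypothetical, and it assumes `monit summary` prints one line per service in the form `Process 'name' <status>`, with a healthy service showing the status `running`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `monit summary` output on stdin and print every
# "Process" line whose last field is not "running" (for example
# "initializing" or "not monitored"). The output format is an assumption.
not_running() {
  awk '/^Process / && $NF != "running" { print }'
}
```

As root on the director VM, `monit summary | not_running` would then list the services that are still starting. An empty result suggests all monit services are running again.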