This article provides general steps for troubleshooting problems with creating, updating, deleting, binding, or unbinding Scheduler service instances.
An example of the provisioning issues discussed in this KB:
cf unbind-service <app> <scheduler-service-instance>
This command fails with the message "failed to delete application from Scheduler API: got status code 404".
When the Scheduler tile install errand runs, it deploys two applications: scheduler and scheduler-broker.
Here are some steps you can use to troubleshoot provisioning issues:
1. Check whether the scheduler and scheduler-broker apps are running.
cf target -o system -s scheduler
cf apps
Alternatively, you can check the app status in Apps Manager.
2. Check the logs of both applications for warnings and errors.
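The check in Step 1 can also be scripted. The following is a minimal sketch, assuming the cf CLI v7 or later column layout (name, requested state, processes, routes). The function name unhealthy_apps is hypothetical and introduced here only for illustration.

```shell
# Reads `cf apps` output on stdin and prints the Scheduler apps whose
# requested state is not "started" or whose web process has fewer running
# instances than requested.
# Assumption: columns are name, requested state, processes, routes.
unhealthy_apps() {
  awk '
    $1 == "scheduler" || $1 == "scheduler-broker" {
      state = $2
      web = ""
      # Locate the web:RUNNING/REQUESTED token among the remaining fields.
      for (i = 3; i <= NF; i++) {
        tok = $i
        sub(/,$/, "", tok)
        if (tok ~ /^web:[0-9]+\/[0-9]+$/) web = tok
      }
      split(substr(web, 5), n, "/")
      if (state != "started" || web == "" || n[1] != n[2]) print $1, state, web
    }
  '
}
```

Usage: cf apps | unhealthy_apps. No output means both apps report all web instances running.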
cf target -o system -s scheduler
cf logs scheduler --recent
cf logs scheduler-broker --recent
In our example, the scheduler app is down, so we look further into its logs:
2025-02-03T00:44:19.825-05:00 [APP/PROC/WEB/4] [OUT] Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'flywayInitializer' defined in class path resource [org/springframework/boot/autoconfigure/flyway/FlywayAutoConfiguration$FlywayConfiguration.class]: Unable to obtain connection from database: Communications link failure
2025-02-03T00:44:19.825-05:00 [APP/PROC/WEB/4] [OUT] The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
2025-02-03T00:44:19.825-05:00 [APP/PROC/WEB/4] [OUT] --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2025-02-03T00:44:19.825-05:00 [APP/PROC/WEB/4] [OUT] SQL State : 08S01
2025-02-03T00:44:19.825-05:00 [APP/PROC/WEB/4] [OUT] Error Code : 0
2025-02-03T00:44:19.825-05:00 [APP/PROC/WEB/4] [OUT] Message : Communications link failure
2025-02-03T00:44:19.825-05:00 [APP/PROC/WEB/4] [OUT] The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
Although the scheduler-broker app is running, check its logs for errors that might provide clues. In this case, the router log shows a 500 response to the unbind request, and the app log shows a MySQL connection error as well. The scheduler-broker and scheduler apps work hand in hand, so a problem in one app affects the behaviour of the other.
2025-02-02T23:17:02.283-05:00 [RTR/4] [OUT] scheduler.######.com - [2025-02-03T04:17:02.067246183Z] "DELETE /v2/service_instances/########/service_bindings/########?accepts_incomplete=true&plan_id=#####&service_id=##### HTTP/1.1" 500 0 61 "-" "HTTPClient/1.0 (2.8.3, ruby 3.2.2 (2023-03-30))" "10.###.###.##:24344" "10.###.##.##:61002" x_forwarded_for:"10.###.##.##" x_forwarded_proto:"https" ...
2025-02-02T23:25:03.690-05:00 [APP/PROC/WEB/2] [ERR] [mysql] 2025/02/03 04:25:03 connection.go:49: closing bad idle connection: unexpected read from socket
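To cut down the amount of log output to read in Step 2, the recent logs can be piped through a filter. The sketch below is only an illustration: the function name filter_log_errors is hypothetical, and the pattern list is an assumption drawn from the sample log lines in this article, so extend it for your environment.

```shell
# Reads log lines on stdin and keeps those that hint at database trouble,
# stderr output, or HTTP 5xx responses in router logs.
# Assumption: the patterns below come from this article's sample logs only.
filter_log_errors() {
  grep -E -i 'Communications link failure|SQL State|bad idle connection|\[ERR\]|HTTP/1\.1" 5[0-9][0-9]'
}
```

Usage: cf logs scheduler --recent | filter_log_errors, and the same for scheduler-broker. An empty result does not prove the app is healthy; it only means none of these patterns appeared in the recent log buffer.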
3. The logs from Step 2 point to problems with MySQL. The next step is to check the health of the MySQL service instances that the scheduler apps use.
cf target -o system -s scheduler
cf services
Getting services in org system / space scheduler as admin...
name                     service   plan       bound apps         last operation     broker                   upgrade available
scheduler-broker-mysql   p.mysql   db-small   scheduler-broker   create succeeded   dedicated-mysql-broker   no
scheduler-mysql          p.mysql   db-small   scheduler          create succeeded   dedicated-mysql-broker   no
We need the GUID of each service instance so that we can check the health of the VMs in each service instance's deployment.
cf service scheduler-broker-mysql --guid
cf service scheduler-mysql --guid
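The GUID lookup can be combined with the service-instance_<guid> deployment naming used in Step 4. This is a minimal sketch, assuming a logged-in cf CLI that already targets the system org and scheduler space; both function names are hypothetical.

```shell
# Builds the BOSH deployment name for a service instance GUID, following the
# service-instance_<guid> naming shown in Step 4.
deployment_for_guid() {
  printf 'service-instance_%s\n' "$1"
}

# Prints the deployment name of each Scheduler database service instance.
# Assumption: cf is logged in and targets org system, space scheduler.
scheduler_mysql_deployments() {
  for svc in scheduler-broker-mysql scheduler-mysql; do
    guid="$(cf service "$svc" --guid)" || return 1
    deployment_for_guid "$guid"
  done
}
```

Calling scheduler_mysql_deployments prints one deployment name per line, ready to pass to bosh -d.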
4. Using the GUIDs from Step 3, run the following BOSH CLI commands:
bosh -d service-instance_<scheduler-broker-mysql-guid> vms
bosh -d service-instance_<scheduler-mysql-guid> vms
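Check for any failing VMs. In this example, the mysql instance of the scheduler-mysql deployment is in a failing state (the sample output includes the extra columns displayed when the --vitals flag is added):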
Deployment 'service-instance_<scheduler-mysql-guid>'
Instance  Process State  AZ  IPs  VM CID  VM Type  Active  Stemcell  VM Created At  Uptime  Load (1m, 5m, 15m)  CPU Total  CPU User  CPU Sys  CPU Wait  Memory Usage  Swap Usage  System Disk Usage  Ephemeral Disk Usage  Persistent Disk Usage
mysql/####-#####-#####  failing  us-east-1b  10.###.###.##  i-############  t3a.large  true  bosh-aws-xen-hvm-ubuntu-jammy-go_agent/1.708  Wed Jan 22 20:47:05 UTC 2025  7d 6h 10m 2s  0.01, 0.04, 0.01  -  1.7%  1.1%  0.0%  18% (1.5 GB)  17% (1.4 GB)  57% (24i%)  10% (1i%)  2% (0i%)
The scheduler apps cannot function properly if the MySQL service instances are not in a healthy state. For next steps, refer to the MySQL troubleshooting documentation.
5. Once the MySQL service instances are healthy, check the health of the apps again. In some cases, such as the example issue discussed here, you might need to restart the apps to refresh their connections to MySQL. Once the apps and the MySQL service instances are healthy, retry the provisioning operation.
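When a deployment has several VMs, the failing ones can be picked out of the bosh vms output with a small filter. This sketch assumes the plain-text table layout shown above, where the instance name (group/id) is the first column and the process state is the second; the function name failing_instances is hypothetical.

```shell
# Reads plain `bosh vms` output on stdin and prints every instance whose
# process state is not "running".
# Assumption: instance rows have a group/id name in column 1 and the process
# state in column 2; all other lines (banner, header) are ignored.
failing_instances() {
  awk '$1 ~ "^[^/]+/[^/]+$" && $2 != "running" { print $1, $2 }'
}
```

Usage: bosh -d service-instance_<scheduler-mysql-guid> vms | failing_instances. Only the first word of a multi-word state is printed.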
cf target -o system -s scheduler
cf apps
Getting apps in org system / space scheduler as admin...
name               requested state   processes           routes
scheduler          started           web:3/3, task:0/0   scheduler.########
scheduler-broker   started           web:3/3             scheduler.########
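If a restart is needed, both apps can be restarted in one pass. The following is a sketch only: the DRY_RUN switch and the function names are hypothetical additions that let you preview the commands before running them, and restarting both apps (rather than only the one that is down) is an assumption that may not suit every environment.

```shell
# Runs a command, or only prints it when DRY_RUN=1 (hypothetical switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Restarts both Scheduler apps so they reconnect to MySQL, then lists the
# apps so their health can be confirmed.
restart_scheduler_apps() {
  run cf target -o system -s scheduler || return 1
  for app in scheduler scheduler-broker; do
    run cf restart "$app" || return 1
  done
  run cf apps
}
```

Run DRY_RUN=1 restart_scheduler_apps first to review the commands, then run restart_scheduler_apps without the switch once MySQL is confirmed healthy.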