Customers see failed service instances while running the upgrade-all-instances errand.
2024-08-09T21:09:23.459Z ERROR 7 --- [ry-client-nio-3] i.p.s.c.s.e.core.CloudFoundryService : Failed to upgrade service instance with id 'aaa'.
The status before upgrade was 'succeeded' with message 'create service instance completed'.
The status after upgrade was 'failed' with message 'update service instance started'.
2024-08-09T21:31:51.630Z ERROR 7 --- [ry-client-nio-1] i.p.s.c.s.e.core.CloudFoundryService : Failed to upgrade service instance with id 'bbb'.
The status before upgrade was 'succeeded' with message 'create service instance completed'.
The status after upgrade was 'failed' with message 'update service instance started'.
As the number of concurrent service instance (SI) upgrades increases, the time needed for each upgrade also increases significantly. Spring Cloud App Broker (SCAB), which SCS uses, has a 10-minute timeout for staging the backing application of the SI, so if staging takes more than 10 minutes, SCAB times out. This is why the error message in the errand logs is uninformative: the status is 'failed', but the message still reads 'update service instance started'. There is no error other than the timeout.
Version 3.2.7 contains the fix for the timeout issue. It uses the "service instance timeout" setting in the tile configuration, which is 30 minutes by default, rather than the 10-minute timeout. 30 minutes should be more than enough to stage the backing app.
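
To check whether failures in a saved errand log match this pattern, the following is a minimal Python sketch. It assumes the log lines have the same layout as the excerpt above. The function names parse_failures and looks_like_timeout are hypothetical, and the "failed status with a message that still ends in 'started'" check is only a heuristic derived from the excerpt, not an official diagnostic.

```python
import re
import sys

# First line of each failure record, as it appears in the errand log excerpt.
FAILURE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+ERROR\b.*"
    r"Failed to upgrade service instance with id '(?P<id>[^']+)'"
)
# The two follow-up lines that report the status before and after the upgrade.
STATUS_RE = re.compile(
    r"^The status (?P<when>before|after) upgrade was "
    r"'(?P<status>[^']*)' with message '(?P<message>[^']*)'"
)


def parse_failures(log_text):
    """Return one dict per failed instance found in the errand log text."""
    failures = []
    current = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = FAILURE_RE.match(line)
        if match:
            current = {"timestamp": match.group("timestamp"), "id": match.group("id")}
            failures.append(current)
            continue
        match = STATUS_RE.match(line)
        if match and current is not None:
            # Store (status, message) under the key "before" or "after".
            current[match.group("when")] = (match.group("status"), match.group("message"))
    return failures


def looks_like_timeout(failure):
    """Heuristic (assumption): 'failed' status whose message still says '... started'."""
    after = failure.get("after")
    return after is not None and after[0] == "failed" and after[1].endswith("started")


if __name__ == "__main__":
    # Usage: python check_errand_log.py < errand.log
    for failure in parse_failures(sys.stdin.read()):
        label = "possible staging timeout" if looks_like_timeout(failure) else "other failure"
        print(failure["timestamp"], failure["id"], label)
```

Instances flagged as a possible staging timeout are candidates for the cause described above; any other failure should be investigated separately.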