When scaling a Spring Boot application deployed on Cloud Foundry / TAS, new instances intermittently fail to start because the health check fails. The following error messages are returned:
2019-05-13T10:31:58.80-0500 [HEALTH/1] ERR Failed to make HTTP request to '/' on port 8080: connection refused
2019-05-13T10:31:58.80-0500 [CELL/1] ERR Timed out after 3m0s: health check never passed.
2019-05-13T10:31:58.81-0500 [CELL/SSHD/1] OUT Exit status 0
2019-05-13T10:31:59.47-0500 [APP/PROC/WEB/1] OUT Exit status 143
2019-05-13T10:32:00.45-0500 [HEALTH/3] ERR Failed to make HTTP request to '/' on port 8080: connection refused
2019-05-13T10:32:00.45-0500 [CELL/3] ERR Timed out after 3m0s: health check never passed.
The config-server instance reaches out to the upstream Git repository each time it handles a request. If the service instance is used by a large number of apps, the response time for each request increases.
As response times increase, the config-server logs display entries like the following (note the response_time field, reported in seconds):
# Capture the logs from the backend config-server app located in ORG p-spring-cloud-services, SPACE instances, APP config-<SERVICE-GUID>
$ grep response config-5ec08796-d1f1-42bc-8919-784750828920.txt
2019-05-14T11:28:55.94-0500 [RTR/5] OUT config-5ec08796-d1f1-42bc-8919-784750828920.apps.SYSTEM-DOMAIN - [2019-05-14T16:28:10.383+0000] "GET /user-service/events,prod,cloud HTTP/1.1" 200 0 29418 "-" "Java/1.8.0_181" "10.242.8.5:36832" "10.242.100.70:61026" x_forwarded_for:"10.242.100.43, 10.242.8.5" x_forwarded_proto:"https" vcap_request_id:"56450963-55e6-4c71-5886-c3cdfe1726c4" response_time:45.56055492 app_id:"cf94c0d2-92ba-4bda-8c69-7b5dea348229" app_index:"0" x_b3_traceid:"951910efd4851046" x_b3_spanid:"951910efd4851046" x_b3_parentspanid:"-"
2019-05-14T11:28:57.06-0500 [RTR/4] OUT config-5ec08796-d1f1-42bc-8919-784750828920.apps.SYSTEM-DOMAIN - [2019-05-14T16:28:09.717+0000] "GET /inbound-email-service/events,prod,cloud HTTP/1.1" 200 0 29249 "-" "Java/1.8.0_181" "10.242.8.4:37411" "10.242.100.70:61026" x_forwarded_for:"10.242.100.50, 10.242.8.4" x_forwarded_proto:"https" vcap_request_id:"32f5716c-3aaa-4856-7209-3148c4d7e19d" response_time:47.346539979 app_id:"cf94c0d2-92ba-4bda-8c69-7b5dea348229" app_index:"0" x_b3_traceid:"20fc410019567f08" x_b3_spanid:"20fc410019567f08" x_b3_parentspanid:"-"
2019-05-14T11:28:58.19-0500 [RTR/0] OUT config-5ec08796-d1f1-42bc-8919-784750828920.apps.SYSTEM-DOMAIN - [2019-05-14T16:28:08.764+0000] "GET /doc-metadata-service/events,prod,cloud HTTP/1.1" 200 0 29488 "-" "Java/1.8.0_181" "10.242.8.6:47122" "10.242.100.70:61026" x_forwarded_for:"10.242.100.67, 10.242.8.6" x_forwarded_proto:"https" vcap_request_id:"a207659c-519f-4198-5042-c4a30feb91b3" response_time:49.433517341 app_id:"cf94c0d2-92ba-4bda-8c69-7b5dea348229" app_index:"0" x_b3_traceid:"6141f74968a66190" x_b3_spanid:"6141f74968a66190" x_b3_parentspanid:"-"
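To spot the slow requests quickly in a large capture, the response_time values can be pulled out of the router entries and sorted. The following is a minimal sketch, not part of the product tooling: the function name slow_requests and the default 10-second threshold are assumptions made for illustration, and it relies only on the log layout shown above.

```shell
# Hypothetical helper: print "<response_time> <request path>" for every
# router log entry slower than a threshold (in seconds), slowest first.
# Usage: slow_requests <log-file> [threshold-seconds]
slow_requests() {
  awk -v limit="${2:-10}" '
    /response_time:/ {
      rt = ""; path = ""
      for (i = 1; i <= NF; i++) {
        # The field looks like response_time:45.56055492
        if ($i ~ /^response_time:/) { rt = $i; sub(/^response_time:/, "", rt) }
        # The request line starts with a quoted HTTP method, e.g. "GET
        if ($i ~ /^"[A-Z]+$/) { path = $(i + 1) }
      }
      if (rt + 0 > limit + 0) print rt, path
    }
  ' "$1" | LC_ALL=C sort -rn
}
```

For example, slow_requests config-5ec08796-d1f1-42bc-8919-784750828920.txt 10 would list only the requests that took longer than 10 seconds, with the slowest at the top.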
Eventually, requests to the config-server time out and the health check fails. Each time a client requests configuration, the config-server reaches out to the Git repository and updates its repository clone.
In Spring Cloud Services v2.0.x, the refreshRate setting specifies the interval (in seconds) between refreshes of the config-server repository clone. The default value is 0, which means the clone is refreshed on every request. Refer to Configuring with Git for more information.
Define a config-server service instance with a refreshRate value greater than zero and bind it to the app.
cf create-service p-config-server standard config-server -c '{"git": { "uri": "https://github.com/myorg/config-repo", "refreshRate": 60 } }'
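Once the service instance exists, it still has to be bound to the app, and the app restaged so that it picks up the binding. A minimal sketch of that step follows, assuming the instance is named config-server as in the command above; the function name bind_config_server and the app name my-app are hypothetical placeholders.

```shell
# Hypothetical helper: bind an app to the config-server service instance
# and restage it so the new binding takes effect.
# Usage: bind_config_server <app-name> [service-instance-name]
bind_config_server() {
  app_name="$1"
  service_name="${2:-config-server}"
  # Bind the service instance to the app; stop if the bind fails.
  cf bind-service "$app_name" "$service_name" || return 1
  # Restage so the app starts with the updated service binding.
  cf restage "$app_name"
}
```

For example, bind_config_server my-app binds my-app to the config-server instance and restages it.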
Note: If you are running Spring Cloud Services v1.5.x, you must first upgrade to v2.0.x, then recreate the service instance and bind it to the application.