Spring Cloud Services operation fails due to: I/O error on POST request for "https://api....": failed to respond; nested exception is org.apache.http.NoHttpResponseException

Article ID: 297096


Products

Support Only for Spring

Issue/Introduction

This article applies to Spring Cloud Services 1.3, 1.4, and 1.5.

Resolution

For example, consider updating a Spring Cloud Services instance (such as a config server instance):

cf update-service config-server -c '{"git":{"uri": "https://xxxxx"}}'


The update fails after a few minutes, and the Spring Cloud Services Broker Worker reports a network I/O error for the /copy_bits operation against the Pivotal Cloud Foundry (PCF) API endpoint (https://api.SYSTEM_DOMAIN):

2018-04-24T01:32:15.21+0200 [APP/PROC/WEB/0] OUT 2018-04-23 23:32:15.209 ERROR [spring-cloud-service-broker-worker,
dede2c16d0d80aec,dede2c16d0d80aec,false] 15 --- [cTaskExecutor-2] i.p.s.s.messaging.RequestHandler : Error updating 
service instance: org.springframework.web.client.ResourceAccessException: I/O error on POST request for 
"https://api.SYSTEM_DOMAIN/v2/apps/6fbb83d8-36ad-46cb-a8e9-47d291553c9b/copy_bits": api.SYSTEM_DOMAIN:443 
failed to respond; nested exception is org.apache.http.NoHttpResponseException: 
api.SYSTEM_DOMAIN:443 failed to respond
2018-04-24T01:32:15.21+0200 [APP/PROC/WEB/0] OUT org.springframework.web.client.ResourceAccessException: 
I/O error on POST request for "https://api.SYSTEM_DOMAIN/v2/apps/6fbb83d8-36ad-46cb-a8e9-47d291553c9b/copy_bits": 
api.SYSTEM_DOMAIN:443 failed to respond; nested exception is org.apache.http.NoHttpResponseException: 
api.SYSTEM_DOMAIN:443 failed to respond


This issue occurs in environments where the customer's load balancer in front of TAS is configured with an aggressive HTTP keep-alive timeout, such as a few seconds. In such environments, when the SCS Broker or Worker sends a request to the PCF / TAS API endpoint over a connection taken from its connection pool, there is a high chance that the load balancer has already closed or reset that idle connection because of the short timeout.
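The timing condition described above can be sketched as a small model. This is only an illustration: the class and method names are hypothetical and are not part of Spring Cloud Services or Apache HttpClient, and the durations are example values.

```java
import java.time.Duration;

// Hypothetical model for illustration only; not part of Spring Cloud Services
// or Apache HttpClient.
final class IdleConnectionCheck {

    // Returns true when a pooled connection has been idle at least as long as
    // the load balancer's keep-alive timeout, meaning the load balancer has
    // probably closed it and the next request on it is likely to fail.
    static boolean likelyClosedByLoadBalancer(Duration idleTime, Duration lbKeepAlive) {
        return idleTime.compareTo(lbKeepAlive) >= 0;
    }

    public static void main(String[] args) {
        Duration idle = Duration.ofSeconds(5); // example: connection sat idle in the pool for 5 s

        // With a 2 s keep-alive the connection is probably already gone.
        System.out.println(likelyClosedByLoadBalancer(idle, Duration.ofSeconds(2)));

        // With a 10 s keep-alive the same connection is still usable.
        System.out.println(likelyClosedByLoadBalancer(idle, Duration.ofSeconds(10)));
    }
}
```

The shorter the keep-alive timeout relative to how long connections sit idle in the pool, the more often a request is sent on a connection that the load balancer has already dropped.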


Usually, the SCS Broker Worker retries a failed request immediately, either on a new connection or on another connection from the pool, and the retry succeeds. However, the /copy_bits API request is not retried, because this particular endpoint generates heavy load on the Cloud Controller. In environments with an aggressive HTTP keep-alive timeout, a single failed /copy_bits request therefore surfaces as the exception shown above, and cf update-service fails.
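The retry behavior described above can be illustrated with a minimal sketch. It is not the actual SCS Broker Worker code: the class RetrySketch, its methods, and the attempt count are assumptions made for illustration, and the only detail taken from this article is that /copy_bits requests are not retried.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch; the real SCS Broker Worker implementation is not shown
// in this article and may differ.
final class RetrySketch {

    // Assumption for illustration: a request is retryable unless it targets
    // the /copy_bits endpoint, which is expensive for the Cloud Controller.
    static boolean isRetryable(String path) {
        return !path.endsWith("/copy_bits");
    }

    // Runs the request, retrying on I/O errors up to maxAttempts times for
    // retryable paths and exactly once for non-retryable paths.
    static <T> T execute(String path, int maxAttempts, Callable<T> request) throws Exception {
        int attempts = isRetryable(path) ? Math.max(1, maxAttempts) : 1;
        IOException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return request.call();
            } catch (IOException e) {
                last = e; // e.g. a stale pooled connection that failed to respond
            }
        }
        throw last; // every permitted attempt failed
    }
}
```

Under this sketch, a request that hits a stale connection on its first attempt still succeeds on the second attempt, whereas the same failure on a path ending in /copy_bits is propagated to the caller immediately.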


Increasing the HTTP keep-alive timeout on the load balancer from 1 or 2 seconds to 10 seconds can significantly reduce these connection pool I/O errors.

Additional Information:
The problem can affect any client application that uses a connection pool. Applications that do not use a connection pool are not impacted. Here, http-keep-alive refers to how long a connection may remain open but idle between a response and the next request.