Troubleshooting eviction of services in Spring Cloud Service - Service Registry

Article ID: 297181

Products

Support Only for Spring

Issue/Introduction

This article provides troubleshooting steps and tuning tips for unexpected eviction of services when using Spring Cloud Service - Service Registry.

Environment

Product Version: 3.1

Resolution

Check the Service Registry logs for peer replication messages such as the following:
  Network level connection to peer
  Not re-trying this exception because it does not seem to be a network exception
  It seems to be a socket read timeout exception, it will retry later. if it continues to happen and some eureka node occupied all the cpu time, you should set property 'eureka.server.peer-node-read-timeout-ms' to a bigger value
  Batch update failure with HTTP status code
  Server busy (503) HTTP status code received from the peer
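
One way to see how often these messages occur is to save recent logs to a file and count the matches for each message. The sketch below is a minimal example, assuming the logs were saved to a local plain-text file (for instance with `cf logs service-registry --recent > registry.log` while targeting the service instance's space). The function name `count_replication_messages` and the file name `registry.log` are hypothetical.

```shell
#!/bin/sh
# Count occurrences of the peer replication messages in a saved log file.
# Assumption: the file holds plain-text Service Registry logs, one entry per line.
count_replication_messages() {
    log_file="$1"
    # Fixed-string fragments of the messages listed above.
    for pattern in \
        "Network level connection to peer" \
        "Not re-trying this exception" \
        "socket read timeout exception" \
        "Batch update failure with HTTP status code" \
        "Server busy (503) HTTP status code received from the peer"
    do
        # grep -c prints the number of matching lines; -F disables regex parsing.
        # "|| true" keeps the function going when a pattern has zero matches.
        count=$(grep -c -F -- "$pattern" "$log_file" || true)
        printf '%s\t%s\n' "$count" "$pattern"
    done
}
```

Running `count_replication_messages registry.log` prints one tab-separated line per message: the count, then the message fragment. A consistently high count for the socket read timeout or 503 messages suggests looking at the load and tuning points that follow.
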
  • The Service Registry might be overloaded and performing poorly. The recommended maximum number of services per SCS Service Registry instance is 250.
  • Tune the following parameters to suit the behaviour of your network and environment:
    • eureka.server.peer-node-read-timeout-ms: set to 200 ms by default. This parameter controls how long a peer waits for other peers to respond.
    • eureka.server.eviction-interval-timer-in-ms: set to 60000 ms by default. This parameter controls how often the Eureka server runs the job that evicts expired clients.
    For example, to change the eviction interval on the backing application:
    cf target -o p-spring-cloud-services -s <service-registry-si-guid>
    cf set-env service-registry EUREKA_SERVER_EVICTION_INTERVAL_TIMER_IN_MS <value>
    cf restage service-registry
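
The same three commands could be used for the peer read timeout. The sketch below only prints the commands (a dry run) so they can be reviewed before anything is run against Cloud Foundry. It assumes the environment variable name follows the pattern of the eviction interval example above (dots and dashes replaced with underscores, then upper-cased). The helper names `property_to_env_name` and `print_tuning_commands` are hypothetical, and the value 1000 in the usage note is purely illustrative, not a recommendation.

```shell
#!/bin/sh
# Derive an environment variable name from a Eureka property name.
# Assumption: same mapping as EUREKA_SERVER_EVICTION_INTERVAL_TIMER_IN_MS above.
property_to_env_name() {
    printf '%s' "$1" | tr '.-' '__' | tr '[:lower:]' '[:upper:]'
}

# Print (do not run) the cf commands that apply one setting.
#   $1 = service instance GUID (used as the space name, as in the commands above)
#   $2 = Eureka property name
#   $3 = value in milliseconds
print_tuning_commands() {
    si_guid="$1"
    property="$2"
    value="$3"
    env_name=$(property_to_env_name "$property")
    printf 'cf target -o p-spring-cloud-services -s %s\n' "$si_guid"
    printf 'cf set-env service-registry %s %s\n' "$env_name" "$value"
    printf 'cf restage service-registry\n'
}
```

For example, `print_tuning_commands <service-registry-si-guid> eureka.server.peer-node-read-timeout-ms 1000` prints a `cf set-env` line for `EUREKA_SERVER_PEER_NODE_READ_TIMEOUT_MS`. Confirm that the variable is picked up after the restage before relying on it.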

  • Set the log level of the application backing the SCS service instance to debug, using the cf CLI or Apps Manager, to gather more information for investigating service eviction. Useful information in the debug logs includes:
    • Socket connection logs
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] o.a.h.c.ssl.SSLConnectionSocketFactory   : Secure session established
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] o.a.h.c.ssl.SSLConnectionSocketFactory   :  negotiated protocol: TLSv1.2
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] o.a.h.c.ssl.SSLConnectionSocketFactory   :  negotiated cipher suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
      
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] o.a.h.c.ssl.SSLConnectionSocketFactory   :  peer principal: CN=XXXX, O="XXX", L=XX, ST=XX, C=XX
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] o.a.h.c.ssl.SSLConnectionSocketFactory   :  issuer principal: CN=XXX, OU=www.digicert.com, O=DigiCert Inc, C=US
    • Applications that are on the replicationList and their status. You can check whether heartbeats from services were received.
      APP/PROC/WEB/0	2023-06-20 13:26:00.279 DEBUG 17 --- [s.xxxx.xxx-3] org.apache.http.wire                     :  >> "{"replicationList":[{"appName":"APP03","id":"app03.XXXXX:1b114841-2d6c-474b-7c01-f66a","lastDirtyTimestamp":1686843728305,"status":"UP","action":"Heartbeat"},{"appName":"APP01","id":"app01.XXXXX:6cc26a62-fb71-4b90-556a-618b","lastDirtyTimestamp":1686821927621,"status":"UP","action":"Heartbeat"},{"appName":"APP02","id":"app02.apps.pkmes.posco.co.kr:16145e3e-226f-4e6e-77ff-82f9","lastDirtyTimestamp":1686616871480,"status":"UP","action":"Heartbeat"},{"appName":"APP04","id":"app04.XXXX:54b26e4e-543b-435a-59a5-d155","lastDirtyTimestamp":1687122092038,"status":"UP","action":"Heartbeat"}]}"

       
    • HTTP requests and responses. The sample below shows a peer replication request and a successful response from the peer.
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] o.a.h.impl.conn.DefaultClientConnection  : Sending request: POST /eureka/peerreplication/batch/ HTTP/1.1
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "POST /eureka/peerreplication/batch/ HTTP/1.1[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "Accept: application/json[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "Content-Type: application/json[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "Authorization: Bearer XXXX[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "DiscoveryIdentity-Name: DefaultServer[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "DiscoveryIdentity-Version: 1.0[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "DiscoveryIdentity-Id: 10.255.141.116[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "Accept-Encoding: gzip[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "Transfer-Encoding: chunked[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "Host: service-registry-989def60-9791-47c2-af2d-59408b5b855a.apps.xx[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "Connection: Keep-Alive[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xxr-12] org.apache.http.wire                     :  >> "User-Agent: Java-EurekaClient-Replication/v1.10.11[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "X-dynaTrace: FW4;585699423;1;462174906;19388621;0;1883469856;294;3cef;2h01;3h1b8c3aba;4h0127d8cd;5h01[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xx-12] org.apache.http.wire                     :  >> "traceparent: 00-1c1f21d55c484bb216ba896b499514c9-c441a5b9456dae70-01[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.424 DEBUG 24 --- [.xxr-12] org.apache.http.wire                     :  >> "tracestate: 70437820-22e9105f@dt=fw4;1;1b8c3aba;127d8cd;0;0;0;126;3cef;2h01;3h1b8c3aba;4h0127d8cd;5h01[\r][\n]"
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] o.a.h.impl.conn.DefaultClientConnection  : Receiving response: HTTP/1.1 200 OK
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << HTTP/1.1 200 OK
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << Cache-Control: no-cache, no-store, max-age=0, must-revalidate
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << Content-Length: 37
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << Content-Type: application/json
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << Date: Tue, 20 Jun 2023 04:25:59 GMT
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << Expires: 0
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << Pragma: no-cache
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xxr-1] org.apache.http.headers                  : << Strict-Transport-Security: max-age=31536000 ; includeSubDomains
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xxr-1] org.apache.http.headers                  : << X-Content-Type-Options: nosniff
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << X-Frame-Options: DENY
      APP/PROC/WEB/2	2023-06-20 13:26:00.426 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << X-Vcap-Request-Id: 02a46903-5ad7-4f50-565e-419a0dd5aef9
      APP/PROC/WEB/2	2023-06-20 13:26:00.427 DEBUG 24 --- [xx-1] org.apache.http.headers                  : << X-Xss-Protection: 1; mode=block
      APP/PROC/WEB/2	2023-06-20 13:26:00.427 DEBUG 24 --- [xx-1] o.a.http.impl.client.DefaultHttpClient   : Connection can be kept alive indefinitely
      APP/PROC/WEB/2	2023-06-20 13:26:00.427 DEBUG 24 --- [xx-1] org.apache.http.wire                     :  << "{"responseList":[{"statusCode":200}]}"
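
Debug logs like the samples above can be summarised with standard text tools. The sketch below is a minimal example, assuming the debug logs were saved to a local file and that each wire payload sits on a single line with flat JSON objects, as in the samples. The function names `summarize_replication_entries` and `count_batch_responses` are hypothetical.

```shell
#!/bin/sh
# Summarise replicationList entries: count per (appName, status, action).
# Assumption: each entry is a flat JSON object starting with "appName".
summarize_replication_entries() {
    grep -o '{"appName":"[^}]*}' "$1" |
        sed -n 's/.*"appName":"\([^"]*\)".*"status":"\([^"]*\)".*"action":"\([^"]*\)".*/\1 \2 \3/p' |
        awk '{ n[$0]++ } END { for (k in n) print n[k], k }' |
        LC_ALL=C sort -k2
}

# Count the status codes returned in peer replication batch responses.
count_batch_responses() {
    grep -o '"statusCode":[0-9]*' "$1" |
        awk -F: '{ n[$2]++ } END { for (c in n) print c, n[c] }' |
        LC_ALL=C sort
}
```

`summarize_replication_entries debug.log` prints one line per application with the number of entries seen, which makes it easy to spot a service whose heartbeats are missing from the replication traffic. `count_batch_responses debug.log` prints each status code with its count, so non-200 codes stand out.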