This article helps you resolve issues involving connection timeouts or thread exhaustion.
Such issues have many causes; this article covers the ones most commonly seen with VMware GemFire. Customers often open tickets reporting symptoms such as increased client timeouts or thread exhaustion on the server side.
These symptoms can often be resolved by tuning the configuration of both VMware GemFire and your network.
One possible symptom is the thread exhaustion log message:
[warning 2021/03/22 13:52:36.302 EDT xxx <Handshaker 0.0.0.0/0.0.0.0:40404 Thread 1> tid=0x53] Rejected connection from Server connection from [client host address=xxx; client port=xxx] because incoming request was rejected by pool possibly due to thread exhaustion

You may also see examples such as these in your logs:

[info 2021/03/22 13:47:58.751 EDT xxx<disconnect thread for xxx(xxx:xxx)<v6>:41000> tid=0x741] Timed out waiting for readerThread on xxx(xxx:xxx)<v6>:41000@1418(GEODE 1.8.0) to finish.

[warning 2021/03/22 13:47:58.782 EDT cache5 <ClientHealthMonitor Thread> tid=0x64] Server connection from [identity(xxx(15:loner):52392:xxx,connection=1; port=34810] is being terminated because its client timeout of 3000 has expired.
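To check how often these messages occur in a server log, a small scan like the following may help. This is only a sketch: the class and method names are hypothetical, and the match strings are taken from the example messages above, so they may need adjusting for other product versions.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogSymptomScan {
    // Phrase copied from the example "Rejected connection" warning.
    private static final String EXHAUSTION = "possibly due to thread exhaustion";

    // Captures the timeout value from the example ClientHealthMonitor warning.
    private static final Pattern CLIENT_TIMEOUT =
            Pattern.compile("client timeout of (\\d+) has expired");

    /** Counts the lines that report a rejected connection due to thread exhaustion. */
    static long countThreadExhaustion(List<String> lines) {
        return lines.stream().filter(line -> line.contains(EXHAUSTION)).count();
    }

    /** Returns the expired timeout in ms, or -1 if the line is not a client-timeout warning. */
    static int expiredClientTimeoutMs(String line) {
        Matcher m = CLIENT_TIMEOUT.matcher(line);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }
}
```

The lines can be read with `java.nio.file.Files.readAllLines` and passed to `countThreadExhaustion`.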
The client timeout in the last message above (3000 ms in this example) is the client-side pool read-timeout value. Increasing it may be sufficient in some cases. The default is 10,000 ms (10 seconds).
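As a rough illustration of what a read timeout does, the following Java sketch uses plain JDK sockets rather than the GemFire client API. The class and method names are hypothetical, and the socket's SO_TIMEOUT only stands in for the pool read-timeout: the client waits for a reply that never arrives and gives up once the timeout expires.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class ReadTimeoutSketch {

    /**
     * Connects to a local listener that never replies and waits for one byte.
     * Returns true if the read gave up because readTimeoutMs expired.
     */
    static boolean readTimesOut(int readTimeoutMs) {
        // Port 0 lets the OS pick a free port; the listener never sends any data.
        try (ServerSocket silentServer = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            // Assumption: SO_TIMEOUT plays the role of the pool read-timeout here.
            client.setSoTimeout(readTimeoutMs);
            try {
                client.getInputStream().read(); // blocks until data arrives or the timeout expires
                return false;
            } catch (SocketTimeoutException expired) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A larger value gives a busy server more time to answer before the client gives up, at the cost of slower detection of a server that is genuinely unresponsive.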
However, when the system is overwhelmed by concurrent connection attempts, you may need to change more of your configuration.
These symptoms can be caused by an insufficient accept queue at the TCP layer, and they typically appear when that is combined with a burst of client-to-server connection activity.
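To make the accept queue more concrete, here is a minimal Java sketch using plain JDK sockets, not GemFire itself. The class and method names are hypothetical. It assumes Linux behavior, where the kernel caps the backlog a listener requests at net.core.somaxconn, and it shows that connections complete and wait in the queue even when the server has not yet called accept().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

final class AcceptQueueSketch {

    /** On Linux, the backlog a listener asks for is capped at net.core.somaxconn. */
    static int effectiveBacklog(int requestedBacklog, int somaxconn) {
        return Math.min(requestedBacklog, somaxconn);
    }

    /**
     * Opens a listener that never calls accept() and returns how many of
     * the requested client connections complete within connectTimeoutMs each.
     */
    static int connectWithoutAccept(int backlog, int clients, int connectTimeoutMs) {
        List<Socket> open = new ArrayList<>();
        try (ServerSocket listener = new ServerSocket(0, backlog, InetAddress.getLoopbackAddress())) {
            InetSocketAddress target =
                    new InetSocketAddress(listener.getInetAddress(), listener.getLocalPort());
            for (int i = 0; i < clients; i++) {
                Socket s = new Socket();
                try {
                    s.connect(target, connectTimeoutMs); // waits in the accept queue
                    open.add(s);
                } catch (IOException refusedOrTimedOut) {
                    s.close(); // queue full or connect timeout; not counted
                }
            }
            return open.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (Socket s : open) {
                try { s.close(); } catch (IOException ignored) { }
            }
        }
    }
}
```

When a burst of clients exceeds the effective backlog, further connection attempts are delayed or refused until the server drains the queue; the exact behavior depends on the operating system.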
Follow the checklist included above, but to summarize: