Coordinator stops responding - OutOfMemoryError

Article ID: 133637

Products

CA Application Test

Issue/Introduction

I am running into intermittent instances where CVS tests stop running. This has happened daily over the last 3 days. There have been no environmental changes.

I have traced this back to the fact that the Coordinator appears to have stopped responding.

The coordinator count in the Portal/Server Health window shows 0 coordinators.

The coordinator java process IS still running.

The coordinator log is showing OutOfMemoryErrors.

I will upload the coordinator logs.



Cause

 SEVERE: Could not accept connection : java.net.SocketException: Connection reset

2019-06-14 06:37:56,818Z (02:37) [Event Sink Thread Pool Thread 5] INFO com.itko.lisa.stats.MetricControllerImpl - Error retrieving metric

java.lang.IllegalStateException: Could not put anything new on the event queue

2019-06-14 14:54:14,447Z (10:54) [amq dbwriter #735 for queue reporting_735 report USPS_DB

2019-06-14 05:52:43,724Z (01:52) [Event Sink Thread Pool Thread 2] INFO com.itko.lisa.stats.MetricControllerImpl - Error retrieving metric
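To gauge how often these messages occur, the coordinator log can be scanned for the signatures quoted above. The following is a minimal sketch, not a DevTest utility: the function name and the list of signatures are illustrative, the file name in the usage comment is an assumption, and the strings are taken from the excerpts in this article (plus the OutOfMemoryError mentioned in the issue description).

```python
from collections import Counter

# Substrings taken from the log excerpts in this article; extend as needed.
SIGNATURES = (
    "OutOfMemoryError",
    "Could not put anything new on the event queue",
    "Error retrieving metric",
    "Could not accept connection",
)


def count_signatures(lines):
    """Count how many log lines contain each known error signature."""
    counts = Counter()
    for line in lines:
        for signature in SIGNATURES:
            if signature in line:
                counts[signature] += 1
    return counts


# Usage (the log file name is an assumption; use your coordinator log path):
#
#   with open("coordinator.log", encoding="utf-8", errors="replace") as fh:
#       for signature, hits in count_signatures(fh).most_common():
#           print(hits, signature)
```

A steadily rising count of "Could not put anything new on the event queue" before the first OutOfMemoryError would be consistent with the overload described in the Resolution.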

Environment

Release : 10.1

Component : CA Application Test

Resolution

Add these three properties to local.properties on the DevTest 10.3.0 Coordinator machine:

lisa.eventPool.maxQueueSize=131070

lisa.pathfinder.on=false

lisa.jdbc.pool.maxPoolSize=25

The default for lisa.jdbc.pool.maxPoolSize is 10. Restart the Registry, Coordinator, and Simulator to pick up the new properties.
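If local.properties already contains some of these keys, editing by hand can leave duplicate entries. The sketch below shows one way to merge the values into existing file content. It is an illustration only: the helper name merge_properties is hypothetical, and it handles only simple key=value lines (not the ":" separator or line continuations that Java properties files also allow).

```python
# Values from the Resolution section of this article.
OVERRIDES = {
    "lisa.eventPool.maxQueueSize": "131070",
    "lisa.pathfinder.on": "false",
    "lisa.jdbc.pool.maxPoolSize": "25",
}


def merge_properties(text, overrides):
    """Return properties text in which each override key is set exactly once."""
    seen = set()
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        key = None
        if "=" in stripped and not is_comment:
            key = stripped.split("=", 1)[0].strip()
        if key in overrides:
            if key not in seen:
                # Replace the first occurrence in place.
                out.append(f"{key}={overrides[key]}")
                seen.add(key)
            # Later duplicates of an overridden key are dropped.
        else:
            out.append(line)
    # Append any override that was not already present in the file.
    out.extend(f"{k}={v}" for k, v in overrides.items() if k not in seen)
    return "\n".join(out) + "\n"
```

Back up the original file before writing the merged text, and restart the components afterwards as described above.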

The timeout issue is not really a bug; it is an indication that the system is overloaded.

When using connection pooling for load tests (multi-VUs), you may need to increase the lisa.jdbc.pool.maxPoolSize property so that the pool does not run out of connections (starvation).
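As a rough illustration of starvation, the sketch below models a pool as a counting semaphore and asks how many virtual users find no free connection when all of them need one at the same moment. It is a simplified model under stated assumptions, not DevTest's pool implementation: the function name and the figure of 20 virtual users are hypothetical, and real requests would wait and then time out rather than fail immediately.

```python
import threading


def count_starved(virtual_users, max_pool_size):
    """Return how many users get no connection when all hold one at once."""
    pool = threading.BoundedSemaphore(max_pool_size)
    starved = 0
    for _ in range(virtual_users):
        # Non-blocking acquire: False means the pool is already exhausted.
        if not pool.acquire(blocking=False):
            starved += 1
    return starved


# 20 hypothetical virtual users against the default and the raised pool size.
print(count_starved(20, 10))  # 10
print(count_starved(20, 25))  # 0
```

In this model, the pool size needs to be at least the number of virtual users that hold a connection concurrently.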

Increasing lisa.eventPool.maxQueueSize will not fix the underlying problem, but it gives the system more resources so that the timeout errors are delayed.
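The reason a larger queue only delays the errors can be seen with a simple model: if events arrive faster than they are consumed, any bounded queue eventually fills. The sketch below is an assumption-laden illustration, not a description of DevTest internals. The function name, the per-tick rates, and the smaller comparison capacity of 1000 are hypothetical; only 131070 comes from this article.

```python
def ticks_until_overflow(capacity, produced_per_tick, consumed_per_tick):
    """Return the first tick at which queue depth exceeds capacity.

    Returns None when consumers keep up and the queue never overflows.
    """
    if produced_per_tick <= consumed_per_tick:
        return None
    depth = 0
    tick = 0
    while depth <= capacity:
        tick += 1
        depth += produced_per_tick - consumed_per_tick
    return tick


# Hypothetical load: 150 events produced and 100 consumed per tick.
print(ticks_until_overflow(1000, 150, 100))    # 21
print(ticks_until_overflow(131070, 150, 100))  # 2622
print(ticks_until_overflow(131070, 100, 100))  # None
```

The larger queue overflows later but still overflows; only reducing the load or increasing the consumption rate removes the error.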

Also, check that there is sufficient space in the Registry database and that there are no connection problems when accessing it.