Intermittent UMP timeout/failure issues after upgrade

Article ID: 198554

Products

NIMSOFT PROBES DX Infrastructure Management

Issue/Introduction

Since upgrading UIM from v8.51 to 9.20 and then to 20.10, we have seen intermittent UMP failures several times a day, with multiple complaints from our Operations team and from clients with web portal access. The page fails to load, and recovery often requires restarting the Nimsoft robot on the web server.

UIM v 9.x or higher, 20.10
Hub v 9.30
Robot v 9.30
wasp probe v9.20/20.10

Cause

- Backend database issue: insufficient SQL Server memory resources

Environment

Release : 20.1

Component : UIM - UMP

Resolution

Based on the logs and results,

SQL state [S0001]; error code [8645]; A timeout occurred while waiting for memory resources to execute the query in resource pool 'default' (2). Rerun the query.; nested exception is com.microsoft.sqlserver.jdbc.SQLServerException: A timeout occurred while waiting for memory resources to execute the query in resource pool 'default' (2). Rerun the query.

it appears that a long-running query or operation on the backend database (for example, a backup) exhausted the available memory, so other queries timed out while waiting for memory.
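To see whether backups line up with the times UMP failed, the backup history can be listed from msdb. This is a minimal sketch, assuming backup history is still retained in `msdb.dbo.backupset`; the 7-day window is an arbitrary choice and should be adjusted to cover the failure times.

```sql
-- Backups started in the last 7 days, newest first.
-- The 7-day window is an assumption; widen or narrow it as needed.
SELECT database_name,
       type AS backup_type,          -- D = full, I = differential, L = log
       backup_start_date,
       backup_finish_date,
       DATEDIFF(MINUTE, backup_start_date, backup_finish_date) AS duration_min
FROM msdb.dbo.backupset
WHERE backup_start_date >= DATEADD(DAY, -7, GETDATE())
ORDER BY backup_start_date DESC;
```

If the start and finish times bracket the UMP failures, the backup window is a likely contributor.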

UIM DB: MS SQL Server 2014 Enterprise 64-bit

Please have your DBA check for memory pressure and adjust/increase memory dedicated to SQL Server if necessary.
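As a starting point for that check, the following sketch shows the configured memory limits next to the current operating system and SQL Server process memory state. It uses standard catalog views and DMVs, requires VIEW SERVER STATE permission, and is not a complete memory-pressure diagnosis.

```sql
-- Configured memory limits for the instance.
SELECT name, value_in_use
FROM sys.configurations
WHERE name IN ('min server memory (MB)', 'max server memory (MB)');

-- Operating system view: total and available physical memory.
SELECT total_physical_memory_kb / 1024     AS total_physical_mb,
       available_physical_memory_kb / 1024 AS available_physical_mb,
       system_memory_state_desc
FROM sys.dm_os_sys_memory;

-- SQL Server process view: memory in use and low-memory flags.
SELECT physical_memory_in_use_kb / 1024 AS sql_physical_in_use_mb,
       memory_utilization_percentage,
       process_physical_memory_low,     -- 1 = process is responding to low physical memory
       process_virtual_memory_low
FROM sys.dm_os_process_memory;
```

A `max server memory (MB)` value that is low relative to the server's physical memory, or a low-memory flag set to 1, would be worth raising with the DBA.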

SQL Server Profiler can help identify which queries are consuming the most resources at the times when you see intermittent connectivity, failures, or slow responses in UMP.
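Error 8645 is raised when a query times out waiting for a memory grant, so it can also help to look at memory grants directly. The sketch below lists current memory grant requests with their query text. It only shows requests that are active at the moment it runs, so it needs to be run while the problem is occurring.

```sql
-- Current memory grant requests, largest first.
-- Rows with a NULL grant_time are still waiting for memory.
SELECT mg.session_id,
       mg.request_time,
       mg.grant_time,
       mg.requested_memory_kb,
       mg.granted_memory_kb,
       mg.wait_time_ms,
       t.text AS query_text
FROM sys.dm_exec_query_memory_grants AS mg
CROSS APPLY sys.dm_exec_sql_text(mg.sql_handle) AS t
ORDER BY mg.requested_memory_kb DESC;
```

A session holding a very large grant while other sessions wait with a NULL `grant_time` points to the query that is starving the rest.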

Also check the MS SQL Server transaction logs for the period when UMP was responding intermittently.

References:
http://udayarumilli.com/script-to-monitor-sql-server-memory-usage/

https://www.brentozar.com/blitz/memory-grants/

You can run queries like the following as a starting point, but a DBA should test for memory pressure.

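-- SQL Server 2012 and later (including SQL Server 2014):
-- pages_kb replaces single_pages_kb and multi_pages_kb, which no longer exist.
SELECT type, SUM(pages_kb) AS [Pages KB]
FROM sys.dm_os_memory_clerks
WHERE pages_kb <> 0
GROUP BY type
ORDER BY SUM(pages_kb) DESC;

-- SQL Server 2008 R2 and earlier only:
SELECT type,
SUM(single_pages_kb) AS [Single Pages],
SUM(multi_pages_kb) AS [Multi Pages]
FROM sys.dm_os_memory_clerks
GROUP BY type;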

Also, the minimum and maximum Java memory settings should never differ by more than 2 GB. We noticed that wasp was set to 4 GB and 16 GB respectively. If wasp can use up to 16 GB, the minimum should be set to 14 GB to minimize garbage collection.