Restarted both mon_config_service probes on the Primary UIM server but the issue persisted and the profiles kept spiining, were hung.
The customer would like to DISABLE the SQL Job Monitor and DELETE the MSExchangeTransport profiles but they could not due to this hang.
Repeated SQL Server errors in the mon_config_service.log:
org.springframework.dao.DataIntegrityViolationException: PreparedStatementCallback; Arithmetic overflow error converting IDENTITY to data type int.; nested exception is com.microsoft.sqlserver.jdbc.SQLServerException: Arithmetic overflow error converting IDENTITY to data type int.
and...
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Arithmetic overflow error converting IDENTITY to data type int.
and...
mon_config_service] TimeController$2.call:817: Device with device_id 2063 is currently locked by another process. Skipping it.
DECLARE @sql NVARCHAR(MAX) = '';
-- Generate dynamic SQL to check the maximum value of each identity column
SELECT @sql = @sql + 'SELECT ''' + t.name + ''' AS TableName,
MAX(' + c.name + ') AS MaxValue
FROM ' + t.name + '
WHERE ' + c.name + ' IS NOT NULL
HAVING MAX(' + c.name + ') > 2147483647;
'
FROM sys.tables t
JOIN sys.columns c ON t.object_id = c.object_id
WHERE c.is_identity = 1;
-- Execute the dynamic SQL
EXEC sp_executesql @sql;
-- Find problematic profiles
The following query will find "problematic" profiles and will also provide the latest MCS audit trail message for these profiles in the outputs generated:
SELECT ccs.name, b.cs_id, b.status, b.profileid, b.ancestorprofile, a.id, a.timestamp, a.objectname, a.objectvalue, a.action
FROM SSRV2AuditTrail a
JOIN SSRV2Profile b
ON a.objectid = b.profileId
JOIN (SELECT objectid, max(id) AS id FROM SSRV2AuditTrail GROUP BY objectid) c
ON a.id = c.id AND a.objectid = c.objectid
JOIN CM_COMPUTER_SYSTEM ccs
ON b.cs_id = ccs.cs_id
WHERE b.status NOT IN ('ok', 'new', 'suspended')
ORDER BY name;
1:-SQL Server logs:
Contains information about the error (Arithmetic overflow error). This helped us identify when and where the error was triggered in the SQL Server logs.
The following query to find related error messages:
SELECT *
FROM sys.messages
WHERE message_id = 8134;
-- This is the error ID for "Arithmetic overflow error"
2: To identify which table is causing the "Arithmetic overflow error converting IDENTITY to data type int",
Use the following query to list tables with IDENTITY columns and their current seed values:
SELECT
t.name AS TableName,
c.name AS IdentityColumnName,
i.seed_value AS SeedValue,
i.increment_value AS IncrementValue,
(SELECT IDENT_CURRENT(t.name)) AS CurrentIdentity
FROM
sys.tables t
JOIN
sys.columns c ON t.object_id = c.object_id
JOIN
sys.identity_columns i ON c.object_id = i.object_id
WHERE
c.is_identity = 1;
Engineering analyzed the file and identified the table which is causing the issue.
Table: SSRV2ObjectLock
Column: id
Current value: -2147483647
2147483647 Is the maximum size for INT so somewhere the IDENTITY column value is larger than this.
For an INT data type, the range is from -2,147,483,648 to 2,147,483,647. If the identity value grows beyond this limit, you'll get this arithmetic overflow error.
Therefore, to work around the issue of the IDNET value reaching its max value...
1. We ran a reseed query - run this f the max value of the table is reached, and it will set it back to 1.
DBCC CHECKIDENT (‘SSRV2ObjectLock’, RESEED, 1);
2. Cold start the mon_config_service (Deactivate-Activate)
3. Checked the profiles and they were no longer hung/spinning hourglass
Other queries run:
select p.query_plan, ex.session_id,ex.blocking_session_id,db_name(ex.database_id)as dbname,s.host_name,s.program_name,s.login_name, ex.status,ex.command, ex.last_wait_type,ex.cpu_time,ex.reads,ex.writes,ex.percent_complete from sys.dm_exec_requests ex
INNER JOIN sys.dm_exec_sessions s ON ex.session_id = s.session_id
cross apply sys.dm_exec_query_plan(ex.plan_handle) p