data_engine probe is red and can't connect due to session in kill state error after patching activity

Article ID: 122212

Updated On:

Products

  • DX Unified Infrastructure Management (Nimsoft / UIM)
  • CA Unified Infrastructure Management On-Premise (Nimsoft / UIM)
  • CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

 The data_engine probe is red and cannot connect to the UIM database after patching activity on the database server.

Environment

  • Microsoft SQL Server 2012, 2014, 2016, 2019 (clustered database)
  • UIM: all supported versions

Cause

 The issue is suspected to occur after Windows patching activity: the database fails over to the other cluster node, and the session ends up in the kill state.

Resolution

When a failover happens, whether automatic or manual, the data_engine probe detects database connection errors. The data_engine log (log level 2) shows the following.

For the maintenance task:


Dec 3 08:04:03:936 [37296] de: ADO_Database::ExecuteSP - failed for spn_de_DataMaint
Dec 3 08:04:03:936 [37296] de: [Admin] ADO_Database::ExecuteSP com error for - 1 errors
Dec 3 08:04:03:936 [37296] de: (1) ADO_Database::ExecuteSP com error for [Microsoft OLE DB Provider for SQL Server] Cannot continue the execution because the session is in the kill state.


For the database writer threads:

Dec 3 08:04:12:752 [32556] de: ReinsertRows - ERROR: Code=0x80004005 Source=Microsoft OLE DB Provider for SQL Server Description=[DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation.
Dec 3 08:04:12:752 [20280] de: ReinsertRows - ERROR: Code=0x80004005 Source=Microsoft OLE DB Provider for SQL Server Description=[DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation.


All writer threads then attempt to reconnect:

Dec 3 08:04:13:813 [38032] de: QoSInsert::Disconnect_BulkInsert - restarting committ threads
Dec 3 08:04:13:824 [26352] de: commit_thread - worker thread started
...
Dec 3 08:04:46:655 [38032] de: QoSInsert::Reconnect_BulkInsert - restarting committ threads
Dec 3 08:04:46:666 [9800] de: commit_thread - worker thread started
Dec 3 08:04:46:666 [38032] de: qos_check - connected to the database


The same error can appear in data_engine.log as a COM error reported by the SQL Server Native Client:


 COM Error [0x80004005] Unspecified error - [Microsoft SQL Server Native Client 11.0] Cannot continue the execution because the session is in the kill state. 
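
The log excerpts above can be checked in bulk rather than by eye. The following is a minimal sketch, not part of the product: the function names `classify` and `summarize` are hypothetical, and the message fragments are copied from the log lines shown in this article. It counts the kill-state errors, the network errors, and the reconnect confirmations, and reports whether the most recent of those events was a successful reconnect.

```python
# Message fragments taken from the data_engine log excerpts in this article.
# The labels on the left are illustrative names, not product terminology.
PATTERNS = {
    "kill_state": "session is in the kill state",
    "network_error": "General network error",
    "reconnected": "qos_check - connected to the database",
}


def classify(line):
    """Return the label of the first fragment found in the line, or None."""
    for label, fragment in PATTERNS.items():
        if fragment in line:
            return label
    return None


def summarize(lines):
    """Count each event type and report whether the last event was a reconnect."""
    counts = {label: 0 for label in PATTERNS}
    last_event = None
    for line in lines:
        label = classify(line)
        if label is not None:
            counts[label] += 1
            last_event = label
    return counts, last_event == "reconnected"


# Sample input: three lines shortened from the excerpts above.
sample = [
    "de: (1) ADO_Database::ExecuteSP com error for [Microsoft OLE DB Provider "
    "for SQL Server] Cannot continue the execution because the session is in the kill state.",
    "de: ReinsertRows - ERROR: Code=0x80004005 Description=[DBNETLIB]"
    "[ConnectionWrite (send()).]General network error. Check your network documentation.",
    "de: qos_check - connected to the database",
]
counts, recovered = summarize(sample)
print(counts, "recovered:", recovered)
```

To run it against a real log, pass the lines of data_engine.log to `summarize` instead of the sample list. If the second value is false, the errors were not followed by a reconnect and the probe is likely still disconnected.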


Run the following query on the database to list the active database sessions:

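-- Active sessions grouped by client program, host, and login.
-- memory_usage is reported in 8-KB pages, so multiply by 8 to get KB.
select substring(dms.program_name, 0, 26) as prog_name, dms.host_name, dms.login_name,
       count(*) as count, sum(dms.memory_usage) * 8 as total_kb, sum(dms.cpu_time) as total_cpu
from sys.dm_exec_sessions dms
group by substring(dms.program_name, 0, 26), dms.host_name, dms.login_name
order by total_kb desc, total_cpu desc;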


The client program name 'data_engine [RN_QOS_DATA_' should appear at the top of the list. These are the sessions created on the database server for the data_engine writer threads.
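
If the result of the query above is exported for review, the writer-thread rows can be picked out by their program-name prefix. The following is a small sketch under stated assumptions: the rows are tuples in the column order of the query (prog_name, host_name, login_name, count, total_kb, total_cpu), the helper names are hypothetical, and the sample values are invented for illustration only.

```python
# Program-name prefix of the data_engine writer sessions, as quoted in this article.
WRITER_PREFIX = "data_engine [RN_QOS_DATA_"


def writer_sessions(rows):
    """Keep only the rows whose program name starts with the writer prefix."""
    return [
        row for row in rows
        if row[0] is not None and row[0].startswith(WRITER_PREFIX)
    ]


def total_writer_session_count(rows):
    """Sum the per-group session counts (4th column) of the writer rows."""
    return sum(row[3] for row in writer_sessions(rows))


# Invented sample rows: (prog_name, host_name, login_name, count, total_kb, total_cpu)
sample_rows = [
    ("data_engine [RN_QOS_DATA_", "uimhost01", "uim_user", 12, 96, 5400),
    ("Microsoft SQL Server Mana", "dbahost01", "dba_user", 2, 16, 120),
    (None, None, "sa", 30, 0, 900),
]
print(total_writer_session_count(sample_rows))
```

A count of zero after a failover suggests that the writer threads have not re-established their sessions on the active node.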


 Consult your DBA to resolve the issue.

The following options have been seen to resolve it:

a) Fail back to the original cluster node.
b) Determine why the failover node did not come up and instead threw a session kill state error.

Additional Information

According to the Microsoft article 'Always On Client Connectivity', the OLE DB driver does not support multi-subnet failover:

https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/always-on-client-connectivity-sql-server?view=sql-server-2017 

List of supported connection parameters for the driver used: 

'SQLOLEDB' - the Microsoft OLE DB Provider for SQL Server, used in data_engine 8.50:

https://docs.microsoft.com/en-us/sql/ado/guide/appendixes/microsoft-ole-db-provider-for-sql-server?view=sql-server-2017 

'SQLNCLI11' - the SQL Server Native Client OLE DB provider, introduced in data_engine 9.02:

https://docs.microsoft.com/en-us/sql/relational-databases/native-client/applications/using-connection-string-keywords-with-sql-server-native-client?view=sql-server-2017#ole-db-provider-connection-string-keywords