Data_engine probe is red and can't connect to UIM DB after patching activity in the DB server.
Suspect the issue is occurred after a Windows patch activity, the DB failover to the other node, and the session got in kill state.
When failover happens automatically or manually, data_engine probe will detect DB connection errors. In data_engine log (lv2):
For maintenance task:
Dec 3 08:04:03:936 [37296] de: ADO_Database::ExecuteSP - failed for spn_de_DataMaint
Dec 3 08:04:03:936 [37296] de: [Admin] ADO_Database::ExecuteSP com error for - 1 errors Dec 3 08:04:03:936 [37296] de: (1) ADO_Database::ExecuteSP com error for [Microsoft OLE DB Provider for SQL Server] Cannot continue the execution because the session is in the kill state.
For db writer threads:
Dec 3 08:04:12:752 [32556] de: ReinsertRows - ERROR: Code=0x80004005 Source=Microsoft OLE DB Provider for SQL Server Description=[DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation.
Dec 3 08:04:12:752 [20280] de: ReinsertRows - ERROR: Code=0x80004005 Source=Microsoft OLE DB Provider for SQL Server Description=[DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation.
But all writer threads will attempt to reconnect:
Dec 3 08:04:13:813 [38032] de: QoSInsert::Disconnect_BulkInsert - restarting committ threads
Dec 3 08:04:13:824 [26352] de: commit_thread - worker thread started
...
Dec 3 08:04:46:655 [38032] de: QoSInsert::Reconnect_BulkInsert - restarting committ threads
Dec 3 08:04:46:666 [9800] de: commit_thread - worker thread started
Dec 3 08:04:46:666 [38032] de: qos_check - connected to the database
data_engine.log
COM Error [0x80004005] Unspecified error - [Microsoft SQL Server Native Client 11.0] Cannot continue the execution because the session is in the kill state.
Run the below query in the DB to see the active DB sessions
select substring(dms.program_name,0,26) as prog_name, dms.host_name, dms.login_name, count (*) as count,
sum( dms.memory_usage) *8 as total_kb, sum(dms.cpu_time) as total_cpu from sys.dm_exec_sessions dms
group by substring(dms.program_name,0,26), host_name, login_name
order by 5 desc, 6 desc;
You shall see client program name 'data_engine [RN_QOS_DATA_' on top of the list and these are sessions created on the DB server for the data_engine writer threads.
Please consult with your DBA to resolve the issue.
the below options seen to resolve the issue:
a) failback to the original cluster node
b) determine why the failover node would not come up and instead threw a session kill state error
OLE DB driver does not support multi-subnet failover, according to MS Article 'Always On Client Connectivity ':
https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/always-on-client-connectivity-sql-server?view=sql-server-2017
List of supported connection parameters for the driver used:
'SQLOLEDB' - the Microsoft OLE DB 'Provider' for SQL Server used in data_engine 8.50:
https://docs.microsoft.com/en-us/sql/ado/guide/appendixes/microsoft-ole-db-provider-for-sql-server?view=sql-server-2017
'SQLNCLI11' - the SQL Server Native Client OLE DB provider introduced in data_engine 9.02:
https://docs.microsoft.com/en-us/sql/relational-databases/native-client/applications/using-connection-string-keywords-with-sql-server-native-client?view=sql-server-2017#ole-db-provider-connection-string-keywords