Exit Code 10 on Metric Collector processing
search cancel

Exit Code 10 on Metric Collector processing

book

Article ID: 257365

calendar_today

Updated On:

Products

Information Centric Analytics

Issue/Introduction

The SQL Server Agent logs a failure for the RiskFabric Processing job step Metric Collector, but the message logged does not indicate the cause of failure and no failure details are logged in the table RiskFabric.dbo.Log_DataTransformation. The body of the SQL Server Agent log message shows multiple successful metric scrapes and is similar to the following:

Executed as user: <user>. ...gured via code  2022-09-12 14:34:40,279 [1:INFO] Connections.GetAnalysisServicesConnectionString() Init: Retrieving Default Analysis Server from Risk Fabric Portal Settings...  2022-09-12 14:34:40,279 [1:INFO] Connections.GetAnalysisServicesConnectionString() Init: SELECT [Name],[Value] FROM dbo.[PortalSettings] WHERE Name in ('ASDBServer','ASDBDatabase')2022-09-12 14:34:41,592 [1:INFO] Program.Main() Init: **SUCCESS: Database validated.'  2022-09-12 14:34:41,592 [1:INFO] Connections.CreateAnalysisServicesDatabaseWorker() Analysis Server Connection String: Data Source=ZAPSDCMSQL3091;Catalog=RiskFabric2022-09-12 14:34:43,561 [1:INFO] Connections.CreateAnalysisServicesDatabaseWorker() Init: Default Analysis Server is used when ServerName/Database is not overriden in Metrics table.  2022-09-12 14:34:43,561 [1:INFO] Program.DoRunSnapshotCollector() Metric: Loading all Metrics into cache.2022-09-12 14:34:48,811 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Multiple Methods Per Hour - Users Last 30 Days' scraped successfully.2022-09-12 14:34:51,663 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Same File Name - Blocked then Allowed - Users Last 30 Days' scraped successfully.2022-09-12 14:34:54,523 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Same Match Count - Blocked then Allowed - Users Last 30 Days' scraped successfully.2022-09-12 14:34:57,521 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Large Files to USB - Users Last 30 Days' scraped successfully.2022-09-12 14:35:00,426 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'High Match Count Unusual - Users Last 30 Days' scraped successfully.2022-09-12 14:35:03,331 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named '5 Email Blocks / Day  - Users Last 30 Days' scraped successfully.2022-09-12 14:35:06,331 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'High Severity Unusual - Users Last 30 Days' scraped successfully.2022-09-12 14:35:09,308 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Unusual Policy Volume - Users Last 30 Days' scraped successfully.2022-09-12 14:35:12,421 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Repeat Policy Offender - Users Last 30 Days' scraped successfully.2022-09-12 14:35:15,328 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Team Broken Process - Users Last 30 Days' scraped successfully.2022-09-12 14:35:18,218 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Department Broken Process - Users Last 30 Days' scraped successfully.2022-09-12 14:35:21,245 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Contractor Unusual - Users Last 30 Days' scraped successfully.2022-09-12 14:35:24,122 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Planned Departure - Users Last 30 Days' scraped successfully.2022-09-12 14:35:26,982 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Terminated Employee - Users Last 30 Days' scraped successfully.2022-09-12 14:35:30,767 [1:INFO] MdxQueryProcessor.QueryMdxKpiByOrganizationAndCountry() Metric: Snapshot for Analyzer Metric named 'Total DIM Incidents Last 30 Days' scraped successfully.2022-09-12 14:35:35,414...  Process Exit Code 10.  The step failed.

Enabling enhanced logging for this step reveals the following:

2022-10-27 02:35:10<c/>627 [1:ERROR] Program.Main() Exception<nl/>System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.

This failure occurs after a 10-minute wait (600000 ms):

2022-10-27 02:25:10<c/>555 [1:INFO] MetricSnapshotController.CopySnapshotsToMetricHistory() Metric: bulk copy completed successfully.
2022-10-27 02:35:10<c/>627 [1:ERROR] ...

Environment

Release : 6.x through 6.6.0

Cause

The cause of this behavior is a hardcoded timeout set in the RiskFabric Collector executable. SQL metric commands inherit a 10 minute timeout, which overrides the SQL connection timeout value. This inherited value was meant to be 0, or no timeout. This timeout may be reached while processing metrics for an unusually large number of days, as might occur following a system recovery or a delay in processing spanning multiple days.

Resolution

Beginning with 6.6 MP1 (6.6.1.0), the timeouts in the Metric Collector process will use the general porting settings SQL Command Timeout (Seconds) and Analysis Services Timeout in Seconds, which are user-configurable.