In some environments, cube processing jobs scheduled via the Processing Jobs tab in the IT Analytics (ITA) console show a status of 'Executing' but never complete. The SQL Server Agent Job Activity monitor likewise shows the job(s) in an interminable 'Executing' state. Despite this, Task Manager on both the SQL and Analysis Services (SSAS) servers show minimal active CPU, memory, and disk utilization.
Release : 2.9.x
Component : Jobs
Microsoft SQL Server 2016 SP2 cumulative update 7 (CU7) introduced a bug in cube processing that causes some jobs to run interminably when SQL Server Analysis Services is run on a server with fewer than 16 CPU cores. This bug affected all subsequent versions and builds of SQL Server through 2016 SP2 CU13, 2017 CU22, and 2019 CU8.
Microsoft fixed this bug in the following versions of SQL Server:
In addition to upgrading to the latest CU release, the setting CoordinatorSafeJobUnblocking must be set to 0 in the file msmdsrv.ini on the SSAS server.
For best processing performance, configure SQL Server and SSAS per the recommendations in the IT Analytics Server Administrator Guide and ensure the advanced SSAS setting ThreadPool \ Command \ MaxThreads is set to its default value (0).
The property CoordinatorSafeJobUnblocking was newly introduced in conjunction with a code change in SQL Server 2016 SP2 CU14 to address an issue in which multidimensional processing in SQL Server Analysis Services (SSAS) appears to hang. CoordinatorSafeJobUnblocking determines whether deep recursion is enabled during internal child job unblocking. This setting's default value is 1, meaning deep recursion is enabled for stability. This allows a large number of non-I/O threads to quickly enter the processing queue, but generally performance scales-up to compensate for this on machines with sufficient cores/processors. When this value is set to 0, deep recursion is ignored, which can mitigate the reported "hang" due to unexpected connection throttling that occurs during processing (as was observed in case 119081324005030) on machines with fewer than 16 cores.
When CoordinatorSafeJobUnblocking is set to 1 and cube processing jobs appear to hang, SQL Server Profiler traces will show the last query of SQL by SSAS completed successfully while multiple other attributes are being processed in parallel at the same time. Despite allowing the trace to run for several hours, no other events are captured and there will be no batch processing completion event.
The following MDX query run against the SSAS server will show a large number of lock requests in a pending state (1: The system is waiting to lock the object) after the job appears to reach a hung state:
SELECT * FROM $system.discover_locks