On heavily populated systems, there has always been a bottleneck between the NSE registration and processing logic. It is caused by the common statistics table [EventQueue], which both sides populate and modify.
ITMS 8.0, 8.1, 8.5, 8.6
In ITMS 8.0 we removed the absolute (real-time) synchronization of the [EventQueue] table for both sides. Instead, the table is updated periodically and, in some corner cases, by an explicit call to the “update query”.
The “ProcessingState” of an event entry in this table will have only two values:
It will be a simple insert of a single row into the “EventQueueEntry” table.
Note: This call is executed without a transaction but with deadlock retries.
It will be an insert of two rows:
Note: Because it is a multiple-row operation, the call always runs within a transaction.
The queue pull will be triggered by:
When Dispatcher performs a pull, it will:
Note: Because it is a multiple-row operation, the call always runs within a transaction.
It is possible (but highly unlikely) that a service crash or a code bug leaves events marked as “processing” while Dispatcher has no running activity for them.
To fix these stale queue entries, a timed action (adjustable through the Core Setting “EvtQueueFixupMinutes”) looks for queues that have no processing activity in Dispatcher.
The default interval for this action is 10 minutes, and the action can be triggered by these conditions:
Note: Fix-up is performed for a particular queue only when the queue has no pending events, no completes are queued, and no workers are active.
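The eligibility rule in the note above can be sketched as a simple predicate. This is only an illustration: the `QueueState` type and its field names are hypothetical and do not reflect the actual Dispatcher data structures.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    """Hypothetical snapshot of one queue's activity (names are assumptions)."""
    pending_events: int    # events waiting to be processed for this queue
    queued_completes: int  # completion notifications not yet applied
    active_workers: int    # workers currently processing this queue


def is_fixup_eligible(queue: QueueState) -> bool:
    """Fix-up may run only when the queue is completely idle."""
    return (
        queue.pending_events == 0
        and queue.queued_completes == 0
        and queue.active_workers == 0
    )


idle = QueueState(pending_events=0, queued_completes=0, active_workers=0)
busy = QueueState(pending_events=0, queued_completes=0, active_workers=1)
print(is_fixup_eligible(idle), is_fixup_eligible(busy))
```

Requiring all three counters to be zero means a queue with any ongoing activity is skipped, so entries that are legitimately in “processing” are not touched.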
Since Dispatcher’s logic also depends on the knowledge of what is really “pending”, it is still a good idea to update the statistics in the “EventQueue” table.
All parties, both registration and processing, recalculate the queues automatically, but no longer in the same queries as before (spRegister.. / spGetCandidates…).
Recalculation occurs in a few situations:
Note: The main Core service (AeXSvc) always has a reload timeout 1 minute shorter than the Core Setting, which eventually makes it the “update master”.
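The effect of the shorter timeout can be illustrated with a small sketch. It assumes that each party recalculates once the statistics are older than its own reload timeout. That mechanism, the function names, and the 5-minute setting value are assumptions made for illustration, not the product's actual implementation.

```python
from datetime import datetime, timedelta


def reload_timeout(setting_minutes: int, is_main_service: bool) -> timedelta:
    """Reload timeout for a party; the main service gets one minute less."""
    minutes = setting_minutes - 1 if is_main_service else setting_minutes
    return timedelta(minutes=minutes)


def should_recalculate(last_update: datetime, now: datetime,
                       timeout: timedelta) -> bool:
    """A party recalculates once the statistics are at least `timeout` old."""
    return now - last_update >= timeout


# Hypothetical setting of 5 minutes; statistics last updated 4 minutes ago.
last_update = datetime(2024, 1, 1, 12, 0)
now = datetime(2024, 1, 1, 12, 4)

main_service_due = should_recalculate(last_update, now, reload_timeout(5, True))
other_party_due = should_recalculate(last_update, now, reload_timeout(5, False))
print(main_service_due, other_party_due)
```

Under these assumptions, the main service reaches its timeout a minute before any other party does. It therefore refreshes the statistics first, and the other parties usually find them fresh enough to skip their own recalculation.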
There are several event sources:
All these processes can be a significant source of events, so there should be logic that minimizes the pressure on statistics recalculation.
It is accomplished by the “update master” approach:
Effectively, there should be only one “update master”, AeXSvc, but in some cases any of the parties can trigger the logic if it becomes a flooding source.