The data_engine is responsible for inserting QOS and discovery data into your SQL database. It is also responsible for doing nightly maintenance on your database to re-index all of the tables and move current QOS to history and drop the history based on your retention settings.
Below are some tips to make the data_engine and database work more efficiently.
CA UIM 8.x or higher
1) Make sure re-indexing is enabled
There is a check box to automatically re-index tables. This should be checked by default. if it is not checked then your DBA should be checking and re-indexing the database on a regular basis. The default for UIM is every day.
2) Speed up deletion of records
The data_engine's nightly maintenance removes old data. If you are running MS SQL Server Enterprise Edition, you can turn on Partition data tables. This turns the delete into an I/O Operation verses an SQL delete query with a WHERE clause. The SQL server when deleting partitions drops the whole partition in a single I/O operation to remove the history.
3) Tuning of the data_engine to handle more messages
The data engine pulls data from the hub static queue called 'data_engine' by default. By default this pulls at a bulk rate of 100 messages. This can be seen on the Hub GUI status screen.
This can be increased up to 2000 when using hub 7.X or higher.
This is set in the data_engine raw config.
If you are seeing a backup in the Hub GUI status screen for your data_engine this is the first thing you can try to help expedite the process.
4) Increasing the data_engine throughput to the backend Database server by making it multi-threaded
The data_engine by default is single-threaded. For most small to medium sized installations this is enough with modifications to the bulk_size only. (See above)
For larger installations with 10’s of thousands of QOS messages per minute this may not be enough. For these clients we can make the data_engine multi-threaded.
To make the data_engine multi-threaded you will need to add two new keys in raw config IF they currently do not exist.
thread_count_insert = XX
queue_limit_total = 100000
The thread_count_insert should be set to a number equal to the total number of CPU cores on your primary hub where the data_engine sits. For most large installations this rarely needs to be set above 8 along with having your hub_bulk_size set to 1750 max.
Checking the number of cores on a system:
Check number of CPU cores using the Windows performance monitor->Resource monitor or run the command:
wmic cpu get NumberOfCores, NumberOfLogicalProcessors/Format:List
cat /proc/cpuinfo | grep -i cpu | grep MHz | wc –l
cat /proc/cpuinfo | grep processor
The thread_count_insert key can make the data_engine insert data much faster.
5) Increase the time allowed for the data_engine to complete maintenance on all tables
The data_engine by default allows the maintenance jobs to completed in 2 hours (7200 seconds). If partitioning is not enabled or if there are a large number of RN_QOS_DATA, HN_QOS_DATA, DN_QOS_DATA, and BN_QOS_DATA tables, or one or more of these tables contain a large number of entries, the maintenance many not complete in 2 hours. In that case, the maintenance is rolled back and changes are not committed. It is recommended that this timeout be increased to the maximum time allowed (65536) which is a little more than 18 hours..
To increase the time to allow for maintenance to complete, modify the following key value from the data_engine probe's Raw Configure GUI:
data_management_timeout = 65536
This parameter setting is a mechanism that enables data_engine to "slow down" and stop reading any more messages from the hub. If you enable multiple threads you open the funnel, so data_engine will process as many messages from the hub data_engine queue as the hub can deliver.
The data_engine will have a background thread process for those messages sorted into bulk objects and marked ready for commit. This can lead to a lot of messages that reside in data_engine memory, e.g., if the database happens to fall behind (typically during maintenance or when you restart SQL Server). The default is 100,000.
So with this number at 100,000 (typed as 100000 in the configuration file, without spaces or dots), the data_engine will back off, and suspend reading messages from the hub, if it for any reason happened to have over 100,000 messages in memory that are still not committed to database. When the number drops below 100k, it will resume reading messages from the hub again.