This guide assists users facing a critical situation in which the data_engine probe is non-functional (either down or not inserting data).
This document was created using a data-driven approach and focuses on the most common scenarios encountered during a “production down” situation. It does not cover every possible case; it covers the scenarios that customers most often encounter in production environments and that lead to a “Severity 1” production outage.
According to data collected by Broadcom Support, the vast majority of data_engine problems classified as “Severity 1” or “production down” are related to a change or issue within the database server or network environment outside of DX UIM itself, and usually require the involvement of a customer’s DBA and/or network administration team.
Therefore, when the data_engine probe stops working, the environment should be the first point of investigation. This is especially true when the probe had been working previously and there have been no recent product configuration changes. In such cases, the cause is likely a problem, outage, or change in the environment rather than a DX UIM product issue, so the teams needed to investigate these factors should be engaged early in the troubleshooting process.
We have categorized the most commonly seen outage scenarios and their causes so that the most likely environmental factors behind data_engine outages can be identified quickly, the appropriate teams (DBA/network) can be engaged when necessary, and normal operation can resume without delay.
Most of the issues described here are outside the domain of the DX UIM product itself, for example, database server issues or firewall configurations; where appropriate, links to relevant knowledge articles or technical documentation are included for guidance.
Linked KB articles and technical documents target MS SQL Server in most cases, as it represents the largest portion of our user base; similar articles for MySQL and Oracle can usually be found by searching the Broadcom Knowledge Base.
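Because most of these outages trace back to the database server or the network path to it, one quick first check is whether the database listener port is reachable from the machine hosting data_engine. The following is a minimal sketch of such a check, not part of DX UIM. The function name `check_db_port` and the `DEFAULT_PORTS` table are illustrative assumptions. The ports listed are the common vendor defaults, and your environment may use different ones (for example, a SQL Server named instance on a dynamic port).

```python
import socket

# Common vendor default listener ports (assumption: your DBA may have
# configured a different port, so confirm before relying on these).
DEFAULT_PORTS = {"mssql": 1433, "mysql": 3306, "oracle": 1521}


def check_db_port(host, port, timeout=5.0):
    """Attempt a plain TCP connection and return (reachable, detail).

    This only verifies network reachability of the listener; it does not
    authenticate or validate the database itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "TCP connection succeeded"
    except socket.timeout:
        # No answer at all within the timeout.
        return False, "timed out (possibly a firewall dropping traffic)"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on this port.
        return False, "connection refused (listener down or wrong port)"
    except socket.gaierror as exc:
        # The hostname could not be resolved.
        return False, f"name resolution failed: {exc}"
    except OSError as exc:
        return False, f"network error: {exc}"
```

For example, calling `check_db_port("dbserver.example.com", DEFAULT_PORTS["mssql"])` from the primary hub (the hostname here is a placeholder) gives a rough indication of whether to involve the network team (timeout, name resolution failure) or the DBA (connection refused) first.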
Rather than focus on specific log messages or error messages, we will look at broader categories of behavior to help direct troubleshooting efforts.
Situations that are considered critical for the data_engine probe fall into two broad categories: the probe is down, or the probe is not inserting data.
Either way, the causes of such outages or failures can generally be sorted into the following categories (listed in order of frequency):
Common resolutions (and the percentage of these scenarios resolved):
Common issue descriptions include:
Common resolutions/percentages:
Common issue descriptions:
Common resolutions/percentages:
After you have corrected any issues within the environment, you may need to take additional steps to restore functionality to the DX UIM environment.
At a minimum, you should restart the DX UIM primary hub robot and/or HA Hub (if applicable), and then restart any robot(s) that host instances of Operator Console or CABI.
If the credentials used to connect to the database were changed as part of the resolution, you will need to consult this article to ensure the password is updated appropriately across the installation.
If the database server itself has been changed (e.g., database migration or server IP change), you should consult this article for the necessary changes.
For the complete DX UIM data_engine technical documentation, please refer to the data_engine probe document at the following link: data_engine.
For additional troubleshooting information on issues you might encounter while upgrading, configuring, or using different versions of the data_engine probe, see the following link: data_engine troubleshooting.
For data_engine best practices, see the following link: data_engine best practices.
If this article has not been helpful and the issue appears to be related to the data_engine probe itself, please consult this article for further possible scenarios.