The following is a list of common issues related to the DX OI integration with UIM
DX Operational Intelligence 2x, SaaS
Root cause:
The default log level is 1-Error.
Solution:
You may change the log level using the UI. The log levels 5-Trace & 4-Debug provide very detailed information and should be used only for troubleshooting purposes, as they may degrade performance on a production system.
The default logsize is 10240 (10 MB); you may update it from the Raw Configure setup/logsize option.
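In Raw Configure, these values live under the setup section; an illustrative fragment of the probe configuration file showing the defaults described above (the loglevel key name follows the usual UIM convention and should be verified against your probe):

```
<setup>
   loglevel = 1
   logsize = 10240
</setup>
```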
Root cause:
The issue is related to the PPM probe, which helps in rendering the configuration UI.
Solution:
Check the version of the PPM probe and make sure it matches the UIM release version. Verify the logs. Restart the PPM probe.
The issue may also occur if there is a connectivity problem between the PPM robot (typically the primary hub) and the robot hosting the oi_connector probe (if different from the PPM robot). You may check connectivity using telnet or any other tool.
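As an alternative to telnet, the reachability check can be scripted; a minimal sketch in Python, assuming the default UIM hub port 48002 (the host name is a placeholder; adjust both for your environment):

```python
import socket

def check_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hypothetical host): run this from the PPM robot against the robot
# hosting the oi_connector probe. 48002 is the default UIM hub port.
# print(check_port("oi-connector-robot.example.com", 48002))
```

A False result points to a firewall, routing, or service issue between the two robots.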
Root cause:
The issue is related to the PPM probe, which helps in rendering the configuration UI.
Solution:
Check the version of the adapter JAR file (oi_connector-adapter-*.jar) within the PPM probe folder. Delete the JAR from the PPM folder (Nimsoft\probes\service\ppm\adapterlib). Delete the files in the \Nimsoft\probes\service\ppm\cache\AttributionClient folder. Restart the oi_connector probe.
If the issue remains, back up the oi_connector.cfg file, delete the oi_connector probe, redeploy it, and restore the configuration file. Restart the probe.
Root cause:
For customers integrating on-premise UIM with SaaS OI, these options do not work.
Solution:
These options are not applicable when integrating with SaaS OI. However, we recommend verifying the endpoint URL & Tenant ID.
Root cause:
The username & password fields are mandatory due to UI validation.
Solution:
Provide dummy values for the proxy username & password. Open the Raw Configure option of the probe and remove the values of the username (resource/properties/proxy_user) & password (resource/properties/proxy_password) keys. Save the configuration. The probe will restart and use the proxy without authentication.
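After saving, the two keys should be present but empty; an illustrative fragment of oi_connector.cfg (the section layout follows the usual UIM .cfg convention and should be verified against your file):

```
<resource>
   <properties>
      proxy_user =
      proxy_password =
   </properties>
</resource>
```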
Root cause:
Network issues may prevent the probe from connecting to Jarvis & NASS.
Solution:
From the robot on which the probe is deployed (typically the primary hub), verify network connectivity to the Jarvis & NASS hosts. Check whether a firewall or proxy is causing the connection issue.
Root cause:
Some internal processing errors may prevent the probe from posting data. This issue typically occurs when there is a problem connecting to the queues.
Solution:
Check for exceptions towards the end of the connector logs to identify the issue. If the logs contain any of the messages below, restarting the probe resolves the issue.
*Got NimException connecting to queue group_info*
*Unable to open a client session for 127.0.0.1:48002*
*Unable to open a client session for 127.0.0.1:48002: Connection refused: connect*
We recommend monitoring the logs using Logmon and restarting the probe when any of the above messages are logged.
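The Logmon match can be prototyped with a short script that scans the connector log for the messages above; a minimal sketch in Python (the actual restart action is environment-specific and not shown):

```python
# Known queue-connection error messages that warrant a probe restart.
RESTART_MARKERS = (
    "Got NimException connecting to queue group_info",
    "Unable to open a client session for",
)

def needs_restart(log_path: str) -> bool:
    """Return True if any known queue-connection error appears in the log file."""
    with open(log_path, encoding="utf-8", errors="replace") as f:
        return any(marker in line for line in f for marker in RESTART_MARKERS)
```

In practice you would point this at the oi_connector log file and trigger a restart (or an alarm) when it returns True, mirroring what a Logmon profile would do.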
Root cause:
The probe is not configured properly.
Solution:
Open the probe configuration and verify the probe list includes the required probes. Make sure your group selection includes all groups, or leave it empty.
If no group is selected, the probe sends data for all groups in the system; this also covers any new groups created after the initial configuration.
Root cause:
The volume of metrics/alarms/inventory generated in the environment is too large for the default configuration to handle.
Solution:
The default configuration of the probe requires a maximum of 2 GB of heap space, which works fine for an inventory of up to 10,000 servers. Increase the memory by 2 GB for every additional 10,000 inventory items.
Open the probe's Raw Configure and update the -Xms1g -Xmx2g parameters in the startup/options section.
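The sizing rule above (2 GB of heap per 10,000 inventory items, with a 2 GB default) can be expressed as a small helper; a sketch, assuming linear scaling:

```python
import math

def recommended_xmx_gb(inventory_items: int) -> int:
    """Recommended -Xmx heap size in GB: 2 GB per 10,000 inventory items, minimum 2 GB."""
    return max(2, math.ceil(inventory_items / 10_000) * 2)

# e.g. an inventory of 25,000 servers -> 6, i.e. set -Xmx6g in startup/options.
```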
Root cause:
The issue is related to cache handling of the probe.
Solution:
The probe may not be able to close all alarms due to processing exceptions.
However, alarm reconciliation addresses these issues if it is configured to run at specified intervals.
Root cause:
The configuration is not optimal for the incoming QOS/alarm message volume.
Solution:
We recommend updating configuration parameters. For detailed information, refer to the link.
Root cause:
The issue occurs when the device lookup for the alarm fails.
Solution:
Check if the following query returns data.
SELECT CD.* FROM NAS_ALARMS NA
INNER JOIN CM_CONFIGURATION_ITEM_METRIC CIM ON CIM.ci_metric_id = NA.met_id
INNER JOIN CM_CONFIGURATION_ITEM CI ON CI.ci_id = CIM.ci_id
INNER JOIN CM_DEVICE CD ON CD.dev_id = CI.dev_id
WHERE NA.met_id = 'xxxxxx'
If the query returns no data, there is an issue with discovery, which requires investigation from the discovery side.