The following is a list of techniques and suggestions to employ when troubleshooting OI Connector issues.
DX NetOps OI Connector 2.1.x ONLY
NOTE: OI Connector 2.1.x is supported with DX NetOps Performance Management 20.2.4 or higher
a) Check compatibility
b) Check that the OI values entered during installation are correct:
APM Gateway Hostname
- If OI 20.2.x:
- If OI SaaS or 21.x:
APM Gateway Security Token
- If OI 20.2.x: Option 1) Log in to the tenant using an Admin account, go to DX Application Performance Management (APM) > Settings > Security > Generate New Token, and select a "Tenant" or "Agent" token
- If OI SaaS or 21.x:
Tenant ID
- If OI 20.2.x:
a) Obtain the "ElasticSearch endpoint". If OpenShift: oc get routes -n<dxi-namespace> | grep jarvis
b) Open your browser and list all available tenants:
- If OI SaaS or 21.x:
a) Check that the OI Connector services are up and running
service caperfcenter_oiconnector status
service caperfcenter_oiagent status
service kafka status
b) Check the OI Connector status in the NetOps console
If OI SaaS or 21.x:
Go to Performance Center > Administration > System Status page
Locate the "OI Connector" section
Verify Status = Normal
If OI 20.2.x:
The OI Connector reports its status as failed. You can ignore this status.
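The command-line checks in step a) can be scripted. The following is a minimal Python sketch, not a supported tool. It assumes that `service <name> status` returns exit code 0 when the service is running, which is the usual convention but is not stated in this article. The function names are hypothetical.

```python
import subprocess

# Service names are the ones listed in step a).
SERVICES = ("caperfcenter_oiconnector", "caperfcenter_oiagent", "kafka")


def run_status(name):
    """Run `service <name> status` and return its exit code."""
    result = subprocess.run(
        ["service", name, "status"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    return result.returncode


def check_services(names=SERVICES, runner=run_status):
    """Return {service: bool}. Assumption: exit code 0 means "running"."""
    return {name: runner(name) == 0 for name in names}


def not_running(status):
    """List the services whose status check did not return 0."""
    return [name for name, ok in status.items() if not ok]
```

On the OI Connector host, `not_running(check_services())` returns the services that need attention. The `runner` argument exists only so that the logic can be exercised without the real services.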
a) Review OIConnector logs : <OIConnector-HOME>/logs
- OIConnector.log : main log file
- OIAgent*.log : NFA, ADA data collection activity
b) Enable DEBUG logging:
a) OIConnector logging: <OIConnector-HOME>/conf/log4j.xml
Open ./conf/log4j.xml, change logging level from INFO to DEBUG as below:
...
<!-- ***** Root Logger definition ***** -->
<root>
  <level value="DEBUG"/>
  <appender-ref ref="console"/>
  <appender-ref ref="complete" />
</root>
b) OI Agent service logging: <OIConnector-HOME>/conf/agent-wrapper.conf
Uncomment the below line:
#wrapper.app.parameter.2=-Ssupport
Restart the OI Agent service for the change to take effect:
service caperfcenter_oiagent restart
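The log4j.xml change described above can also be applied with a small script. This is a minimal Python sketch, assuming the root logger block looks like the excerpt shown above (a `<root>` element containing a `<level value="..."/>` element). The function name is hypothetical, and the sketch edits the text with a regular expression so that comments and the rest of the file are left untouched.

```python
import re

# Matches the <level value="..."/> element inside the <root> logger block.
ROOT_LEVEL = re.compile(r'(<root>.*?<level\s+value=")[A-Za-z]+("\s*/>)', re.DOTALL)


def set_root_level(xml_text, level="DEBUG"):
    """Return xml_text with the root logger level replaced by `level`."""
    new_text, count = ROOT_LEVEL.subn(
        lambda m: m.group(1) + level + m.group(2), xml_text, count=1
    )
    if count == 0:
        raise ValueError("no <root> logger level found")
    return new_text
```

A typical use is to read <OIConnector-HOME>/conf/log4j.xml, keep a backup copy, write back the result of `set_root_level(text)`, and later call `set_root_level(text, "INFO")` to restore the original level.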
c) Examples of common errors or exceptions:
USE-CASE #1 : Problem with apmservices-gateway endpoint
ERROR [pool-2-thread-3] [TASGroupTask] - [EVENT UNSPECIFIED Anonymous:null@unknown -> /com.ca.im.oinet.connector.task.group.TASGroupTask] Failed ingesting groups to TAS for CAPC tenant id : <example> Error: 503
Recommendation:
Verify that the apm-gateway endpoint is correct and available, and that the token is correct
USE-CASE #2 : CAPM user password expired, changed or is no longer valid.
ERROR [] [WrapperSimpleAppMain] [OIIntegration] - [EVENT UNSPECIFIED Anonymous:<user>@example -> /com.ca.im.oinet.connector.OIIntegration] No response from webservice - unable to configure data sources
WARN [WrapperSimpleAppMain] [OIIntegration] - [EVENT UNSPECIFIED Anonymous:null@unknown -> /com.ca.im.oinet.connector.OIIntegration] Unable to determine CA Performance Center version
Recommendation:
Update the <OIConnector-HOME>/conf/config.xml with the new encoded password, see: https://knowledge.broadcom.com/external/article/204144/dx-oi-oiconnector-not-connecting-when-c.html
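The two use cases above can be turned into a simple log scan. The following Python sketch is illustrative only: the signature strings are copied from the log excerpts above, the hints paraphrase the recommendations, and the function and variable names are hypothetical.

```python
# Signature text is copied from the two use cases; hints paraphrase
# their recommendations.
KNOWN_ISSUES = [
    ("Failed ingesting groups to TAS",
     "Check the apm-gateway endpoint and the security token (use case #1)."),
    ("No response from webservice - unable to configure data sources",
     "Check the CAPM user password in conf/config.xml (use case #2)."),
    ("Unable to determine CA Performance Center version",
     "Check the CAPM user password in conf/config.xml (use case #2)."),
]


def diagnose(log_lines):
    """Return the distinct hints whose signature appears in ERROR/WARN lines."""
    hints = []
    for line in log_lines:
        if "ERROR" not in line and "WARN" not in line:
            continue
        for signature, hint in KNOWN_ISSUES:
            if signature in line and hint not in hints:
                hints.append(hint)
    return hints
```

Passing the lines of OIConnector.log to `diagnose` returns an empty list when neither signature is present.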
d) Search for common keywords : "Successfully", "Started", "CLIENT_SUMMARY_NASS"
NOTE: CLIENT_SUMMARY_NASS entries are emitted every five minutes.
Below are some examples:
...
[INFO] [Thread-9] NASSClient - <number>: Started NASS Client.
...
[INFO] [Thread-9] [PersistentRegistrationCache] - Successfully loaded 47762 metric registrations from /opt/CA/OIConnector/conf/MetricRegistrationCache-<number>.ser
[INFO ] [pool-3-thread-38] RemoteDataConnectionImpl - [EVENT UNSPECIFIED <user>:@<host> -> /NetOps OI Connector/com.ca.im.oinet.connector.sources.RemoteDataConnectionImpl] JARVIS_INGEST_RECORD_COUNT : 471
..
[INFO ] [pool-3-thread-39] TASGroupTask - [EVENT SUCCESS <user>:@<host>n -> /NetOps OI Connector/com.ca.im.oinet.connector.task.group.TASGroupTask] Successfully ingested groups to TAS for CAPC tenant id: _default_
..
[INFO ] [NASSClientStats] NASSClient - <number>: CLIENT_SUMMARY_NASS_INGEST_SUCCESS_COUNT: 45449
[INFO ] [NASSClientStats] NASSClient - <number>: CLIENT_SUMMARY_NASS_INGEST_FAILED_COUNT: 0
[INFO ] [NASSClientStats] NASSClient - <number>: CLIENT_SUMMARY_NASS_INGEST_RETRIED_COUNT: 0
[INFO ] [NASSClientStats] NASSClient - <number>: CLIENT_SUMMARY_NASS_REGISTRATION_SUCCESS_COUNT: 662
[INFO ] [NASSClientStats] NASSClient - <number>: CLIENT_SUMMARY_NASS_REGISTRATION_FAILED_COUNT: 0
...
[INFO ] [pool-3-thread-4] InventoryTaskImpl - [EVENT SUCCESS <user>:@<host> -> /NetOps OI Connector/com.ca.im.oinet.connector.task.inventory.InventoryTaskImpl] Successfully ingested inventory (268 vertices) in 0 batches TAS for CAPC tenant id : _default_
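The CLIENT_SUMMARY_NASS counters shown above lend themselves to an automated check. The following Python sketch assumes the counter lines have the form seen in these excerpts (`CLIENT_SUMMARY_NASS_..._COUNT: <number>`). The function names are hypothetical.

```python
import re

# Matches e.g. "CLIENT_SUMMARY_NASS_INGEST_FAILED_COUNT: 0"
COUNTER = re.compile(r"(CLIENT_SUMMARY_NASS_[A-Z_]+_COUNT)\s*:\s*(\d+)")


def latest_counters(log_lines):
    """Return the most recent value seen for each CLIENT_SUMMARY_NASS counter."""
    counters = {}
    for line in log_lines:
        match = COUNTER.search(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def failed_counters(counters):
    """Return only the *_FAILED_COUNT counters that are greater than zero."""
    return {
        name: value
        for name, value in counters.items()
        if name.endswith("_FAILED_COUNT") and value > 0
    }
```

With the sample lines above, `failed_counters(latest_counters(lines))` is empty because both FAILED counters are 0. A non-empty result is a reason to enable DEBUG logging.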
1) A quick way to confirm that metrics are being ingested into DX OI is to check that MetricRegistrationCache-<Tenant-ID>.ser exists in the conf folder.
Check for the file creation in the OIConnector log:
..
INFO [Thread-9] [PersistentRegistrationCache] - Successfully loaded 47762 metric registrations from /opt/CA/OIConnector/conf/MetricRegistrationCache-<number>.ser
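The file check can be done with a short script. This is a minimal Python sketch that assumes the cache files follow the MetricRegistrationCache-<Tenant-ID>.ser naming shown above. The function name is hypothetical.

```python
from pathlib import Path


def find_registration_caches(conf_dir):
    """Return {file name: size in bytes} for MetricRegistrationCache-*.ser files."""
    conf = Path(conf_dir)
    return {
        path.name: path.stat().st_size
        for path in sorted(conf.glob("MetricRegistrationCache-*.ser"))
    }
```

For example, `find_registration_caches("/opt/CA/OIConnector/conf")` uses the path from the log excerpt above. An empty result suggests that no metric registrations have been cached yet.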
2) Use the below steps to debug a metric ingestion problem from Data Aggregator to DX OI :
2.1) Go to Data Aggregator (DA): check that the settings in the $KARAF_HOME/etc/kafkaexport.producer.cfg file are correct:
feature.enabled=on
producer.bootstrap.servers=<kafkabroker:port>
For more information refer to the documentation : https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/performance-management/21-2/integrating/integrate-with-dx-operational-intelligence/install-the-oi-connector.html#concept.dita_577458ae-0372-4fe5-addb-51330456748d_Install
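The two settings in step 2.1 can be validated with a few lines of code. The following Python sketch assumes the file uses plain key=value lines with # comments. Only the two keys named above are checked, and the function names are hypothetical.

```python
def parse_cfg(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def check_export_cfg(settings):
    """Return a list of problems found in the two settings from step 2.1."""
    problems = []
    if settings.get("feature.enabled") != "on":
        problems.append("feature.enabled is not 'on'")
    servers = settings.get("producer.bootstrap.servers", "")
    if not servers or any(":" not in entry for entry in servers.split(",")):
        problems.append("producer.bootstrap.servers is missing or lacks host:port entries")
    return problems
```

Reading $KARAF_HOME/etc/kafkaexport.producer.cfg and passing its text through `parse_cfg` and then `check_export_cfg` returns an empty list when both settings look plausible. It does not verify that the broker is reachable.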
2.2) Check for ‘ProducerStatisticsMonitor’ in DA’s $KARAF_HOME/data/log/karaf.log file. These are emitted every five minutes by default.
If ProducerStatisticsMonitor shows that messages are being dropped, look in DA’s $KARAF_HOME/data/log/KafkaClient.log file for errors/hints to the problem.
If ProducerStatisticsMonitor logs are not seen, check whether export configuration has been set up and applied to devices:
a) In a browser, open http(s)://<DAHOST>:<DAPORT>/debug
b) Click on Available Spring Containers (by bundle)
c) Provide PC admin credentials if prompted
d) Click on com.ca.im.data-manager.core.aggregator.loader.integrator bundle link
e) Click on exportProfileCache link
f) Verify that there is an ExportProfileConfig defined and it has the expected exportedMetricFamilyQNames.
g) Verify that the exportedDeviceCout (sic) is non-zero.
Here is an example illustrating a problem during installation, where the export configuration was not set up correctly:
h) If there is no ExportProfileConfig or there are no exported devices associated
- Check the Data Aggregator (DA) karaf.log for possible data corruption and, if possible, restart the DA
- Check OIConnector logs for failures in creating the config or associating it with collections
2.3) Verify that the messages are actually reaching the Kafka topic. On the Kafka broker (based on the default standalone Kafka):
cd <oi-connector-kafka>/kafkadisk/bin
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic metric-export
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic metric-export --from-beginning
If no data is flowing, then check kafka/zookeeper logs for potential problems.
If data is flowing, then check the OIConnector.log file for entries containing "CLIENT_SUMMARY", which are emitted every five minutes.
If any show failures, enable DEBUG logging for more details.
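The console consumer in step 2.3 runs until interrupted, which is awkward in a script. The following Python sketch builds a bounded variant of the same command using the consumer's --max-messages and --timeout-ms options. The default values (5 messages, 30 seconds) and the function names are assumptions chosen for illustration.

```python
import subprocess


def consumer_command(kafka_bin, topic="metric-export", bootstrap="localhost:9092",
                     max_messages=5, timeout_ms=30000, from_beginning=False):
    """Build a kafka-console-consumer.sh command that stops on its own."""
    cmd = [
        kafka_bin.rstrip("/") + "/kafka-console-consumer.sh",
        "--bootstrap-server", bootstrap,
        "--topic", topic,
        "--max-messages", str(max_messages),
        "--timeout-ms", str(timeout_ms),
    ]
    if from_beginning:
        cmd.append("--from-beginning")
    return cmd


def sample_topic(cmd):
    """Run the consumer and return the non-empty lines it wrote to stdout."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]
```

On the Kafka broker, `sample_topic(consumer_command("<oi-connector-kafka>/kafkadisk/bin"))` returns up to five messages. An empty list after the timeout suggests that no data is flowing.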
a) Metrics (NASS)
Go to Performance:
In SaaS, 21.x
In 20.2.x on premise:
b) Inventory and Topology (TAS)
Go to DX OI > Services > Create a new Service
From Add Elements, select Network > Device Names. You should be able to see your NetOps devices, as in the example below:
** This section is valid for DX On Premise 20.x and 21.x versions only. If you are using DX OI SaaS, contact Broadcom Support for assistance. **
a) Alarms (ElasticSearch)
For details on how to query ElasticSearch, refer to: https://knowledge.broadcom.com/external/article/207215
1) List all the CAPM indices:
http://<servername>/_cat/indices/*capm*?v
For example:
http://<host>/_cat/indices/*capm*?v
Check that the docs.count and store.size column values increase over time.
2) Check the content of a specific index:
http://<servername>/<index-name>/_search?pretty&sort=@timestamp:desc&size=500
For example:
http://<host>/ao_itoa_groups_capm_1_1/_search?pretty&sort=@timestamp:desc&size=500
You can use https://www.epochconverter.com/ to convert values from the @timestamp field to a human-readable format.
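Both checks above (index growth and timestamp conversion) can be done locally. The following Python sketch makes two assumptions that are not confirmed by this article: that @timestamp holds epoch milliseconds, and that the saved `_cat/indices/*capm*?v` output lists only open indices, so every row has all columns. The function names are hypothetical.

```python
from datetime import datetime, timezone


def to_utc(timestamp_ms):
    """Convert an epoch value in milliseconds (assumed) to an ISO-8601 UTC string."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc).isoformat()


def parse_cat_indices(text):
    """Map index name to docs.count from saved `_cat/indices?v` text output."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    counts = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        counts[row["index"]] = int(row["docs.count"])
    return counts


def growth(before, after):
    """Return the docs.count increase per index between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}
```

Saving the `_cat/indices` output twice, a few minutes apart, and comparing the two snapshots with `growth` shows which CAPM indices are still receiving documents.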
b) Inventory and Topology (TAS)
Option 1: Use DX Dashboard > AIOps Inventory source, see: https://knowledge.broadcom.com/external/article/226599
Option 2: Use REST APIs:
Open Postman (you can download postman from https://www.postman.com/downloads/)
POST API endpoint to check TAS data for the NetOps (CAPC) inventory:
http://<APMServices Gateway Host>/tas/graph/query
For example:
http://apmservices-gateway.<host>/tas/graph/query
Headers:
Content-Type: application/json
Authorization: Bearer <Tenant Token>
Body:
{
  "filter": {
    "op": "JOIN",
    "input": {
      "op": "AND",
      "input": [
        {
          "op": "ATTRIBUTE",
          "expressions": [
            {
              "name": "Product",
              "values": [
                "CAPC"
              ]
            }
          ]
        }
      ]
    }
  },
  "universe": null,
  "version": null,
  "time": 0,
  "stitchingEnabled": true,
  "includeStatus": true
}
Expected Result: you should see all new vertices added to TAS
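The same TAS request can be sent without Postman. The following Python sketch uses only the standard library and mirrors the endpoint, headers and body given above. The function names are hypothetical, and the gateway URL and tenant token must be supplied by the caller.

```python
import json
import urllib.request

# Same filter as the request body above: vertices whose Product attribute is CAPC.
TAS_QUERY = {
    "filter": {
        "op": "JOIN",
        "input": {
            "op": "AND",
            "input": [
                {"op": "ATTRIBUTE",
                 "expressions": [{"name": "Product", "values": ["CAPC"]}]}
            ],
        },
    },
    "universe": None,
    "version": None,
    "time": 0,
    "stitchingEnabled": True,
    "includeStatus": True,
}


def build_tas_request(gateway_url, tenant_token, query=TAS_QUERY):
    """Build the POST request for <gateway>/tas/graph/query without sending it."""
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + "/tas/graph/query",
        data=json.dumps(query).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + tenant_token,
        },
        method="POST",
    )


def run_query(request, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_query(build_tas_request("http://apmservices-gateway.<host>", "<Tenant Token>"))` returns the parsed response. The structure of that response is not described in this article, so the sketch does not interpret it.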
c) Metrics(NASS)
Option 1: Use DX Dashboard > AIOps Metadata source
Option 2: Use REST APIs:
Open Postman (you can download postman from https://www.postman.com/downloads/)
POST API endpoint to check NASS metric metadata matching a pattern:
http://<APM Service Gateway Host>/metadata/queryMetric
For example:
http://apmservices-gateway.<host>/metadata/queryMetric
Headers:
Content-Type: application/json
Authorization: Bearer <Tenant Token>
Body:
{
  "size": 10000,
  "specifier": {
    "op": "SPEC",
    "sourceNameSpecifier": {
      "op": "REGEX",
      "pattern": "(.*)NetOps\\|CAPM(.*)|(.*)NetOps\\|ADA(.*)|(.*)NetOps\\|NFA(.*)"
    },
    "attributeNameSpecifier": {
      "op": "ALL"
    }
  }
}
Expected Result: you should see the NetOps metrics registered in NASS that match the pattern
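The source-name pattern in the body above is easier to reason about when tried locally. The following Python sketch builds the same request and tests the pattern with Python's `re` module. Two things are assumptions: that NASS interprets the regular expression the same way Python does, and the sample source names used in the usage note, which are hypothetical and only show which product prefixes the pattern is meant to select.

```python
import json
import re
import urllib.request

# The pattern from the request body, after JSON unescaping (\\| becomes \|).
SOURCE_PATTERN = r"(.*)NetOps\|CAPM(.*)|(.*)NetOps\|ADA(.*)|(.*)NetOps\|NFA(.*)"


def matches_pattern(source_name):
    """Check a source name against the pattern using Python regex semantics."""
    return re.fullmatch(SOURCE_PATTERN, source_name) is not None


def build_metadata_request(gateway_url, tenant_token, size=10000):
    """Build the POST request for <gateway>/metadata/queryMetric without sending it."""
    body = {
        "size": size,
        "specifier": {
            "op": "SPEC",
            "sourceNameSpecifier": {"op": "REGEX", "pattern": SOURCE_PATTERN},
            "attributeNameSpecifier": {"op": "ALL"},
        },
    }
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + "/metadata/queryMetric",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + tenant_token,
        },
        method="POST",
    )
```

For instance, `matches_pattern("Infrastructure|NetOps|CAPM|router1")` is true, while a name without a NetOps|CAPM, NetOps|ADA or NetOps|NFA segment is rejected. The request can be sent with `urllib.request.urlopen`.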
** This section is valid for DX On Premise 20.x and 21.x versions only. If you are using DX OI SaaS, contact Broadcom Support for assistance. **
AIOps - Jarvis (kafka, zookeeper, elasticSearch) Troubleshooting
C) WHAT FILES SHOULD I COLLECT FOR BROADCOM SUPPORT?
If you still need assistance, contact Broadcom Support (https://support.broadcom.com/) and provide the below information:
a) DEBUG oi_connector logs
<OIConnector>/logs/*
<OIConnector>/conf/config.xml
b) services status:
service caperfcenter_oiconnector status
service caperfcenter_oiagent status
c) from data aggregator
$KARAF_HOME/etc/kafkaexport.producer.cfg
$KARAF_HOME/data/log/karaf.log file
$KARAF_HOME/data/log/KafkaClient.log
screenshot of exportProfileCache content
d) from kafka
Result of:
cd <oi-connector-kafka>/kafkadisk/bin
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic metric-export
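Gathering the OI Connector files from item a) can be scripted. The following Python sketch archives <OIConnector>/logs/* and <OIConnector>/conf/config.xml into one .tar.gz file. The function name and the bundle file name are hypothetical, and the Data Aggregator and Kafka items still need to be collected separately.

```python
import tarfile
from pathlib import Path


def collect_support_bundle(oiconnector_home, bundle_path):
    """Archive logs/* and conf/config.xml; return the archived relative paths."""
    home = Path(oiconnector_home)
    candidates = sorted((home / "logs").glob("*")) + [home / "conf" / "config.xml"]
    added = []
    with tarfile.open(str(bundle_path), "w:gz") as bundle:
        for path in candidates:
            if path.is_file():  # skip sub-directories and missing files
                arcname = path.relative_to(home).as_posix()
                bundle.add(str(path), arcname=arcname)
                added.append(arcname)
    return added
```

For example, `collect_support_bundle("/opt/CA/OIConnector", "/tmp/oiconnector-support.tar.gz")` returns the list of files placed in the archive, which makes it easy to spot a missing log.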
If you are using DX OI On Premise (20.x, 21.x):
a) cluster and pods status:
kubectl get pods -n<namespace>
kubectl describe nodes -n<namespace>
kubectl get events -n<namespace>
b) from ElasticSearch
- collect result of below queries:
http(s)://{es_endpoint}/_cat/indices/*capm*?v
http(s)://{es_endpoint}/_cat/indices/?v&s=ss:desc&h=health,store.size,pri.store.size,pri,rep,store.size,pri.store.size,docs.count,docs.deleted,index,cds
http(s)://{es_endpoint}/_cluster/health?pretty&human
For more details, refer to: https://knowledge.broadcom.com/external/article/207215
- result of : df -h
c) from NFS server
- result of : df -h