Symptoms:
- Automatic incident creation is not working
- Autoclosure is not working
Environment:
DX Operational Intelligence 1.3.x, 20.x
DX Application Performance Management 11.x, 20.x
Cause:
1) Elasticsearch running out of memory
2) Kafka-to-ZooKeeper connectivity issues
CHECKLIST
STEP #1 : Jarvis (kafka, zookeeper, elasticSearch)
STEP #2 : Check Kafka consumer groups to find out whether there is any problem or lag in processing messages
STEP #1 : Jarvis (kafka, zookeeper, elasticSearch)
DX AIOps - Jarvis (kafka, zookeeper, elasticSearch) Troubleshooting
https://knowledge.broadcom.com/external/article/189119
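Since one of the causes is Elasticsearch running out of memory, a quick heap check can complement the Jarvis article. The sketch below is only an illustration: it assumes the Elasticsearch `_cat/nodes` API is reachable, and the function name `flag_high_heap` and the default threshold of 85 are hypothetical choices, not values from the product documentation.

```shell
#!/bin/sh
# Flag Elasticsearch nodes whose JVM heap usage is at or above a threshold.
# Input (stdin): lines of "<node-name> <heap.percent>", as printed by
#   GET _cat/nodes?h=name,heap.percent
# The default threshold (85) is an assumption, not a documented limit.
flag_high_heap() {
  threshold="${1:-85}"
  awk -v t="$threshold" '
    # Only consider rows whose second field is a plain number.
    $2 ~ /^[0-9]+$/ && $2 + 0 >= t + 0 { print $1 " heap=" $2 "%"; hot++ }
    # Exit status 1 if at least one node was flagged, 0 otherwise.
    END { exit (hot > 0 ? 1 : 0) }
  '
}
```

A possible invocation, where the host and port are placeholders for your own Elasticsearch endpoint: `curl -s "http://<es-host>:9200/_cat/nodes?h=name,heap.percent" | flag_high_heap 85`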
STEP #2 : Check the following Kafka consumer groups to find out whether there is any problem or lag in processing messages:
incidentmanagement
incidentmanagement_<#>
1) List all the consumer groups:
If OI 20.2:
/opt/ca/kafka/bin/kafka-consumer-groups.sh --bootstrap-server jarvis-kafka:9092,jarvis-kafka-2:9092,jarvis-kafka-3:9092 --list | grep -i incident
If OI 1.3.x:
/opt/ca/kafka/bin/kafka-consumer-groups.sh --bootstrap-server kafka:9092,kafka-2:9092,kafka-3:9092 --list | grep -i incident
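The two list commands differ only in the broker names, so they can be wrapped in a small helper. This is a minimal sketch, not part of the product: the function names `bootstrap_servers` and `list_incident_groups` are hypothetical, and the version-to-broker mapping simply mirrors the two cases shown above.

```shell
#!/bin/sh
# Path taken from the commands above; can be overridden via the environment.
KAFKA_CG="${KAFKA_CG:-/opt/ca/kafka/bin/kafka-consumer-groups.sh}"

# Map an OI version to the broker list used in this article.
bootstrap_servers() {
  case "$1" in
    20.2*) echo "jarvis-kafka:9092,jarvis-kafka-2:9092,jarvis-kafka-3:9092" ;;
    1.3.*) echo "kafka:9092,kafka-2:9092,kafka-3:9092" ;;
    *)     echo "no broker list known for OI version: $1" >&2; return 1 ;;
  esac
}

# List the consumer groups whose name contains "incident" (case-insensitive).
list_incident_groups() {
  servers=$(bootstrap_servers "$1") || return 1
  "$KAFKA_CG" --bootstrap-server "$servers" --list | grep -i incident
}
```

For example, `list_incident_groups 1.3.2` runs the same command as the OI 1.3.x case above.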
2) Describe the incidentmanagement consumer group:
If OI 20.2:
/opt/ca/kafka/bin/kafka-consumer-groups.sh --bootstrap-server jarvis-kafka:9092,jarvis-kafka-2:9092,jarvis-kafka-3:9092 --describe --group incidentmanagement
If OI 1.3.2 (the numeric suffix of the group name varies; use the name returned by the list command in 1)):
/opt/ca/kafka/bin/kafka-consumer-groups.sh --bootstrap-server kafka:9092,kafka-2:9092,kafka-3:9092 --describe --group incidentmanagement_1026105401
Result: WARN Connection to node <xxx> could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
The warning message above, together with the rest of the command output, indicates that: a) there is a Kafka-to-ZooKeeper connectivity issue, b) there is lag, and c) there are no consumers.
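To read lag and missing consumers from the describe output more quickly, the output can be summarized with awk. The following is a rough sketch under stated assumptions: the function name `summarize_lag` is hypothetical, and it assumes the output contains the standard PARTITION, LAG and CONSUMER-ID column headers, with "-" shown in CONSUMER-ID when no consumer is assigned to a partition.

```shell
#!/bin/sh
# Summarize `kafka-consumer-groups.sh --describe` output read from stdin.
# Columns are located by header name because some Kafka releases print a
# leading GROUP column and others do not.
summarize_lag() {
  awk '
    # Header row: remember which fields hold PARTITION, LAG and CONSUMER-ID.
    /LAG/ && /CONSUMER-ID/ {
      for (i = 1; i <= NF; i++) {
        if ($i == "PARTITION")   part_col = i
        if ($i == "LAG")         lag_col  = i
        if ($i == "CONSUMER-ID") cid_col  = i
      }
      next
    }
    # Data rows: a numeric PARTITION field filters out warnings and notices.
    part_col && $part_col ~ /^[0-9]+$/ {
      partitions++
      if ($lag_col ~ /^[0-9]+$/) total += $lag_col
      if ($cid_col == "-") unassigned++
    }
    END {
      printf "partitions=%d total_lag=%d without_consumer=%d\n",
             partitions, total, unassigned
    }
  '
}
```

It can be fed from the describe command for your version, for example by appending `2>/dev/null | summarize_lag` to it. A non-zero `total_lag` that keeps growing between runs, or a `without_consumer` count equal to `partitions`, would match conditions b) and c) above.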
DX OI - Troubleshooting, Common Issues and Best Practices
https://knowledge.broadcom.com/external/article/190815/dx-oi-troubleshooting-common-issues-and.html