Kafka stops during multi-node install

Article ID: 194868

Products

CA App Experience Analytics

Issue/Introduction

 

When the installation is run, it is started on SERVER3 (ES, K, J) first and then, about a minute later, on SERVER4 and SERVER5 (ES, K) together.

Whichever of SERVER4 and SERVER5 is started first completes with the Kafka server down, and the other stays at STATUS:...... until the node that failed is stopped and started again, which resolves the problem.

 

Environment

Release : 17.3

Component : APP EXPERIENCE ANALYTICS ENGINE

Resolution

 

Here are some suggestions to investigate this sort of problem:

1)

Check the Kafka log (server.log) and the ZooKeeper log (zookeeper.log), both under /opt/ca/jarvis/logs/.

They should show what the issue is and why Kafka starts and then stops.
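As a rough aid for the log check above, the following Python sketch pulls the most recent ERROR and FATAL lines out of a log file. The function name `find_error_lines`, the level pattern, and the `limit` default are illustrative assumptions and not part of the product; the log format itself is not documented here, so adjust the pattern to what the files actually contain.

```python
import re
from pathlib import Path

# Assumption: problem lines contain the words ERROR or FATAL.
LEVEL_PATTERN = re.compile(r"\b(ERROR|FATAL)\b")


def find_error_lines(log_path, pattern=LEVEL_PATTERN, limit=50):
    """Return the last `limit` matching lines as (line_number, text) pairs.

    A missing file yields an empty list instead of raising, so the helper
    can be pointed at both logs without extra checks.
    """
    path = Path(log_path)
    if not path.is_file():
        return []
    matches = []
    # errors="replace" keeps the scan going if the log has odd bytes.
    with path.open(errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                matches.append((number, line.rstrip("\n")))
    return matches[-limit:]
```

It could be called once per file, for example `find_error_lines("/opt/ca/jarvis/logs/server.log")` and `find_error_lines("/opt/ca/jarvis/logs/zookeeper.log")`, and the returned lines read in order around the time Kafka stopped.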

 

2)

This has been seen regularly on a one-node install. Kafka needs to be restarted manually to make sure it runs; once that is done, it usually continues to run.

In that case, Kafka had initially stopped running because of insufficient disk space.
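Because low disk space was one observed cause, a quick free-space check before restarting Kafka may help. This is a minimal sketch using Python's standard `shutil.disk_usage`; the helper name `free_space_report` and the threshold passed to it are hypothetical, since the article does not state how much space Kafka requires.

```python
import shutil


def free_space_report(path, min_free_bytes):
    """Report free space on the filesystem holding `path`.

    `min_free_bytes` is a caller-chosen threshold (an assumption, not a
    documented product requirement).
    """
    usage = shutil.disk_usage(path)
    return {
        "path": path,
        "total_bytes": usage.total,
        "free_bytes": usage.free,
        "ok": usage.free >= min_free_bytes,
    }
```

For example, `free_space_report("/opt/ca/jarvis", 5 * 1024**3)` would flag the volume if less than 5 GiB is free; pick a threshold that suits the data volume of the installation.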

3)

Possibly the Kafka topic did not get created. It may be a Kafka/ZooKeeper unique ID issue.

Typical symptom: a ZooKeeper error such as "Consumer not on the list of brokers", or an error related to broker IDs.

 

The Jarvis installer modifies the zookeeper.properties file, and if Jarvis detects an incorrect IP address, the broker IDs in this file might get overwritten.

You need to manually make sure the broker IDs are in sync with each other in three ZooKeeper/Kafka files:

 

 

1. /jarvis/kafka_2.11-0.10.1.0/config/kafka_data/zookeeper/myid: the ID number stored in this file

2. /jarvis/kafka_2.11-0.10.1.0/config/server.properties: the broker.id parameter must have the same value as the myid file, for example broker.id=16867130

3. /jarvis/kafka_2.11-0.10.1.0/config/zookeeper.properties: the list of brokers must contain the value from the two files above, together with the host name:

 

server.16867130=axa173.localdomain:2888:3888

.... etc

 

After the zookeeper.properties file matches the unique broker IDs in the myid file and the server.properties file on each node, restart ZooKeeper/Kafka.
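The comparison of the three files can be sketched in Python as follows. The function names `parse_properties` and `check_ids` are illustrative, and the sketch assumes plain `key=value` lines with `#` comments; it only reports mismatches and does not change any file.

```python
import re


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_ids(myid_text, server_props_text, zookeeper_props_text):
    """Return a list of problems found; an empty list means the IDs agree."""
    problems = []
    myid = myid_text.strip()
    broker_id = parse_properties(server_props_text).get("broker.id")
    zk_props = parse_properties(zookeeper_props_text)
    # Collect the IDs from entries of the form server.<id>=<host>:<port>:<port>
    ensemble_ids = {
        key[len("server."):]
        for key in zk_props
        if re.fullmatch(r"server\.\d+", key)
    }
    if broker_id is None:
        problems.append("broker.id is not set in server.properties")
    elif broker_id != myid:
        problems.append(f"myid ({myid}) differs from broker.id ({broker_id})")
    if myid not in ensemble_ids:
        problems.append(f"no server.{myid}= entry in zookeeper.properties")
    return problems
```

On each node, read the contents of the myid, server.properties, and zookeeper.properties files listed above and pass them to `check_ids`; any returned message points at the file that needs correcting before the restart.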

The installer constructs the IDs in those files based on IP addresses. Make sure the hostname and the /etc/hosts file are properly configured on each node, that the IP addresses used are the correct ones, and that, if an FQDN is used as the hostname, it is listed first against the IP address in /etc/hosts.
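The /etc/hosts ordering rule can be checked with a small Python sketch like the one below. The helper names `hosts_entries` and `fqdn_is_first` are hypothetical, and the IP address and FQDN passed in are whatever the node is expected to use.

```python
def hosts_entries(hosts_text):
    """Return (ip_address, [names...]) pairs from hosts-file text."""
    entries = []
    for raw in hosts_text.splitlines():
        # Drop trailing comments, then split on whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            entries.append((fields[0], fields[1:]))
    return entries


def fqdn_is_first(hosts_text, ip_address, fqdn):
    """True if the first entry for `ip_address` lists `fqdn` as its first name."""
    for address, names in hosts_entries(hosts_text):
        if address == ip_address:
            return names[0].lower() == fqdn.lower()
    # No entry for this address at all.
    return False
```

A call such as `fqdn_is_first(open("/etc/hosts").read(), node_ip, node_fqdn)` returning False suggests that the entry is missing or that a short name precedes the FQDN.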