The Jarvis Schema Registry and Kafka are not running. Restarting Jarvis with ./startServices -j starts the Jarvis Schema Registry, but it dies after only a few seconds.
Starting Kafka with ./startServices -k brings Kafka up, but it also dies after only a few seconds.
./healthCheck.sh shows that the following processes are X--DOWN--X:
Jarvis Schema Registry
Kafka Server
The Zookeeper Server shows --RUNNING-> but has no pid.
All other processes are running.
The schema-registry.log shows a timeout trying to connect to the zookeeper process:
[2020-06-25 14:30:57,297] ERROR Server died unexpectedly: (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain:51)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server 'vPIC481S0135:2181' with timeout of 30000 ms
The kafka log shows a similar error:
[2020-06-25 14:30:59,447] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 6000
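Both timeouts suggest that nothing is listening on the ZooKeeper client port. A quick way to confirm this independently of the health check is to attempt a plain TCP connection. The following is a minimal sketch, not part of the product tooling; the function name zookeeper_port_open is hypothetical, and the host and port should be taken from the error message (here 'vPIC481S0135:2181').

```python
import socket


def zookeeper_port_open(host, port=2181, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only checks that something accepts connections on the port;
    it does not verify that the listener is a healthy ZooKeeper server.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, zookeeper_port_open("vPIC481S0135") returning False would be consistent with the ZooKeeper process having exited even though the health check reports it as running.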
The zookeeper.log shows that the process has failed:
[2020-06-25 14:30:42,530] ERROR Unexpected exception, exiting abnormally (org.apache.zookeeper.server.ZooKeeperServerMain)
java.io.EOFException
Application Experience Analytics (AXA) 17.3.2
This is most likely a known Apache ZooKeeper bug:
https://stackoverflow.com/questions/44217654/how-to-recover-zookeeper-from-java-io-eofexception-after-a-server-crash
The root cause of the problem is that a log file from a prior run of ZooKeeper was written with an incomplete header.
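To illustrate what an incomplete header means in practice, the sketch below reads the start of a transaction log file and checks it. It assumes the header layout used by the ZooKeeper 3.4.x file transaction log: a 4-byte magic value "ZKLG", a 4-byte version, and an 8-byte database id, 16 bytes in total. The function name has_complete_header is hypothetical, and the check is deliberately shallow: it does not validate the version or any transaction records.

```python
HEADER_SIZE = 16          # assumed layout: 4-byte magic + 4-byte version + 8-byte dbid
TXNLOG_MAGIC = b"ZKLG"    # magic value at the start of a ZooKeeper transaction log


def has_complete_header(path):
    """Return True if the file starts with a full 16-byte header and the expected magic."""
    with open(path, "rb") as fh:
        header = fh.read(HEADER_SIZE)
    # A zero-byte or truncated file yields fewer than HEADER_SIZE bytes.
    return len(header) == HEADER_SIZE and header[:4] == TXNLOG_MAGIC
```

A file of 0 bytes fails this check immediately, which matches the case described in the workaround.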
The workaround is to delete the offending log file:
Check the log.xxx files under the ZooKeeper data folder, /opt/ca/aoPlatform/jarvis/kafka_2.11-0.10.1.0/kafka_data/zookeeper/version-2, for any file with a size of 0 bytes. If one exists, delete it and restart the AXA processes.
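The check for 0-byte files can be scripted. The following is a minimal sketch that only lists candidates and deletes nothing, so the result can be reviewed first. The function name find_empty_txn_logs is hypothetical; the default path is the one given above and may differ on other installations.

```python
from pathlib import Path

# Data folder from this article; adjust if your installation path differs.
ZK_DATA_DIR = "/opt/ca/aoPlatform/jarvis/kafka_2.11-0.10.1.0/kafka_data/zookeeper/version-2"


def find_empty_txn_logs(data_dir=ZK_DATA_DIR):
    """Return the log.* files in data_dir whose size is 0 bytes, sorted by name."""
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    return sorted(
        p for p in folder.glob("log.*")
        if p.is_file() and p.stat().st_size == 0
    )
```

Calling find_empty_txn_logs() on the AXA host returns the files to remove before restarting the AXA processes. An empty list means no 0-byte log file is present in that folder.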