Kafka connection failures in Upgrade Automation run to upgrade Topology

Article ID: 438313

Products

Network Observability CA Performance Management

Issue/Introduction

An Upgrade Automation (UA) based Portal Topology upgrade fails when run with the following command.

ansible-playbook -i inventory -K topology-upgrade.yaml

 

Running the command again with verbose CLI logging (-vvv) and redirecting the output to a log file named upgrade.txt (> upgrade.txt) captures more detail.

ansible-playbook -vvv -i inventory -K topology-upgrade.yaml > upgrade.txt

 

The resulting upgrade.txt file contains entries like the following.

 

TASK [Fail if Kafka is not accessible after retries] ***************************
task path: /opt/CA/installer/NetOps-Topology-25.4.4-Linux-RELEASE/provisioning/roles/netops_kafka/tasks/kafka_topics_update_tasks.yaml:46
[ERROR]: Task failed: Action failed: Kafka broker is not accessible at <Kafka_Host_IP>:9092 after 10 retries. Please ensure Kafka is running.
Origin: /opt/CA/installer/NetOps-Topology-25.4.4-Linux-RELEASE/provisioning/roles/netops_kafka/tasks/kafka_topics_update_tasks.yaml:46:3
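
A verbose log can be long, so it may help to pull out only the failing tasks. The following is a minimal sketch, not part of the product tooling: failed_tasks is a hypothetical helper name, and it assumes the log uses the "TASK [" and "[ERROR]" line prefixes shown in the excerpt above.

```shell
# Print each "[ERROR]" line from an Ansible log, preceded by the most
# recent "TASK [...]" header seen before it.
# Usage: failed_tasks upgrade.txt
failed_tasks() {
    awk '
        /^TASK \[/   { task = $0 }            # remember the current task header
        /^\[ERROR\]/ { print task; print $0 } # report the task and its error
    ' "$1"
}
```

For example, failed_tasks upgrade.txt would list the "Fail if Kafka is not accessible after retries" task followed by its error line.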

 

Checking the Kafka server shows that the two expected Kafka services, netops-kafka and netops-kafka-zookeeper, are both active and running.

Checking what is listening on Kafka port 9092 shows that it is bound to the loopback IP address instead of the server's real IP address.

[root@<KafkaHost> ~]# netstat -lntup | grep 9092
tcp6       0      0 127.0.0.1:9092          :::*                    LISTEN      406670/java

Environment

All supported Network Observability DX NetOps Portal Topology installations

Cause

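The /etc/hosts file on the Kafka server host is incorrectly configured. The loopback line was found configured as follows, with the host's FQDN and short host name mapped to 127.0.0.1, so Kafka port 9092 ends up bound to the loopback address instead of the server's real IP address.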

  • 127.0.0.1 <FQDN> <ShortHostName> localhost localhost.localdomain localhost4 localhost4.localdomain4
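
To check a host for this condition, a small helper can list any names on the 127.0.0.1 line that are not localhost aliases. This is a minimal sketch under stated assumptions: loopback_extra_names is a hypothetical function name, and it treats every name beginning with "localhost" as a legitimate loopback alias.

```shell
# Print names mapped to 127.0.0.1 in a hosts file that do not begin
# with "localhost". No output means the loopback line looks clean.
# Usage: loopback_extra_names [hosts_file]   (defaults to /etc/hosts)
loopback_extra_names() {
    awk '
        { sub(/#.*/, "") }                 # strip trailing comments
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++)      # walk the names on the line
                if ($i !~ /^localhost/) print $i
        }
    ' "${1:-/etc/hosts}"
}
```

On an affected host, loopback_extra_names would print the FQDN and short host name. Running getent hosts "$(hostname -f)" is another way to see which address the host name currently resolves to.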

Resolution

Edit the /etc/hosts file on the Kafka server host so that the loopback entry contains only the localhost names, as follows:

  • 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

Save the changes.

Once completed, stop and restart the Kafka services (netops-kafka and netops-kafka-zookeeper).

Confirm that Kafka port 9092 is now bound to the host's real IP address rather than 127.0.0.1.
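
The confirmation can be scripted by feeding the listener output through a small filter. This is a minimal sketch, not product tooling: check_kafka_listener is a hypothetical name, and it assumes the local address is the fourth column, as in the netstat output shown earlier (ss -lnt uses the same column position).

```shell
# Read "netstat -lnt" or "ss -lnt" style output on stdin and report
# where the given port is bound.
# Exit status: 0 = bound to a non-loopback address,
#              1 = bound to a loopback address,
#              2 = no listener found on that port.
check_kafka_listener() {
    awk -v port="${1:-9092}" '
        $4 ~ (":" port "$") {
            found = 1
            addr = $4
            sub(/:[0-9]+$/, "", addr)      # drop the trailing :port
            if (addr ~ /127\.[0-9]+\.[0-9]+\.[0-9]+/ ||
                addr == "::1" || addr == "[::1]")
                loopback = 1
            print "port " port " bound to " addr
        }
        END {
            if (!found)   exit 2
            if (loopback) exit 1
        }
    '
}
```

For example, netstat -lnt | check_kafka_listener 9092 returns status 1 while the port is still tied to 127.0.0.1, and status 0 once it is bound to the real IP address.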

Rerun the topology upgrade.