Attempting to create an Aria Operations for Networks cluster fails during 'Configuring Roles' and 'Installing HDFS'

Article ID: 430792

Updated On:

Products

VCF Operations/Automation (formerly VMware Aria Suite)
VCF Operations for Networks

Issue/Introduction

  • Attempting to create or expand an Aria Operations for Networks cluster fails at the 'Configuring Roles' and 'Installing HDFS' stage.

  • The clustering.log file shows the following error:

2026-02-11T21:32:03.569Z ERROR create_cluster.py MainThread create_cluster.py:196 Failed executing command errcode[1536] command:sudo ssh -q -i /home/support/.ssh/id_rsa_vnera_cluster_keypair -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null  [email protected] "sudo -S su ubuntu -c 'hdfs zkfc -formatZK -force'"
2026-02-11T21:32:03.569Z ERROR create_cluster.py MainThread create_cluster.py:199 Exiting.. cluster not configured
2026-02-11T21:32:03.575Z ERROR create_cluster.py MainThread create_cluster.py:209 cluster.constants.FAILURE_DESCRIPTION

2026-02-11T21:32:03.717Z ERROR create_cluster_from_ovas.py MainThread create_cluster_from_ovas.py:57 Got exception
Traceback (most recent call last):
  File "/home/ubuntu/build-target/deploymentmanager/create_cluster_from_ovas.py", line 55, in main
    create_cluster.start(options.config, options.ips, options.vip, is_enterprise, new_def_id.strip())
  File "/home/ubuntu/build-target/deploymentmanager/cluster/create_cluster.py", line 224, in start
    raise err
  File "/home/ubuntu/build-target/deploymentmanager/cluster/create_cluster.py", line 178, in start
    step_manager.start()
  File "/home/ubuntu/build-target/deploymentmanager/cluster/steps/step_manager.py", line 20, in start
    s.start()
  File "/home/ubuntu/build-target/deploymentmanager/cluster/steps/cluster_step.py", line 44, in start
    raise e
  File "/home/ubuntu/build-target/deploymentmanager/cluster/steps/cluster_step.py", line 39, in start
    self._execute()
  File "/home/ubuntu/build-target/deploymentmanager/cluster/steps/configure_roles.py", line 57, in _execute
    services_reconf.update(self._apply_hdfs_role())
  File "/home/ubuntu/build-target/deploymentmanager/cluster/steps/configure_roles.py", line 240, in _apply_hdfs_role
    msg_desc).apply()
  File "/home/ubuntu/build-target/deploymentmanager/cluster/roles/role_applier.py", line 55, in apply
    raise e
  File "/home/ubuntu/build-target/deploymentmanager/cluster/roles/role_applier.py", line 45, in apply
    self._execute()
  File "/home/ubuntu/build-target/deploymentmanager/cluster/roles/hdfs_role.py", line 30, in _execute
    self.install_hdfs()
  File "/home/ubuntu/build-target/deploymentmanager/cluster/roles/hdfs_role.py", line 223, in install_hdfs
    raise e
  File "/home/ubuntu/build-target/deploymentmanager/cluster/roles/hdfs_role.py", line 133, in install_hdfs
    self._install_hdfs_nn(n, format_zk)
  File "/home/ubuntu/build-target/deploymentmanager/cluster/roles/hdfs_role.py", line 236, in _install_hdfs_nn
    utils.exec_remote(connection, ip, "hdfs zkfc -formatZK -force")
  File "/home/ubuntu/build-target/deploymentmanager/utils.py", line 546, in exec_remote
    return exec_cmd(_get_cmd(connection, address, cmd), num_retries)
  File "/home/ubuntu/build-target/deploymentmanager/utils.py", line 528, in exec_cmd
    return exec_ignorable(cmd, False, num_retries)
  File "/home/ubuntu/build-target/deploymentmanager/utils.py", line 486, in exec_ignorable
    raise Exception("Failed executing command errcode[%d] command:%s" % (code, cmd_log))
Exception: Failed executing command errcode[1536] command:sudo ssh -q -i /home/support/.ssh/id_rsa_vnera_cluster_keypair -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null  [email protected] "sudo -S su ubuntu -c 'hdfs zkfc -formatZK -force'"
2026-02-11T21:32:03.718Z INFO create_cluster_from_ovas.py MainThread create_cluster_from_ovas.py:65 Total Time Taken 720.955 sec
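
The errcode[1536] value in the log above is a multiple of 256, which is consistent with a raw POSIX wait status, where the exit status sits in the high byte. The following is a minimal sketch for decoding it, assuming (the log does not confirm this) that the deployment script reports the wait status unmodified. The function name decode_wait_status is hypothetical.

```shell
#!/usr/bin/env bash
# decode_wait_status STATUS
# Prints the exit status stored in the high byte of a 16-bit wait status.
# Assumption: the process exited normally (low byte is 0), and the logged
# errcode is a raw wait status rather than a plain exit code.
decode_wait_status() {
    echo $(( ($1 >> 8) & 255 ))
}
```

Under that assumption, running decode_wait_status 1536 prints 6. In Apache Hadoop, the ZKFC process appears to use exit code 6 when it cannot connect to ZooKeeper, which would agree with the cause described in this article.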

Environment

Aria Operations for Networks 6.x

Cause

This issue occurs because port 2181 is not open between the platform nodes. This port is required for cluster setup and for communication with the ZooKeeper servers on the other nodes. The failing command, hdfs zkfc -formatZK, must reach ZooKeeper on this port.

Resolution

Revert or restore the Aria Operations for Networks environment to the healthy state it was in before the clustering process began.

Open port 2181 between the platform nodes, then re-run the cluster creation or scale-out process.

Additional Information

To test port connectivity between platform nodes, run the following command from one platform node, replacing <host> with the address of another platform node:

nc -zv <host> 2181
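
When there are several platform nodes, the same check can be looped over each peer. The following is a minimal sketch rather than a supported tool: the function names probe and check_nodes are hypothetical, the 5-second timeout is an assumed value, and the node addresses must be supplied by the operator.

```shell
#!/usr/bin/env bash
# Sketch: check TCP port 2181 from this node to each peer platform node.
# Function names and the timeout are illustrative assumptions.

ZK_PORT=2181

# probe HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened.
# -z: connect without sending data; -w 5: give up after 5 seconds.
probe() {
    nc -z -w 5 "$1" "$2" >/dev/null 2>&1
}

# check_nodes HOST...
# Prints one result line per host; returns 1 if any host could not be reached.
check_nodes() {
    local failed=0 host
    for host in "$@"; do
        if probe "$host" "$ZK_PORT"; then
            echo "$host:$ZK_PORT open"
        else
            echo "$host:$ZK_PORT blocked or unreachable"
            failed=1
        fi
    done
    return "$failed"
}
```

After sourcing the script on a platform node, call check_nodes with the addresses of the other platform nodes, for example check_nodes <host1> <host2>. Repeat from each node, because a firewall rule may block the port in only one direction.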