Clean up evicted cluster node configuration left-overs

Article ID: 325872

Products

VMware Aria Suite

Issue/Introduction

Symptoms:
  • Adding a new vRealize Automation node to an existing cluster fails. Error messages similar to the following are seen when running the vracli cluster join command:
[ERROR] Error executing k8s cluster join command.
Traceback (most recent call last):
  File "/opt/python-modules/vracli/commands/cluster.py", line 203, in join_handler
    result = subprocess.check_output(join_cmd.split(), stderr=subprocess.STDOUT)
  File "/usr/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['kubeadm', 'join', 'vra-k8s.local:6443', '--token', '41cmbo.vrafuvakvfvmsy5c', '--discovery-token-ca-cert-hash', 'sha256:419084277905e17b5b5f9f3659609a5e63349cb94874fa782e3ee2509bfb3900', '--experimental-control-plane', '--v', '5']' returned non-zero exit status 1.
Error executing k8s cluster join command


Environment

VMware vRealize Automation 8.x

Cause

Incomplete configuration from a previously removed node prevents adding new nodes to the cluster.
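
As a quick check before applying the workaround, you can compare the node list Kubernetes reports with the etcd member list; an etcd member with no matching node is a likely left-over. A minimal sketch, using the etcd_utils.sh helper referenced in the workaround below:

# Nodes Kubernetes currently knows about
kubectl get nodes
# etcd members; a member without a matching node above is stale
source /opt/scripts/etcd_utils.sh
_etcdctl member list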

Resolution

This is a known issue.

Workaround:
  1. Log in via SSH to one of the nodes in the cluster.
  2. Execute the following commands, replacing <etcd_member_id> with the ID of the evicted node's member as shown in the member list output (an illustrative example appears after this procedure):
source /opt/scripts/etcd_utils.sh
_etcdctl member list
_etcdctl member remove <etcd_member_id>
kubectl edit configmaps -n kube-system kubeadm-config
  3. Delete the evicted node's entry from the apiEndpoints section:
Example:
evicted-node.domain.com:
  advertiseAddress: 10.71.225.115
  bindPort: 6443
  4. Save the changes and exit the editor (vi syntax): press the ESC key and type:
:wq
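
For reference, a sketch of the procedure with hypothetical member IDs, hostnames, and addresses. The member list output is expected to look similar to the following; the evicted node can be identified by its hostname and advertise address, and its member ID is the first column:

_etcdctl member list
1a2b3c4d5e6f7a8b, started, node1.domain.com, https://10.71.225.113:2380, https://10.71.225.113:2379
9f8e7d6c5b4a3f2e, started, evicted-node.domain.com, https://10.71.225.115:2380, https://10.71.225.115:2379

# Remove the left-over member using the ID from the first column
_etcdctl member remove 9f8e7d6c5b4a3f2e

In the kubeadm-config ConfigMap, the entry to delete typically sits under the apiEndpoints map of the ClusterStatus key, for example:

ClusterStatus: |
  apiEndpoints:
    node1.domain.com:
      advertiseAddress: 10.71.225.113
      bindPort: 6443
    evicted-node.domain.com:
      advertiseAddress: 10.71.225.115
      bindPort: 6443

Once the stale member is removed and the entry deleted, retry the vracli cluster join command on the new node.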