Aria Automation service fails to load with error "The connection to the server vra-k8s.local:6443 was refused - did you specify the right host or port?"

Article ID: 374846

Products

VMware Aria Suite

Issue/Introduction

The Aria Automation UI fails to load, and checking the Kubernetes node or service status returns the error "The connection to the server vra-k8s.local:6443 was refused - did you specify the right host or port?"

The kubelet service fails to come online (Active) after a node reboot or a Kubernetes reinitialization.
Its status reports: "exit status is 255"

The output of journalctl -xeu kubelet contains entries similar to the following:

Aug 14 01:25:40 applianceFQDN.vmware.com kubelet[5669]: F0126 01:25:40.942105 5669 server.go:266] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
Aug 14 01:25:40 applianceFQDN.vmware.com kubelet[5669]: E0126 01:25:40.941998 5669 bootstrap.go:264] Part of the existing bootstrap client certificate is expired
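
To confirm these symptoms from a shell on the appliance, a check along the following lines may help. This is only a minimal sketch: the helper name `has_expired_bootstrap_cert` is hypothetical, and the match strings are taken from the sample log entries above, so they may differ between releases.

```shell
#!/bin/sh
# Hypothetical helper: reads kubelet journal text on stdin and succeeds if it
# contains either of the two messages shown in the sample log entries.
has_expired_bootstrap_cert() {
    grep -Eq 'bootstrap client certificate is expired|unable to load bootstrap kubeconfig'
}

# Example invocation on the appliance (left commented out so that sourcing
# this file has no side effects):
# systemctl is-active kubelet
# journalctl -u kubelet --no-pager | has_expired_bootstrap_cert \
#     && echo "kubelet certificate problem found in the journal"
```

The function only inspects text, so it can also be run against a saved copy of the journal.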

Environment

VMware Aria Automation 8.x
VMware Aria Automation Orchestrator 8.x

Cause

This issue is caused by the kubelet service certificate expiring after one year.
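
Whether the certificate has actually expired can be checked with openssl. The sketch below assumes the kubelet client certificate is stored at the standard kubeadm location, /var/lib/kubelet/pki/kubelet-client-current.pem; that path is an assumption and is not stated in this article, so adjust it if the appliance differs. The helper name `cert_status` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints "missing", "unreadable", "expired" or "valid"
# for the PEM certificate file given as $1.
cert_status() {
    cert="$1"
    [ -r "$cert" ] || { echo "missing"; return 0; }
    # First make sure the file parses as an X.509 certificate at all.
    openssl x509 -noout -in "$cert" >/dev/null 2>&1 || { echo "unreadable"; return 0; }
    # -checkend 0 exits 0 only if the certificate is still valid right now.
    if openssl x509 -noout -checkend 0 -in "$cert" >/dev/null 2>&1; then
        echo "valid"
    else
        echo "expired"
    fi
}

# Example (assumed path, see above):
# cert_status /var/lib/kubelet/pki/kubelet-client-current.pem
# openssl x509 -noout -enddate -in /var/lib/kubelet/pki/kubelet-client-current.pem
```

The second example command prints the certificate's notAfter date, which can be compared with the date the appliance was deployed or last reinitialized.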

Resolution

Follow the steps below that match your deployment type.

Single VA Deployment

Take a snapshot of the vRA VM.
Locate an etcd backup at /data/etcd-backup/ and copy the selected backup to /root
Reset Kubernetes by running vracli cluster leave
Restore the etcd backup in /root using the command that matches the Aria Automation version.
Examples:
8.0 - 8.8: # /opt/scripts/recover_etcd.sh --confirm /root/backup-12345
8.12 and newer: # vracli etcd restore --local --confirm /root/backup-123456789.db; systemctl start etcd

Extract VA config from etcd with

kubectl get vaconfig -o yaml | tee /root/vaconfig.yaml

Reset Kubernetes once again using
```
vracli cluster leave
```

Install the VA config by running

kubectl apply -f /root/vaconfig.yaml --force

Run vracli license to confirm that VA config is installed properly.
Run the deploy script
```
/opt/scripts/deploy.sh
```
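
The restore step depends on the product version, so a small helper that prints the matching command can reduce the chance of running the wrong one. The following is a sketch only: the function name `etcd_restore_cmd` is hypothetical, it prints the command rather than executing it, and it deliberately refuses versions 8.9 through 8.11 because this article does not say which command applies to them.

```shell
#!/bin/sh
# Hypothetical helper: prints the etcd restore command listed in this article
# for a given Aria Automation version (for example 8.8.2) and backup path.
# It does not execute anything.
etcd_restore_cmd() {
    version="$1"
    backup="$2"
    case "$version" in
        8.[0-9]*) ;;
        *) echo "unsupported version: $version" >&2; return 1 ;;
    esac
    rest=${version#*.}      # drop the leading "8."
    minor=${rest%%.*}       # keep only the minor number
    case "$minor" in
        ''|*[!0-9]*) echo "cannot parse version: $version" >&2; return 1 ;;
    esac
    if [ "$minor" -le 8 ]; then
        echo "/opt/scripts/recover_etcd.sh --confirm $backup"
    elif [ "$minor" -ge 12 ]; then
        echo "vracli etcd restore --local --confirm $backup; systemctl start etcd"
    else
        # 8.9 - 8.11 are not covered by the article, so do not guess.
        echo "version $version is not covered by this article" >&2
        return 1
    fi
}
```

For example, `etcd_restore_cmd 8.12.0 /root/backup-123456789.db` prints the `vracli etcd restore` form shown in the examples above, which can then be reviewed before it is run.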

Clustered VA Deployment with 3 Nodes

Take snapshots of all 3 nodes.
Designate one of the nodes as the primary node. On the primary node, locate an etcd backup at /data/etcd-backup/ and copy it to /root.
Reset each node with
```
vracli cluster leave
```
On the primary node, restore the etcd backup copied to /root using the command that matches the Aria Automation version

Example:

8.0 - 8.8: # /opt/scripts/recover_etcd.sh --confirm /root/backup-12345

8.12 and newer: # vracli etcd restore --local --confirm /root/backup-123456789.db; systemctl start etcd

Extract VA config from etcd with

kubectl get vaconfig -o yaml | tee /root/vaconfig.yaml

Reset the node once again with
```
vracli cluster leave
```

Install VA config with

kubectl apply -f /root/vaconfig.yaml --force

Run vracli license to confirm that VA config is installed properly.

Note: vracli license is not applicable for vRO and CExP installations.

Join the other 2 nodes to the cluster by running the following command on each of them
```
vracli cluster join [primary-node] --preservedata
```
Run /opt/scripts/deploy.sh from the primary node
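
In both procedures the extracted /root/vaconfig.yaml is applied only after Kubernetes has been reset a second time, so it is worth confirming that the file was written correctly before that second reset. The following is a minimal sketch; the helper name `vaconfig_looks_valid` is hypothetical, and the check is only a basic sanity test (non-empty file with a top-level `kind:` field, as produced by `kubectl ... -o yaml`), not a full validation.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file given as $1 is non-empty and
# contains a top-level "kind:" field, as kubectl's YAML output does.
vaconfig_looks_valid() {
    file="$1"
    [ -s "$file" ] || return 1
    grep -q '^kind:' "$file"
}

# Example, run after extracting the VA config and before the second
# "vracli cluster leave" (left commented out so sourcing has no side effects):
# vaconfig_looks_valid /root/vaconfig.yaml || echo "vaconfig.yaml looks empty or invalid; do not reset yet"
```

If the check fails, the extraction step can be repeated while the restored etcd data is still available.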

Additional Information

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/
https://github.com/kubernetes/kubeadm/issues/1753
