Aria Automation service fails to load with error "The connection to the server vra-k8s.local:6443 was refused - did you specify the right host or port?"

Article ID: 374846

Products

VMware Aria Suite

Issue/Introduction

The Aria Automation UI fails to load, and checking the Kubernetes node or service status returns the error "The connection to the server vra-k8s.local:6443 was refused - did you specify the right host or port?"

The kubelet service fails to come online (Active) after a node reboot or Kubernetes re-initialization, with status "exit status is 255".

Running journalctl -xeu kubelet shows entries similar to:

Aug 14 01:25:40 applianceFQDN.vmware.com kubelet[5669]: F0126 01:25:40.942105 5669 server.go:266] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
Aug 14 01:25:40 applianceFQDN.vmware.com kubelet[5669]: E0126 01:25:40.941998 5669 bootstrap.go:264] Part of the existing bootstrap client certificate is expired

Environment

VMware Aria Automation 8.x
VMware Aria Automation Orchestrator 8.x

Cause

This issue is caused by the kubelet service certificate expiring after one year.
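You can confirm the expiry with openssl before starting the recovery. The sketch below is a self-contained demonstration: it generates a throwaway certificate and runs the same checks against it; on the appliance, point `-in` at the actual kubelet client certificate instead (commonly under /var/lib/kubelet/pki/, though the exact path is an assumption and may differ by version).

```shell
# Self-contained demo: create a throwaway certificate valid for 365 days,
# then run the same expiry checks you would run against the real one.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout /tmp/demo-key.pem -out /tmp/demo-cert.pem \
  -subj "/CN=kubelet-demo" 2>/dev/null

# Print the expiry date (on the appliance, use the real certificate path).
openssl x509 -in /tmp/demo-cert.pem -noout -enddate

# -checkend N exits non-zero if the certificate expires within N seconds.
openssl x509 -in /tmp/demo-cert.pem -noout -checkend 86400 \
  && echo "certificate still valid" \
  || echo "certificate expired or expiring soon"
```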

Resolution

Follow the steps below for your deployment type to resolve this issue.

Single VA Deployment

  1. Take a snapshot of the vRA VM.
  2. Locate an etcd backup at /data/etcd-backup/ and copy the selected backup to /root
  3. Reset Kubernetes by running vracli cluster leave
  4. Restore the etcd backup from /root using the command appropriate for your Aria Automation version
    Examples:
    8.0 - 8.8:  # /opt/scripts/recover_etcd.sh --confirm /root/backup-12345
    8.12 and newer:  # vracli etcd restore --local --confirm /root/backup-123456789.db; systemctl start etcd

  5. Extract VA config from etcd with
    kubectl get vaconfig -o yaml | tee > /root/vaconfig.yaml
  6. Reset Kubernetes once again using
    vracli cluster leave
  7. Install the VA config by running
    kubectl apply -f /root/vaconfig.yaml --force
  8. Run vracli license to confirm that VA config is installed properly.
  9. Run
    /opt/scripts/deploy.sh
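The steps above can be sketched as one end-to-end sequence for an 8.12-or-newer appliance (a non-authoritative sketch: the backup filename is the example placeholder from step 4, and each command should be run and verified interactively rather than as an unattended script):

```shell
# Sketch only -- verify the outcome of each step before running the next.
cp /data/etcd-backup/backup-123456789.db /root/                   # step 2: preserve a backup
vracli cluster leave                                              # step 3: reset Kubernetes
vracli etcd restore --local --confirm /root/backup-123456789.db   # step 4: restore etcd (8.12+)
systemctl start etcd
kubectl get vaconfig -o yaml | tee > /root/vaconfig.yaml          # step 5: extract VA config
vracli cluster leave                                              # step 6: reset again
kubectl apply -f /root/vaconfig.yaml --force                      # step 7: reinstall VA config
vracli license                                                    # step 8: verify config
/opt/scripts/deploy.sh                                            # step 9: redeploy services
```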

Clustered VA Deployment with 3 Nodes

  1. Take snapshots of all 3 nodes.
  2. Designate one of the nodes as the primary node. On the primary node, locate an etcd backup at /data/etcd-backup/ and copy it to /root.
  3. Reset each node with
    vracli cluster leave
  4. On the primary node, restore the etcd backup preserved in /root using the command appropriate for your Aria Automation version
    Examples:
    8.0 - 8.8:  # /opt/scripts/recover_etcd.sh --confirm /root/backup-12345
    8.12 and newer:  # vracli etcd restore --local --confirm /root/backup-123456789.db; systemctl start etcd
  5. Extract VA config from etcd with
    kubectl get vaconfig -o yaml | tee > /root/vaconfig.yaml
  6. Reset the node once again with
    vracli cluster leave
  7. Install VA config with
    kubectl apply -f /root/vaconfig.yaml --force
  8. Run vracli license to confirm that VA config is installed properly.
    Note: vracli license is not applicable for vRO and CExP installations.
  9. Join the other 2 nodes in the cluster by running the following command on each
     vracli cluster join [primary-node] --preservedata
  10. Run /opt/scripts/deploy.sh from the primary node

Additional Information

https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/
https://github.com/kubernetes/kubeadm/issues/1753