Aria Automation fails to start with blank /etc/hosts file contents

Article ID: 314799

Products

VMware Aria Suite

Issue/Introduction

Symptoms:
  • Following a reboot, Aria Automation fails to start.
  • Running /opt/scripts/deploy.sh fails the health check at the nodes-ready test:

Running check nodes-ready
make: *** [/opt/health/Makefile:56: nodes-ready] Error 1

  • The /var/log/deploy.log contains errors similar to:

<HOSTNAME> python3[99629]: [vracli] [DEBUG] executing bash on command-executor-9x4gf failed. Error: [: Command '['/usr/local/bin/kubectl', 'exec', '--namespace', 'kube-system', 'command-executor-9x4gf', '--', 'run-on-execd', '--', 'bash', '-c', '/opt/scripts/mon-fips.sh']' returned non-zero exit status 1.].Failed command: [['/usr/local/bin/kubectl', 'exec', '--namespace', 'kube-system', 'command-executor-9x4gf', '--', 'run-on-execd', '--', 'bash', '-c', '/opt/scripts/mon-fips.sh']].Exit code: [1]. Stderr: [error: unable to upgrade connection: Authorization error (user=kube-apiserver-kubelet-client, verb=create, resource=nodes, subresource=proxy)].

  • The systemd.journal log located under /var/services-logs/journal/ contains errors similar to:

Error: "MountVolume.SetUp failed for volume \"default-token-fcd6p\" (UniqueName: \"kubernetes.io/secret/<UUID>-default-token-fcd6p\") pod \"pipeline-ui-app-7c99f45659-772r4\" (UID: \"<UUID>\") : Get \"https://vra-k8s.local:6443/api/v1/namespaces/prelude/secrets/default-token-fcd6p\": dial tcp: lookup vra-k8s.local: Temporary failure in name resolution"

  • The /etc/hosts file is blank or missing content on one or more of the Aria Automation appliances
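The last symptom can be checked on each appliance with a few lines of Python. This is a minimal sketch, not a VMware-provided tool: the function names are hypothetical, and it assumes that a file containing only whitespace or comment lines should count as "blank or missing content".

```python
from pathlib import Path
from typing import List


def meaningful_host_entries(text: str) -> List[str]:
    """Return the non-empty, non-comment lines of hosts-file text."""
    entries = []
    for line in text.splitlines():
        # Drop anything after '#', then surrounding whitespace.
        stripped = line.split("#", 1)[0].strip()
        if stripped:
            entries.append(stripped)
    return entries


def hosts_file_is_blank(path: str = "/etc/hosts") -> bool:
    """True if the file is missing or holds no host entries (assumed definition of 'blank')."""
    hosts = Path(path)
    if not hosts.exists():
        return True
    return not meaningful_host_entries(hosts.read_text(errors="replace"))


if __name__ == "__main__":
    print("blank" if hosts_file_is_blank() else "has entries")
```

Run it on every node in the cluster; any node that reports "blank" is a candidate for the workaround in this article.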


Environment

VMware vRealize Automation 8.x

Cause

The issue is caused by a race condition between the VAMI network settings boot-up scripts and the custom logic that configures Kubernetes to work with the CoreDNS service. In rare cases, the two services attempt to update the /etc/hosts file at the same time, which can leave the file blank.
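The article does not describe how either script writes the file, so the following is only a general illustration of this class of problem, not the VMware logic or its planned fix. A writer that truncates a file in place leaves a window in which the file is empty. Writing to a temporary file and renaming it over the target closes that window. The helper name atomic_write is hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, content: str) -> None:
    """Replace `path` with `content` without ever exposing an empty or partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same file system as the target
    # so that the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Note: mkstemp creates the file with mode 0600; a real tool would
        # also restore the expected ownership and permissions here.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

An atomic rename only guarantees that the file is never seen empty. Two writers that each read, modify, and rewrite the file can still overwrite each other's changes unless they also share a lock.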

Resolution

A resolution is planned for the upcoming Aria Automation 8.13.2 release.

Workaround:


1. Copy the /etc/hosts file entries from a functioning node in the cluster and write them to the /etc/hosts file on each affected node.

 

See the example /etc/hosts file contents from a VMware lab environment (image: EtcHostsKB.jpg).
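The copy step can be sketched as follows. This is a hedged example under stated assumptions, not an official procedure: restore_hosts is a hypothetical helper, the known-good text is assumed to have been copied by hand from a functioning node, and the timestamped backup is an added precaution that the article does not mention. Review the entries before applying them, because the article does not state whether any line differs between nodes.

```python
import shutil
import time
from pathlib import Path


def restore_hosts(known_good_text: str, target: str = "/etc/hosts") -> str:
    """Back up `target`, then overwrite it with `known_good_text`.

    Returns the backup path, or '' if the target did not exist.
    """
    # Refuse to write content that is itself blank or comment-only.
    has_entries = any(
        line.split("#", 1)[0].strip() for line in known_good_text.splitlines()
    )
    if not has_entries:
        raise ValueError("known-good content has no host entries; refusing to write")

    target_path = Path(target)
    backup = ""
    if target_path.exists():
        # Keep the (possibly blank) original for later inspection.
        backup = f"{target}.bak-{time.strftime('%Y%m%d-%H%M%S')}"
        shutil.copy2(target, backup)
    target_path.write_text(known_good_text)
    return backup
```

On an affected node this would be called as root, for example restore_hosts(text_from_good_node), where text_from_good_node holds the content copied from the functioning node.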