VCF Automation interface returns "no healthy upstream" error after virtual machine suspension


Article ID: 426412


Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • Users may encounter a "no healthy upstream" error when attempting to access the VCF Automation interface.
  • This typically occurs after the VCF Automation appliance virtual machines have been suspended or improperly shut down.
  • Fleet Management power-off/power-on operations may fail with error LCMVCFA20006 when attempting to communicate with the environment.

Environment

  • VCF Automation 9.x

Cause

  • This issue is caused by the suspension of cluster nodes at the hypervisor level.
  • Suspension "stuns" the cluster: it breaks database heartbeats and synchronization between Kubernetes pods, leaving services unresponsive or unbalanced after the nodes resume.

Resolution

  • To resolve this issue, you must perform a graceful restart of the cluster nodes:
    1. Shut down all VCF Automation appliance nodes gracefully.

    2. Power on the nodes from vCenter.

    3. Wait for approximately 15–20 minutes for all services to initialize.

    4. Verify service health by running the following commands on any node:

      kubectl get pods --all-namespaces
      kubectl get pods -n prelude

    5. Ensure all pods show a Running or Completed status.
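
The check in step 5 can be narrowed to only the pods that need attention. The following is a minimal sketch, not part of the official procedure: the helper name unhealthy_pods is hypothetical, and it assumes the default column layout of kubectl get pods --all-namespaces --no-headers, where STATUS is the fourth column.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the product): reads pod listings on
# stdin and prints every pod whose STATUS is neither Running nor Completed.
# Assumes the column order NAMESPACE NAME READY STATUS RESTARTS AGE, as
# produced by `kubectl get pods --all-namespaces --no-headers`.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Example usage on a node where kubectl is already configured:
#   kubectl get pods --all-namespaces --no-headers | unhealthy_pods
```

Empty output means every pod reports Running or Completed. A pod can report Running while its containers are not yet ready, so the READY column is still worth reviewing in the full listing.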