Failed to remediate or configure HA on a single-image managed cluster
search cancel

Failed to remediate or configure HA on a single-image managed cluster

book

Article ID: 323306

calendar_today

Updated On:

Products

VMware vSphere ESXi VMware vCenter Server

Issue/Introduction

 

  • The issue happens on a vCenter Server which has a cluster managed by a Single Image.

  • When configuring vSphere HA , following error message appears - "The vSphere HA agent is not reachable from vCenter Server. Cannot find vSphere HA master agent."

  • When remediating cluster, following error message might appear - "Unknown error occurred when invoking host API"

  • On the impacted ESX (either HA is in unreachable status, or is reporting issue with remediation). Under vmkernel.log following log entries are reported :

    /var/run/log/vmkernel.log

    YYYY-MM-DDTHH:MM:SS cpu13:3088341)Admission failure in path: host/vim/vmvisor/settingsd:settingsd.3088338:uw.3088338
    YYYY-MM-DDTHH:MM:SS cpu13:3088341)UserWorld 'settingsd' with cmdline '/usr/lib/vmware/lifecycle/bin/settingsd'
    YYYY-MM-DDTHH:MM:SS cpu13:3088341)User: 3212: settingsd: wantCoreDump:settingsd signal:11 exitCode:0


    Note: Check the size of "/var/vmware/lifecycle/task-status.json" file is over 5MB at the time

Cause

The internal Lifecycle task status file on ESX can get large in size over time and in 7.0 U1 there was a design issue with cleaning the file up.

Resolution

  • Put the impacted ESX host in Maintenance Mode. 

  • Open an SSH session to the ESXi host and run the following command : 

    echo "{}" > /var/vmware/lifecycle/task-status.json
    rm -f /var/vmware/lifecycle/lifecycle.task


  • Reboot the ESXi host and retry the vSphere HA task.