vSphere High Availability (vSphere HA) Configuration Fails After vCenter Server 8.0 Update 3 Upgrade with "HA Agent Unreachable"

Article ID: 392316

Products

VMware vSphere ESXi

VMware vCenter Server 8.0

Issue/Introduction

  • After upgrading vCenter Server Appliance (VCSA) to version 8.0 Update 3, the configuration of vSphere High Availability (vSphere HA) fails, and the status of the ESXi hosts is shown as "HA Agent Unreachable." 

  • Adding a new ESXi host fails with the error "HA Agent Unreachable."

    The following entries are observed in /var/log/fdm.log on the ESXi host:


YYYY-MM-DDTHH:MM:SS In(166) Fdm[2104596]: [Originator@6876 sub=Message opID=WorkQueue-5acc7ca9] The SNI assigned to client is: [example.host.com]
YYYY-MM-DDTHH:MM:SS In(166) Fdm[2104382]: [Originator@6876 sub=Message opID=WorkQueue-270c69d1] The SNI assigned to client is: [example.host.com]
YYYY-MM-DDTHH:MM:SS In(166) Fdm[2104377]: [Originator@6876 sub=Message opID=WorkQueue-1ff72c5e] The SNI assigned to client is: [example.host.com]
YYYY-MM-DDTHH:MM:SS In(166) Fdm[2104382]: [Originator@6876 sub=Message opID=WorkQueue-270c69d1] Initiating verification using CA store; peerName: [example.host.com]
YYYY-MM-DDTHH:MM:SS In(166) Fdm[2104596]: [Originator@6876 sub=Message opID=WorkQueue-5acc7ca9] Initiating verification using CA store; peerName: [example.host.com]
YYYY-MM-DDTHH:MM:SS In(166) Fdm[2104377]: [Originator@6876 sub=Message opID=WorkQueue-1ff72c5e] Initiating verification using CA store; peerName: [example.host.com]
YYYY-MM-DDTHH:MM:SS Db(167) Fdm[2106424]: [Originator@6876 sub=Cluster opID=WorkQueue-270c69d1] IP X.X.X.X marked bad for reason Unreachable IP
YYYY-MM-DDTHH:MM:SS In(166) Fdm[2106424]: [Originator@6876 sub=Message opID=WorkQueue-270c69d1] Destroying connection
YYYY-MM-DDTHH:MM:SS Db(167) Fdm[2104385]: [Originator@6876 sub=Cluster opID=WorkQueue-1ff72c5e] IP X.X.X.X marked bad for reason Unreachable IP
YYYY-MM-DDTHH:MM:SS In(166) Fdm[2104385]: [Originator@6876 sub=Message opID=WorkQueue-1ff72c5e] Destroying connection
YYYY-MM-DDTHH:MM:SS Db(167) Fdm[2106355]: [Originator@6876 sub=Cluster opID=WorkQueue-5acc7ca9] IP X.X.X.X marked bad for reason Unreachable IP
YYYY-MM-DDTHH:MM:SS In(166) Fdm[2106355]: [Originator@6876 sub=Message opID=WorkQueue-5acc7ca9] Destroying connection
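
When many hosts are affected, it can help to pull the relevant entries out of fdm.log programmatically. The following is a minimal Python sketch, not an official tool. It assumes the log lines follow the format shown in the excerpt above. The function name summarize_fdm_log and the regular expressions are illustrative and would need adjusting if the log format differs.

```python
import re

# Matches lines such as:
#   "... opID=WorkQueue-270c69d1] Initiating verification using CA store; peerName: [example.host.com]"
PEER_RE = re.compile(
    r"opID=(?P<opid>[^\]\s]+)\]\s+Initiating verification using CA store; "
    r"peerName: \[(?P<peer>[^\]]+)\]"
)

# Matches lines such as:
#   "... opID=WorkQueue-270c69d1] IP X.X.X.X marked bad for reason Unreachable IP"
BAD_IP_RE = re.compile(
    r"opID=(?P<opid>[^\]\s]+)\]\s+IP (?P<ip>\S+) marked bad for reason (?P<reason>.+)$"
)


def summarize_fdm_log(lines):
    """Return one record per "marked bad" entry, paired with the peer name
    that was being verified under the same opID (None if no peer was seen)."""
    peer_by_opid = {}
    bad_entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        peer_match = PEER_RE.search(line)
        if peer_match:
            # Remember which peer this operation was verifying.
            peer_by_opid[peer_match.group("opid")] = peer_match.group("peer")
            continue
        bad_match = BAD_IP_RE.search(line)
        if bad_match:
            opid = bad_match.group("opid")
            bad_entries.append(
                {
                    "opid": opid,
                    "ip": bad_match.group("ip"),
                    "reason": bad_match.group("reason").strip(),
                    "peer": peer_by_opid.get(opid),
                }
            )
    return bad_entries
```

To use it, copy fdm.log from the host and pass the open file object to summarize_fdm_log. Each returned record ties a "marked bad" IP back to the peer name whose certificate verification was attempted under the same opID.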

Environment

ESXi 8.0.3

vCenter 8.0.3

Cause

vCenter Server 8.0 Update 3 introduces a new validation mechanism that checks the certificates of all hosts in the inventory during vSphere HA configuration. If the host certificates are not correctly configured, or the certificate mode is not set to VMCA (VMware Certificate Authority), the "HA Agent Unreachable" error may appear.

Resolution

To resolve this issue:

  1. Ensure that SHA-1 is not used in the host certificate. If it is, the remaining steps will not resolve the issue.
  2. Set the certificate mode in vCenter Server to VMCA (VMware Certificate Authority). Follow the steps in the tech doc "Change the ESXi Certificate Mode".
  3. Disconnect and reconnect all hosts in the cluster to ensure that the certificates are pushed to the hosts.
  4. Check that the certificates are present on each host: browse to the host in the vSphere Client inventory --> click Configure --> under System, click Certificate, as described in the tech doc "Renew or Refresh ESXi Certificates".
  5. Re-enable vSphere HA (see also Error: "Cannot find vSphere HA master agent" within vCenter UI):
    1. Browse to the cluster in the vSphere Client.
    2. Click the Manage tab and click Settings.
    3. Under Services, click Edit.
    4. Uncheck the Turn ON vSphere HA option.
    5. Click OK.
    6. Click Settings and select Turn ON vSphere HA.
    7. Click OK.
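
The SHA-1 check in step 1 can be scripted. The following Python sketch is one possible approach and makes several assumptions: that "SHA-1 in the host certificate" refers to the certificate's signature algorithm, that the host presents its certificate on TCP port 443, and that the openssl command-line tool is installed on the machine running the script. The function names are illustrative. Only the certificate presented by the host is inspected, not the issuing chain.

```python
import re
import ssl
import subprocess

# "openssl x509 -text" prints lines such as:
#   "Signature Algorithm: sha256WithRSAEncryption"
SIG_ALG_RE = re.compile(r"Signature Algorithm:\s*(\S+)")


def signature_algorithms(openssl_text):
    """Return the distinct signature algorithm names found in the decoded certificate text."""
    return sorted(set(SIG_ALG_RE.findall(openssl_text)))


def uses_sha1(openssl_text):
    """True if any signature algorithm name in the text mentions SHA-1."""
    return any("sha1" in alg.lower() for alg in signature_algorithms(openssl_text))


def fetch_cert_text(host, port=443):
    """Fetch the certificate presented by host:port (without validating it)
    and decode it to text with the openssl CLI."""
    pem = ssl.get_server_certificate((host, port))
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-text"],
        input=pem,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, uses_sha1(fetch_cert_text("example.host.com")) returning True would suggest that the host certificate needs to be replaced before continuing with the remaining steps.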