Configuring vSphere HA on an image-based cluster fails.

Article ID: 384913

Products

  • VMware vCenter Server
  • VMware vSphere ESXi

Issue/Introduction

  • Enabling vSphere HA on the cluster fails with the following errors:

    "A general system error occurred: Installing HA Components failed on the cluster: domain-cxx".
    Cannot complete the configuration of the vSphere agent on the host. "Applying HA VIBs on the cluster encountered a failure". Failed installing HA component on the host: host-xxxx.

  • Error in /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server.log:

    YYYY-MM-DDTHH:MM:SS error vmware-vum-server[#######] [Originator@#### sub=VumVapi::Utils opID=01b95dec-5b0c-####-####-0b0e6d9bae95] [DepotContentManager 696] Failed to get cached component. No record - VCIDB ERROR: Row with primary key (vsphere-fdm, #.#.#-########) not found in table PM_DEPOT_COMPONENTS

Note: #.#.#-######## stands for the vsphere-fdm version, which differs for each vCenter build number.

  • Error in /var/log/vmware/vmware-updatemgr/vum-server/imageservice.log:

    INFO imageService[1401######22048] [SoftwareSpecMgr ####] Image validation result: {'info': [], 'warnings': [], 'errors': [{'id': 'com.vmware.vcIntegrity.lifecycle.EsxImage.ComponentNotFoundError', 'message': {'id': 'com.vmware.vcIntegrity.lifecycle.EsxImage.ComponentNotFoundError', 'default_message': 'Component vsphere-fdm cannot be found in depot.', 'args': ['vsphere-fdm']}, 'resolution': None, 'time': 'YYYY-MM-DD'}]}
    vmware.esximage.Errors.ComponentNotFoundError: ('vsphere-fdm', '#.#.#-########', 'Could not find the component with name = vsphere-fdm, version = #.#.#-######## in the depot.')
  • The ESXi host on which the vSphere HA configuration fails has an older version of the vsphere-fdm agent installed. To confirm, run the following command on the ESXi host (the build number of the vCenter and of the vsphere-fdm VIB should match):

    esxcli software vib list | grep -i fdm

  • Validating the image at the cluster level in vCenter results in "Image Validation Failed".

  • The /var/core directory on the vCenter contains core.updatemgr-worker.##### files.
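The version comparison described above can be scripted. The following is a minimal sketch, not an official tool: the function names `fdm_vib_version` and `check_fdm_version` are hypothetical, and it assumes that `esxcli software vib list` prints the VIB name in the first column and the version in the second. Like the `grep -i fdm` command above, it matches any VIB whose name contains "fdm".

```shell
#!/bin/sh
# Hypothetical helpers; assumes "Name Version ..." column order in the
# output of `esxcli software vib list`.

# Read the VIB list on stdin and print the version of the first VIB whose
# name contains "fdm" (case-insensitive, like the grep in the article).
fdm_vib_version() {
    awk 'tolower($1) ~ /fdm/ { print $2; exit }'
}

# check_fdm_version EXPECTED
# Read the VIB list on stdin and compare the installed version to EXPECTED.
# Returns 0 on match, 1 on mismatch, 2 if no fdm VIB is listed.
check_fdm_version() {
    expected=$1
    installed=$(fdm_vib_version)
    if [ -z "$installed" ]; then
        echo "no fdm VIB found"
        return 2
    fi
    if [ "$installed" = "$expected" ]; then
        echo "match: $installed"
        return 0
    fi
    echo "mismatch: installed=$installed expected=$expected"
    return 1
}

# Example (on the ESXi host; EXPECTED is the version named in the
# PM_DEPOT_COMPONENTS error on the vCenter):
#   esxcli software vib list | check_fdm_version '#.#.#-########'
```

If the VIB version string and the component version in the vCenter log use different formats in your build, compare only the build-number portion by hand.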

Environment

  • vCenter 9.x
  • vCenter 8.x
  • vCenter 7.x
  • ESX 9.x
  • ESXi 8.x
  • ESXi 7.x

Cause

After a vCenter update or upgrade, the vCenter cannot retrieve the cached component (the vsphere-fdm VIB) from the PM_DEPOT_COMPONENTS table that the updatemgr service keeps in the VCDB PostgreSQL database.
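To check for this cause, the vum-server log can be scanned for the PM_DEPOT_COMPONENTS error. The sketch below is illustrative only: `missing_fdm_versions` is a hypothetical helper name, and it assumes the log line has the same wording as the sample in the Issue/Introduction section.

```shell
#!/bin/sh
# Hypothetical helper: read vmware-vum-server.log text on stdin and print
# each distinct vsphere-fdm version reported as missing from
# PM_DEPOT_COMPONENTS. Lines that do not match are ignored.
missing_fdm_versions() {
    sed -n 's/.*Row with primary key (vsphere-fdm, \([^)]*\)) not found in table PM_DEPOT_COMPONENTS.*/\1/p' |
        sort -u
}

# Example (on the vCenter):
#   missing_fdm_versions < /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server.log
```

Empty output means the log holds no matching error, in which case the failure may have a different cause.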

Resolution

This is a known issue. Broadcom engineering is working on a fix for a future release.

Workaround

  1. Take a snapshot of the vCenter VM. If the vCenter Servers are in Enhanced Linked Mode (ELM), take powered-off snapshots of all of them.
  2. SSH to the vCenter Server with root credentials.
  3. Enter the below command to enable the shell:

    shell

  4. Stop the update manager service using the below command:

    service-control --stop vmware-updatemgr

  5. Access the PostgreSQL database using the below commands:

    su updatemgr -s /bin/bash

    psql -U vumuser -d VCDB

  6. View the below two tables before deleting the required entries:

    table pm_software_desired_states;

    table pm_software_compliances;

  7. Delete the entries from the same two tables if all clusters experience the same issue or if there is only one cluster in the environment:

    DELETE FROM pm_software_compliances;

    DELETE FROM pm_software_desired_states;

    Note:
    • If there is more than one cluster in your environment and only one cluster is impacted, use the commands below to delete only that cluster's entries:

      DELETE FROM pm_software_compliances where desired_state_id in (select desired_state_id from pm_software_desired_states where entity_id='domain-cXXX');

      DELETE FROM pm_software_desired_states where entity_id='domain-cXXX';

    • To get the cluster domain ID, select the cluster in the vCenter UI inventory and note the domain ID from the browser URL. It should look like domain-cXXXX; exclude the colon (:) and everything after it.
  8. Exit psql by entering the command below and pressing Enter:

    \q

  9. Start a shell session as the root user with the command below:

    su root -s /bin/bash

  10. Start the update manager service, using the below command:

    service-control --start vmware-updatemgr

  11. Recreate the cluster image from the vCenter UI. For more information, refer to Creating and Managing vSphere Lifecycle Manager Clusters.
  12. Re-enable vSphere HA. For more information, refer to Disabling and enabling VMware vSphere High Availability (vSphere HA).
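
For the single-cluster case in the Note under step 7, the domain ID extraction and the two scoped DELETE statements can be sketched as shell helpers. This is an illustrative sketch under stated assumptions, not part of the documented procedure: `cluster_id_from_url` and `scoped_cleanup_sql` are hypothetical names, and the ID is assumed to have the form domain-c followed by digits. The helpers only print text; nothing is sent to the database.

```shell
#!/bin/sh
# Hypothetical helpers; they print text only and never touch the database.

# cluster_id_from_url URL
# Print the domain-c<digits> token found in URL (the last one if several
# appear), without the colon and anything after it.
cluster_id_from_url() {
    printf '%s\n' "$1" | sed -n 's/.*\(domain-c[0-9][0-9]*\).*/\1/p'
}

# scoped_cleanup_sql CLUSTER_ID
# Print the two scoped DELETE statements from step 7 for one cluster.
# Refuses any ID that is not exactly domain-c followed by digits, so that
# stray characters never reach the SQL text.
scoped_cleanup_sql() {
    case $1 in
        domain-c*) ;;
        *) echo "invalid cluster ID: $1" >&2; return 1 ;;
    esac
    digits=${1#domain-c}
    case $digits in
        '' | *[!0-9]*) echo "invalid cluster ID: $1" >&2; return 1 ;;
    esac
    printf "DELETE FROM pm_software_compliances WHERE desired_state_id IN (SELECT desired_state_id FROM pm_software_desired_states WHERE entity_id='%s');\n" "$1"
    printf "DELETE FROM pm_software_desired_states WHERE entity_id='%s';\n" "$1"
}

# Example: URL holds the address copied from the browser.
#   scoped_cleanup_sql "$(cluster_id_from_url "$URL")"
```

Review the printed statements before pasting them into the psql session from step 5. The compliance rows are deleted first because that statement looks up its rows through pm_software_desired_states.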