SDDC Manager Pre-check failing due to the vCenter in an ERROR status.


Article ID: 375145



Products

VMware SDDC Manager
VMware Cloud Foundation

Issue/Introduction

  • The SDDC Manager pre-check fails for the management (MGMT) vCenter with the following error:

    (1) One of the workflows ("Add Domain", "Password Manager") has put the VC in Error state. Check for current/previous workflow in the task panel in UI. Remediation messages for corresponding task can be found on Task Aggregator on UI. Follow the remediation steps for resolving the issue. (2) Check for failed VC/PSC upgrade, wait for a few minutes and retry.

  • In some cases, the following error is displayed in SDDC Manager:

    Update not possible while vcsa is in failed state.
    Unable to update the VC, Error: Update not possible while [vc] and undefined is in failed state.

  • During "Plan Patching", the following error appears in the SDDC Manager UI:

    Domain <Workload Domain Name> inventory state is not active.

  • The SDDC Manager log at /var/log/vmware/vcf/lcm/lcm.log shows entries similar to:

    ERROR [vcf_lcm,627######3d60,1528] [c.v.e.s.l.s.impl.UpgradeServiceImpl,http-nio-127.0.0.1-7400-exec-10] Failed to get resource name for ESX_HOST with id 535b####-####-####-########5822
    ERROR [vcf_lcm,627######3d60,1528] [c.v.e.s.l.a.i.i.LogicalInventoryClient,http-nio-127.0.0.1-7400-exec-10] logical inventory - get ESXi host failed for ESXi host ID b4c4####-####-####-########2d59 org.springframework.web.client.HttpClientErrorException$NotFound: 404 : "{"errorCode":"RESOURCE_NOT_FOUND_WITH_ID","arguments":["ResourceInventoryController","b4c4####-####-####-########2d59"],"message":"Resource ID: b4c4####-####-####-########2d59 not found in ResourceInventoryController","causes":[{"type":"com.vmware.evo.sddc.inventory.model.InventoryNotFoundException","message":"Resource ID: b4c4####-####-####-########2d59 not found in Esxi"}],"referenceToken":"R309LB"}"
    k.web.client.HttpClientErrorException.create(HttpClientErrorException.java:113)

  • The SDDC Manager log at /var/log/vmware/vcf/commonsvcs/vcf-commonsvcs.log shows entries similar to:

    INFO  [common,529####a50f,154d] [c.v.e.s.i.r.a.c.ResourceInventoryController,http-nio-127.0.0.1-7100-exec-2938] getHosts(): id = b4c4####-####-####-########2d59
    INFO  [common,529####a50f,154d] [c.v.e.s.i.s.EsxiInventoryServiceImpl,http-nio-127.0.0.1-7100-exec-2938] Get Esxi - b4c4####-####-####-########2d59
    ERROR [common,529####a50f,154d] [c.v.e.s.i.d.s.client.TypedClientImpl,http-nio-127.0.0.1-7100-exec-2938] Inventory Error, Resource ID: b4c4####-####-####-########2d59 not found in Esxi
    ERROR [common,529####a50f,154d] [c.v.e.s.i.s.EsxiInventoryServiceImpl,http-nio-127.0.0.1-7100-exec-2938] Inventory Error, Resource ID: b4c4####-####-####-########2d59 not found in Esxi
    ERROR [common,529####a50f,154d] [c.v.e.s.e.h.LocalizableRuntimeExceptionHandler,http-nio-127.0.0.1-7100-exec-2938] [QLVA6P] RESOURCE_NOT_FOUND_WITH_ID Resource ID: b4c4####-####-####-########2d59 not found in ResourceInventoryController com.vmware.evo.sddc.inventory.model.InventoryNotFoundException: Resource ID: b4c4####-####-####-########2d59 not found in ResourceInventoryController
            at com.vmware.evo.sddc.inventory.rest.api.controller.ResourceInventoryController.getHosts(ResourceInventoryController.java:521)
            at jdk.internal.reflect.GeneratedMethodAccessor261.invoke(Unknown Source)
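
To confirm which resource IDs the logs above complain about, the matching entries can be filtered and de-duplicated. The following is a minimal sketch, not a VMware-provided tool: the function name `list_missing_resource_ids` is hypothetical, and it assumes that unmasked IDs in a real log are standard 36-character UUIDs.

```shell
# Hypothetical helper, not part of SDDC Manager.
# Prints each distinct resource ID that appears in a
# RESOURCE_NOT_FOUND_WITH_ID log entry of the given file.
# Assumption: real (unmasked) IDs are standard 36-character UUIDs.
list_missing_resource_ids() {
    grep 'RESOURCE_NOT_FOUND_WITH_ID' "$1" \
        | grep -oE 'Resource ID: [0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}' \
        | awk '{print $3}' \
        | sort -u
}
```

It could be run against either log file named above, for example `list_missing_resource_ids /var/log/vmware/vcf/lcm/lcm.log`.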

Environment

VMware Cloud Foundation 4.x
VMware Cloud Foundation 5.x

Cause

The vCenter (VC) status has been marked as ERROR because a previous workflow or task failed.
Usually, this is a failed password rotation task.

When the LCM pre-check runs, it fails at 'VC Inventory Status Check' because VCENTER <id> is in ERROR status. This is expected behavior.

Example of the proposed remediation displayed in the failed pre-check:

Resolution

Use either of the following options to reset the VC state to "ACTIVE":

# Option 1:

  1. SSH to SDDC Manager as the vcf user and switch to root with su
  2. Find the vCenter in ERROR state in the SDDC Manager database

    psql -h localhost -U postgres -d platform -c "select id,vm_hostname,status from vcenter where status!='ACTIVE'"

    Sample output:

                      id                  |     vm_hostname     | status
    --------------------------------------+---------------------+--------
     2730####-####-####-########d780 | vcsa02.example.com | ERROR
    (1 row)

  3. Update the vCenter status to ACTIVE in the SDDC Manager database

    psql -h localhost -U postgres -d platform -c "update vcenter set status='ACTIVE' where id='<id from Step 2>'"

  4. Run the pre-check from SDDC Manager
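
Because the update in step 3 writes directly to the database, it may be worth checking the ID before running it. The sketch below is an illustration only: `build_vcenter_activate_sql` is a hypothetical helper name, and the assumption that vCenter IDs are standard 36-character UUIDs is inferred from the sample output. It prints the statement from step 3 and refuses anything that is not a full UUID, such as a masked or truncated value.

```shell
# Hypothetical helper, not part of SDDC Manager.
# Prints the UPDATE statement from step 3 only when the argument is a full
# 36-character UUID; anything else is rejected with exit status 1.
build_vcenter_activate_sql() {
    uuid_re='^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
    if [ "${#1}" -ne 36 ] || ! printf '%s\n' "$1" | grep -Eq "$uuid_re"; then
        echo "refusing: '$1' is not a full UUID" >&2
        return 1
    fi
    printf "update vcenter set status='ACTIVE' where id='%s'\n" "$1"
}
```

One possible way to use it is `sql=$(build_vcenter_activate_sql "<id from Step 2>") && psql -h localhost -U postgres -d platform -c "$sql"`, so that psql runs only when the ID passes the check.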

# Option 2: 

  1. SSH to the SDDC Manager VM as the vcf user and elevate to root with su

  2. List the upgrades from LCM to determine the ID of the failedDomains:

    • curl localhost/lcm/inventory/upgrades | json_pp


  3. Check the state of the VC in the SDDC Manager database:

    • Connect to the Platform DB
      • psql -h localhost -U postgres -d platform
    • Check the current state of the Inventory using the id from output above:
      • select * from vcenter where id='57452ad9-####-####-####-########';

    • Exit the postgres shell
      • \q
    • Use the following API command to set the VC state to ACTIVE:
      • curl localhost/inventory/entities/57452ad9-####-####-####-######## -X PATCH -d '{"type" : "VCENTER","status":"ACTIVE"}' -H 'Content-Type:application/json' 

  4. (Optional) Connect to the database again and confirm that the vCenter status shows ACTIVE
  5. Re-run the pre-upgrade check. The VC Inventory state error should no longer appear, and you should be able to proceed with the VCF upgrade.
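
The optional recheck can also be scripted. This is a rough sketch under stated assumptions: `report_non_active_vcenters` is a hypothetical helper name, and its input is expected to be the rows of the "status!='ACTIVE'" query from Option 1, printed by psql in unaligned, tuples-only form (the `-A` and `-t` options), which gives one `id|vm_hostname|status` line per vCenter.

```shell
# Hypothetical helper, not part of SDDC Manager.
# Reads "id|vm_hostname|status" rows on stdin, reports each one, and exits
# with status 1 if any row was present (0 when the input is empty).
report_non_active_vcenters() {
    awk -F'|' '
        NF >= 3 { printf "vCenter %s (%s) is still %s\n", $2, $1, $3; bad = 1 }
        END     { exit bad + 0 }
    '
}
```

For example: `psql -h localhost -U postgres -d platform -At -c "select id,vm_hostname,status from vcenter where status!='ACTIVE'" | report_non_active_vcenters`. An exit status of 0 with no output suggests that no vCenter remains in a non-ACTIVE state.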