New VCF Instance deployment into existing VCF Fleet fails with a VCF Installer UI ERROR: Failed to poll status for VCF Operations fleet management deployment


Article ID: 435837


Updated On:

Products

VCF Operations

Issue/Introduction

Retrying the task from the VCF Installer UI fails with a similar error, and the deployment fails at the stage "Join the existing fleet management appliance."

  • The Fleet Manager appliance (VRLCM) does not respond to API calls made by the newly deployed SDDC Manager in the new instance during the PrepareVcfOperationsManagementImport task. The log entries look similar to the following (SDDC Manager /var/log/vmware/vcf/domainmanager/domainmanager.log):

    YYYY-MM-DDTHH:MM:SS.SSSS+0000 ERROR [vcf_dm,opID-#,###] [c.v.e.s.o.c.c.ContractParamBuilder,dm-exec-19##]  Contract task PrepareVcfOperationsManagementImport input: {"_executionId":"####UUID####","vrslcmSpec":{"vrslcmRootPassword":"*****","vrslcmApiPassword":"*****","virtualAppliance":{"hostname":"<FleetMgr_appliance_FQDN>"}}}
  • Further down in the log, you may see a failure reason similar to the following:

    Caused by: com.vmware.evo.sddc.common.vrealize.vrlcm.VrlcmException: Failed to get VMware Aria Suite Lifecycle system info
     at com.vmware.evo.sddc.common.vrealize.vrslcm.VrslcmService.getSystemInfo(VrslcmService.java:249)
     at com.vmware.evo.sddc.common.vrealize.vrslcm.VrslcmService.getVersion(VrslcmService.java:294)
     at com.vmware.evo.sddc.vrealize.services.PrepareVcfOperationsManagementImport.execute(PrepareVcfOperationsManagementImport.java:57)
     ... 16 common frames omitted
    Caused by: org.springframework.web.client.ResourceAccessException: I/O error on GET request for "https://<FleetMgr_appliance_FQDN>/lcm/lcops/api/v2/settings/system-details": Connect to https://<FleetMgr_appliance_FQDN>:443 [<FleetMgr_appliance_FQDN>/<FleetMgr_appliance_IP>] failed: Connection timed out
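
To check whether a given SDDC Manager log contains the signatures quoted above, a search along the following lines may help. This is a minimal sketch, not a supported tool: the function name find_fleet_join_errors is hypothetical, and the log path is passed in by you.

```shell
# Hypothetical helper: print (with line numbers) the log lines that match
# the failure signatures quoted in this article.
# Usage: find_fleet_join_errors <path-to-domainmanager.log>
find_fleet_join_errors() {
    logfile="$1"
    # -n prints line numbers; -E allows alternation between the patterns.
    grep -nE 'Contract task PrepareVcfOperationsManagementImport input|Failed to get VMware Aria Suite Lifecycle system info|settings/system-details.*Connection timed out' "$logfile"
}
```

If the command prints nothing, the log does not contain these exact strings and the failure may have a different cause.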

Environment

VMware Cloud Foundation (VCF) 9.0.x
VMware SDDC Manager 9.0.x

Cause

A communication issue (temporary or otherwise) prevents the newly deployed SDDC Manager in the new instance from reaching the Fleet Management appliance in the existing VCF Fleet.

This issue is not present in VCF 9.1 releases.

Resolution

1. From an SSH session to the newly deployed SDDC Manager appliance in the new instance, run the following command to generate a Base64-encoded credential string for your admin@local account:

echo -n 'admin@local:<your_password>' | base64
  • Replace <your_password> with the password for your admin@local account.

 

Use the Base64-encoded output of the above command in place of <Base64_Encoded_String> in the following command:

curl -k -X GET "https://<FleetMgr_appliance_FQDN>/lcm/lcops/api/v2/settings/system-details" \
     -H "Authorization: Basic <Base64_Encoded_String>" \
     -H "Content-Type: application/json" \
   | jq "."

 

  • A successful response confirms that the Fleet Management appliance VM is healthy and reachable from the SDDC Manager appliance.
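
The two commands in this step can be combined into a check that gives up after a bounded wait instead of hanging on a connection timeout. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the 10 and 30 second timeouts are arbitrary choices, and the mapping of HTTP status codes to a diagnosis is illustrative rather than taken from the product documentation.

```shell
# Hypothetical helpers; the names are illustrative, not part of any VCF tooling.

# Build the Basic auth token from a user name and password. printf avoids the
# trailing newline that a plain echo would add; tr removes any line wrapping
# that base64 applies to long input.
make_basic_auth() {
    printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Translate the HTTP status code reported by curl into a short diagnosis.
describe_status() {
    case "$1" in
        000) echo "unreachable" ;;   # curl received no HTTP response at all
        200) echo "reachable" ;;
        401|403) echo "reachable, credentials rejected" ;;
        *) echo "reachable, unexpected HTTP $1" ;;
    esac
}

# Query the same endpoint as the curl command above, with assumed timeouts.
check_fleet_mgr() {
    fqdn="$1"; token="$2"
    code=$(curl -k -s -o /dev/null -w '%{http_code}' \
        --connect-timeout 10 --max-time 30 \
        -H "Authorization: Basic $token" \
        "https://$fqdn/lcm/lcops/api/v2/settings/system-details")
    describe_status "$code"
}

# Example (placeholders as in the steps above):
#   check_fleet_mgr '<FleetMgr_appliance_FQDN>' "$(make_basic_auth 'admin@local' '<your_password>')"
```

A result of "unreachable" corresponds to the "Connection timed out" failure in the log and points to a network path, DNS, or firewall problem rather than a credential problem.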

 

2. Find the execution ID of the failing task in the SDDC Manager domainmanager.log. The ID might differ depending on the date of the first execution. It appears as the value of _executionId in a log entry similar to the following:

Contract task PrepareVcfOperationsManagementImport input: {"_executionId":"####UUID####"}
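
Rather than reading the log by eye, the _executionId values can be pulled out with standard text tools. This is a minimal sketch assuming the log entry has the shape shown above; the function name extract_execution_id is hypothetical. It prints every ID it finds, in log order, so that you can pick the one belonging to the failed run.

```shell
# Hypothetical helper: read log text on stdin and print each _executionId
# logged for the PrepareVcfOperationsManagementImport task, in log order.
# Adjacent repeats (for example from retries of the same execution) are collapsed.
extract_execution_id() {
    grep 'Contract task PrepareVcfOperationsManagementImport input' \
        | sed -n 's/.*"_executionId":"\([^"]*\)".*/\1/p' \
        | uniq
}

# Example:
#   extract_execution_id < <path-to-domainmanager.log>
```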

 

From the SDDC Manager appliance, use curl to verify that the execution can be queried, substituting the _executionId value found above for <workflow_ID>:

curl localhost/domainmanager/workflows/<workflow_ID> -X GET

 

In the same session, send a PATCH request with curl to retrigger the execution:

curl localhost/domainmanager/workflows/<workflow_ID> -X PATCH
  • NOTE: To monitor this task, use the above GET command and monitor its output until it reaches a successful state.
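
The monitoring described in the note can be done with a simple polling loop. This is a minimal sketch: the function names, the 30 second interval, and the attempt limit are assumptions, and because the response format is not documented in this article the loop prints the raw GET output for you to read rather than parsing a status field.

```shell
# Hypothetical helpers; the names and defaults are illustrative.

# Build the workflow URL used by the GET and PATCH commands above.
workflow_url() {
    printf 'localhost/domainmanager/workflows/%s' "$1"
}

# Repeat the GET request at a fixed interval and print each raw response.
# Usage: poll_workflow <workflow_ID> [interval_seconds] [max_attempts]
poll_workflow() {
    id="$1"
    interval="${2:-30}"
    attempts="${3:-20}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        echo "--- attempt $i of $attempts ---"
        curl -s "$(workflow_url "$id")" -X GET
        echo
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}

# Example:
#   poll_workflow '<workflow_ID>' 30 20
```

Stop the loop with Ctrl+C once the output shows that the workflow has reached a successful state.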

Once the execution has completed successfully, return to the VCF Installer UI and retry the failed deployment task.

Additional Information

For retrying a failed internal workflow because of an invalid input spec.

Similar issue and confirm network communication