The validation stage of the Add VxRail cluster wizard fails to launch. A red banner error message reports "Error in validating VxRail payload. Task not found."
The domainmanager.log on the SDDC Manager shows a certificate issue when SDDC Manager calls a VxRail proprietary API (GET https://<vxrManager_FQDN>:443/rest/vxm/v2/system/initialize/nodes) on the VxRail Manager itself:
2023-09-18T10:06:27.597+0000 INFO [vcf_dm,01d66a7853a545d4,2316] [c.v.e.s.c.v.VxRailManagerService,dm-exec-19] Fetching nodes discovered and managed by VxRail Manager VXRAILMANAGER.lab using endpoint /rest/vxm/v2/system/initialize/nodes
2023-09-18T10:06:27.597+0000 DEBUG [vcf_dm,01d66a7853a545d4,2316] [c.v.v.secure.http.HttpClientService,dm-exec-19] Starting GET request from host: VMVXWD2VXMGR01.ocb.lab, port: 443, isSecure: true, path: /rest/vxm/v2/system/initialize/nodes, queryParamMap: null, headers: {Accept=application/json, Content-Type=application/json}
2023-09-18T10:06:27.598+0000 DEBUG [vcf_dm,01d66a7853a545d4,2316] [c.v.v.secure.http.HttpClientService,dm-exec-19] Making request: GET https://VXRAILMANAGER.lab:443/rest/vxm/v2/system/initialize/nodes
2023-09-18T10:06:27.613+0000 DEBUG [vcf_dm,01d66a7853a545d4,8fe8] [c.v.v.s.t.DynamicTrustManager,dm-exec-19] Checking validity of certificate chain CN=VxRail, OU=VxRailApplianceServer, O=VMware, L=vsphere, ST=local, C=US
2023-09-18T10:06:27.617+0000 DEBUG [vcf_dm,01d66a7853a545d4,8fe8] [c.v.v.s.t.DynamicTrustManager,dm-exec-19] Error checking certificate chain CN=VxRail, OU=VxRailApplianceServer, O=VMware, L=vsphere, ST=local, C=US for validity.
sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at java.base/sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:439)
Notice that the VxRail Manager short name is in uppercase in the request (VXRAILMANAGER.lab).
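As a rough aid for spotting this pattern in a large log, the following Python sketch pulls the host names out of "Making request: GET https://..." entries and counts PKIX failures. The function names and the two sample lines (abbreviated from the entries above, with the prefixes trimmed) are illustrative assumptions, not part of any VMware or Dell tooling.

```python
import re

# Matches the host in entries such as
# "Making request: GET https://VXRAILMANAGER.lab:443/rest/..."
REQUEST_RE = re.compile(r"Making request: GET https://([^:/\s]+)")
PKIX_MARKER = "PKIX path building failed"


def scan_domainmanager_log(text):
    """Return (requested_hosts, pkix_failure_count) for a chunk of log text."""
    hosts = REQUEST_RE.findall(text)
    failures = text.count(PKIX_MARKER)
    return hosts, failures


def has_uppercase(host):
    """True when the host name contains any uppercase character."""
    return host != host.lower()


# Abbreviated sample based on the log excerpt above (hypothetical trimming).
sample = (
    "Making request: GET https://VXRAILMANAGER.lab:443/rest/vxm/v2/system/initialize/nodes\n"
    "sun.security.validator.ValidatorException: PKIX path building failed: "
    "unable to find valid certification path to requested target\n"
)

hosts, failures = scan_domainmanager_log(sample)
for host in hosts:
    print(host, "contains uppercase:", has_uppercase(host))
print("PKIX failures:", failures)
```

To scan a real log, the file contents could be read into a string and passed to scan_domainmanager_log in the same way.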
Shortly after this, the domainmanager log records the failure seen in the UI:
2023-09-18T10:08:25.975+0000 DEBUG [vcf_dm,8ae3fe38c9994ea1,f3e3] [c.v.e.s.c.s.a.i.InventoryServiceAdapterImpl,http-nio-127.0.0.1-7200-exec-7] Fetching NSX Clusters data from inventory
2023-09-18T10:08:26.027+0000 INFO [vcf_dm,675a66e8151a4ec4,5a6c] [c.v.v.v.c.v1.DomainController,http-nio-127.0.0.1-7200-exec-9] Getting validation with id: 6812af1a-4d5c-4184-a798-fde4c68c5a3e
2023-09-18T10:08:26.030+0000 ERROR [vcf_dm,675a66e8151a4ec4,5a6c] [c.v.e.s.o.d.i.OrchestratorDataImpl,http-nio-127.0.0.1-7200-exec-9] In OrchestratorDataImpl, execution does not exist
2023-09-18T10:08:26.031+0000 ERROR [vcf_dm,675a66e8151a4ec4,5a6c] [c.v.e.s.o.d.i.OrchestratorDataImpl,http-nio-127.0.0.1-7200-exec-9] In OrchestratorDataImpl, failed to get execution
com.vmware.evo.sddc.orchestrator.model.error.ExecutionException: null
at com.vmware.evo.sddc.orchestrator.dal.impl.OrchestratorDataImpl.getExecutionContext (OrchestratorDataImpl.java:863)
Running nslookup against the VxRail Manager IP address shows that the FQDN is returned in all lowercase.
Checking the certificate on the VxRail manager shows the FQDN in uppercase.
This issue can be caused by a difference in case between the FQDN of the VxRail Manager as recorded in DNS and the FQDN as recorded in the VxRail Manager's certificate.
NOTE: Treat the case returned by nslookup as the case required in the wizard and the VxRail manager certificate.
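The comparison described above can be sketched in Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the certificate name is assumed to have been read separately and passed in as a string, and socket.gethostbyaddr uses the system resolver, so it may consult local hosts files as well as DNS.

```python
import socket


def reverse_lookup(ip_address):
    """Return the primary host name the system resolver reports for an IP.

    Roughly comparable to the name nslookup returns for a reverse query.
    """
    return socket.gethostbyaddr(ip_address)[0]


def case_only_mismatch(dns_fqdn, other_fqdn):
    """True when two names are equal ignoring case but differ in case."""
    a = dns_fqdn.rstrip(".")  # tolerate a trailing root dot
    b = other_fqdn.rstrip(".")
    return a.lower() == b.lower() and a != b


# Values taken from the example in this article.
dns_name = "vxrailmanager.lab"    # as returned by nslookup
cert_name = "VXRAILMANAGER.lab"   # as shown in the certificate

if case_only_mismatch(dns_name, cert_name):
    print("Case differs: use", dns_name, "in the wizard and the certificate")
```

On a host that can resolve the VxRail Manager, dns_name could instead be obtained with reverse_lookup and the VxRail Manager IP address.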