Unable to add VxRail Cluster to WLD, Error: Failed to load the cluster details. Failed to fetch the response due to some internal error



Article ID: 322237


Updated On:

Products

VMware Cloud Foundation

Issue/Introduction

This article applies in the following scenarios:
  • The workflow to create a new WLD cluster is initiated from SDDC Manager and fails with the above error during host addition.
  • The hosts in one of the neighboring workload domain clusters are powered off or have reachability errors.


Symptoms:
  • Adding a host to the VxRail cluster fails with the error: Failed to load Host cluster details. Invalid parameter: (0)
  • In the SDDC Manager UI, the task fails at "Gather input to add host to NSX-T Fabric".
  • Hosts are stuck in the Activating state.
  • The Domainmanager.log file shows entries similar to the following:
2021-10-25T16:56:33.798+0000 INFO [vcf_dm,e73acd699f72ea18,d397] [c.v.v.v.c.f.a.VxRailNsxtAddClusterHeader,dm-exec-3] Host esx10.vmware.local with id 56b1d8fd-553d-17y3-9eab-4ad2bd483078 is picked-up for determining NSX-T environment

2021-10-25T16:56:34.417+0000 ERROR [vcf_dm,e73acd699f72ea18,d397] [c.v.v.v.c.f.a.VxRailNsxtAddClusterHeader,dm-exec-3] Error occurred while generating input for add hosts in cluster in nsxt environment

2021-10-25T18:44:33.345+0000 INFO [vcf_dm,013ccaf18f704d6b,7fb9] [c.v.v.c.n.s.c.c.NsxtManagerClusterOperations,dm-exec-8] Retrieved cluster nodes from NSX-T manager {192.168.1.24=32ea1142-05a4-3e45-6d82-62d8a337cdc4, 192.168.1.100=66451142-0599-2c60-8d29-0218ecaec8bd, 192.168.1.103=febf1142-8e61-938d-43cd-87ae8ddc78ef}
2021-10-25T18:44:33.346+0000 DEBUG [vcf_dm,013ccaf18f704d6b,7fb9] [c.v.v.c.f.p.n.h.NsxtDomainResourceStateHelper,dm-exec-8] Nsxt manager VM Ip's are: [192.168.1.101, 192.168.1.102, 192.168.1.103]
2021-10-25T18:44:33.346+0000 DEBUG [vcf_dm,013ccaf18f704d6b,7fb9] [c.v.v.c.f.p.n.h.NsxtDomainResourceStateHelper,dm-exec-8] Error occurred while retrieving NSX-T management vms by IP address
java.lang.NullPointerException: null
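
To confirm that a log shows the same signatures, the excerpt above can be searched for its two error messages and for the host that was picked up. The following is a minimal sketch only: the function name scan_log is hypothetical, and the message fragments are copied from the excerpt above rather than from any documented list of errors.

```python
import re

# Message fragments copied from the log excerpt in this article.
SIGNATURES = (
    "Error occurred while generating input for add hosts in cluster in nsxt environment",
    "Error occurred while retrieving NSX-T management vms by IP address",
)

# Matches the host-selection message and captures the host name.
PICKED_HOST = re.compile(
    r"Host (\S+) with id \S+ is picked-up for determining NSX-T environment"
)


def scan_log(lines):
    """Return (matching_lines, picked_hosts) for an iterable of log lines.

    scan_log is a hypothetical helper, not part of any VMware tooling.
    """
    matches = []
    hosts = []
    for line in lines:
        line = line.rstrip("\n")
        if any(sig in line for sig in SIGNATURES):
            matches.append(line)
        found = PICKED_HOST.search(line)
        if found:
            hosts.append(found.group(1))
    return matches, hosts
```

The function accepts any iterable of lines, so an open file handle for a copy of the log can be passed in directly. An empty result does not rule out the issue, because the messages may differ between releases.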


Environment

VMware Cloud Foundation on VxRail 4.2
VMware Cloud Foundation 4.x

Cause

The Add Host workflow always fetches port group information from the hosts in the neighboring clusters, and it completes only if there are no reachability errors to those hosts. The issue occurs if any host in a neighboring cluster is powered off, has a hardware fault, or has an FQDN that is not reachable.

Resolution

This is a known issue affecting VMware Cloud Foundation on VxRail. Currently, there is no resolution.

Please refer to the workaround section to bypass the error.

Workaround:
If the hosts can be powered on:
  • Power on all the hosts in the neighboring clusters.
  • Make sure the FQDNs of all the ESXi hosts are reachable.
  • Go to SDDC Manager and retry the failed Add Host workflow.
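
Before retrying the workflow, FQDN reachability can be checked from a machine that uses the same DNS servers. The following is a minimal sketch, not a VMware-provided tool: the function names are hypothetical, and it assumes that a successful TCP connection to port 443 is an adequate indication that a host is reachable.

```python
import socket


def check_host(fqdn, port=443, timeout=3.0, connect=socket.create_connection):
    """Classify one host as 'ok', 'dns_failure', or 'unreachable'.

    Hypothetical helper. Port 443 is an assumption, not taken from the article.
    The connect parameter exists so the function can be tested without a network.
    """
    try:
        conn = connect((fqdn, port), timeout=timeout)
    except socket.gaierror:
        # Name resolution failed (gaierror is a subclass of OSError,
        # so it must be caught first).
        return "dns_failure"
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return "unreachable"
    conn.close()
    return "ok"


def check_hosts(fqdns, **kwargs):
    """Return a dict mapping each FQDN to its check_host result."""
    return {fqdn: check_host(fqdn, **kwargs) for fqdn in fqdns}
```

For example, check_hosts(["esx10.vmware.local"]) returns a dictionary with one status per host; any host not reported as "ok" should be investigated before the workflow is retried.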

If the hosts cannot be powered on:

Perform the steps below to remove the disconnected hosts from the DVS and complete the Add Host workflow:

Note that a VxRail environment has two DVSs:
 - A DVS created during the VxRail automation process for management purposes
 - A DVS created as part of the VCF setup for NSX-T
  • Remove the powered-off hosts from the DVS they are connected to through the vCenter UI. (Remove the hosts only from the VxRail-created DVS.)
  • Go to SDDC Manager and retry the failed Add Host workflow.
To add the powered-off or disconnected hosts back to the DVS at a later point:
  • Power on the hosts and add them back to the DVS from the vCenter Server UI. (Note: disconnected or offline hosts cannot be added back to the DVS until they are powered on and connected.)
NOTE: When a host is powered off and its DVS connection is removed in vCenter, the DVS details are removed only from the vCenter inventory. When the host is powered on again, it retains its original configuration and should be able to rejoin the management network, but no changes can be made to its network stack. The DVS configuration therefore needs to be pushed back to the host.