Adding a host takes a long time because the workflow fetches portgroup information for hosts from vCenter by prefix. For the vSAN and vMotion transport types, the portgroup names do not match exactly, so the workflow traverses all portgroups for each host in vCenter. Depending on the number of portgroups, the workflow can take hours or even days to complete.
In the Domain Manager logs, entries similar to the following can be observed:
INFO [vcf_dm,6808f6927e9a29749192c6b73fa04f17,2792] [c.v.v.v.p.service.HostSystemPlugin,dm-exec-11] host name <sample_host_name.domain.local>
INFO [vcf_dm,6808f6927e9a29749192c6b73fa04f17,2792] [c.v.v.v.p.service.HostSystemPlugin,dm-exec-11] value of nic size for host <sample_host_name.domain.local>, 4, [NicInfo(name=management.key-vim.host.VirtualNic-vmk2, type=management, ipAddress=XX.XX.XX.XX, defaultGateway=XX.XX.XX.XX, subnetMask=XX.XX.XX.XX, portGroup=Management Network-XXXX, portGroupKey=dvportgroup-XX), NicInfo(name=management.key-vim.host.VirtualNic-vmk0, type=management, ipAddress=, defaultGateway=XX.XX.XX.XX, subnetMask=, portGroup=VxRail Management-XX.XX.XX.XX, portGroupKey=dvportgroup-XX), NicInfo(name=vmotion.key-vim.host.VirtualNic-vmk4, type=vmotion, ipAddress=XX.XX.XX.XX, defaultGateway=XX.XX.XX.XX, subnetMask=XX.XX.XX.XX, portGroup=null, portGroupKey=null), NicInfo(name=vsan.key-vim.host.VirtualNic-vmk3, type=vsan, ipAddress=XX.XX.XX.XX, defaultGateway=XX.XX.XX.XX, subnetMask=XX.XX.XX.XX, portGroup=null, portGroupKey=null)]
The log message shows that portGroup is null for both the vSAN and vMotion NICs. As a result, the system traverses all portgroups, which causes the delay.
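To spot this symptom quickly, the NicInfo entries with a null portGroup can be pulled out of the log text. The following is a minimal sketch, not part of the official procedure: the function name is hypothetical, and it assumes the NicInfo(...) layout shown in the sample log above. It reads log text on standard input, so the log file location is left to the reader.

```shell
# Hypothetical helper: print the "type" of every NicInfo entry whose
# portGroup is null. Assumes the NicInfo(...) layout from the sample log.
null_portgroup_nic_types() {
  # 1. Split the line into individual NicInfo(...) entries.
  # 2. Keep only the entries with portGroup=null.
  # 3. Reduce each entry to the value of its type= field.
  grep -o 'NicInfo([^)]*)' \
    | grep 'portGroup=null' \
    | sed -E 's/.*type=([^,]*),.*/\1/'
}

# Example usage (LOG_FILE is a placeholder for the Domain Manager log):
#   null_portgroup_nic_types < "$LOG_FILE" | sort | uniq -c
```

If the output lists vmotion and vsan, the host matches the condition described in this article.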
VCF on VxRail 5.X
The add host workflow fetches portgroup information for hosts by prefix. The prefix does not match because the customer modified the portgroup names for the vSAN and vMotion types after adding the host to VxRail.
Currently, the code fetches host portgroup information from vCenter by prefix. The workaround below switches it to fetch portgroups by MoR (managed object reference) instead of by prefix. After applying it, test the workflow again.
Workaround Steps:
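The difference between the two strategies can be illustrated with a toy lookup. This is only a sketch of the general idea, not the product's actual code: the inventory format ("<MoR key> <portgroup name>" per line), the function names, and the sample names are all hypothetical. It shows why a name-prefix lookup finds nothing once a portgroup has been renamed, while a lookup by MoR key is unaffected.

```shell
# Hypothetical inventory on stdin: one "<MoR key> <portgroup name>" per line.

# Lookup by name prefix: prints the keys of portgroups whose name
# starts with the prefix given in $1.
find_pg_by_prefix() {
  awk -v p="$1" '{
    name = substr($0, length($1) + 2)   # everything after the key
    if (index(name, p) == 1) print $1   # name starts with the prefix
  }'
}

# Lookup by MoR key: prints the name of the portgroup whose key is $1,
# regardless of what the portgroup is currently called.
find_pg_by_mor() {
  awk -v k="$1" '$1 == k { print substr($0, length($1) + 2) }'
}

# Example: a vMotion portgroup that was renamed after the host was added.
inventory='dvportgroup-11 Renamed vMotion PG
dvportgroup-12 Management Network-abc'

printf '%s\n' "$inventory" | find_pg_by_prefix "vMotion"       # no match
printf '%s\n' "$inventory" | find_pg_by_mor "dvportgroup-11"   # still found
```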
curl http://localhost:7200/domainmanager/features/list | json_pp | grep -E 'feature.vxrail.cluster.discovery.optimisation.queryPGByPrefixStrategy|feature.vxrail.cluster.discovery.optimisation.queryPGByMoRStrategy'
Output:
"feature.vxrail.cluster.discovery.optimisation.queryPGByMoRStrategy" : "false",
"feature.vxrail.cluster.discovery.optimisation.queryPGByPrefixStrategy" : "true",
Edit feature.properties and set the following values:
vi feature.properties
feature.vxrail.cluster.discovery.optimisation.queryPGByMoRStrategy=true
feature.vxrail.cluster.discovery.optimisation.queryPGByPrefixStrategy=false
Verify the ownership and permissions of feature.properties:
-rw-r--r-- 1 vcf vcf 203 May 5 00:00 feature.properties
curl http://localhost:7200/domainmanager/features/list | json_pp | grep -E 'feature.vxrail.cluster.discovery.optimisation.queryPGByPrefixStrategy|feature.vxrail.cluster.discovery.optimisation.queryPGByMoRStrategy'
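Output: the two values should now be reversed, with queryPGByMoRStrategy reported as "true" and queryPGByPrefixStrategy reported as "false".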
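The final verification can also be scripted instead of read by eye. The helper below is a minimal sketch under stated assumptions: the function name is hypothetical, and it assumes the pretty-printed "key" : "value" layout shown in the earlier output. It reads the feature list on standard input and succeeds only when the MoR strategy is enabled and the prefix strategy is disabled.

```shell
# Hypothetical helper: exit 0 only if queryPGByMoRStrategy is "true"
# and queryPGByPrefixStrategy is "false" in the feature list on stdin.
check_pg_strategy() {
  input=$(cat)
  base='feature.vxrail.cluster.discovery.optimisation'
  printf '%s\n' "$input" \
    | grep -Eq "\"$base\.queryPGByMoRStrategy\" *: *\"true\"" || return 1
  printf '%s\n' "$input" \
    | grep -Eq "\"$base\.queryPGByPrefixStrategy\" *: *\"false\"" || return 1
}

# Example usage with the command from the steps above:
#   curl http://localhost:7200/domainmanager/features/list | json_pp \
#     | check_pg_strategy && echo "MoR strategy is active"
```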