While adding a new ESXi host to an existing cluster, the operation failed and the following entries were logged in vpxd.log:
####-##-##T##:##:##.###Z info vpxd[809151] [Originator@6876 sub=vpxLro opID=73596264-78] [VpxLRO] -- BEGIN lro-710555554 -- AuthorizationManager -- vim.AuthorizationManager.hasPrivilegeOnEntities -- 52b88897-b678-fccc-bc78-ea75f9476370(5256cd77-9ae6-d0c8-c664-f090c9567f1b)
####-##-##T##:##:##.###Z info vpxd[809151] [Originator@6876 sub=vpxLro opID=73596264-78] [VpxLRO] -- FINISH lro-710555554
####-##-##T##:##:##.###Z error vpxd[809162] [Originator@6876 sub=DAS opID=########-#######-####-#####-##] Timed out waiting for election to complete or for host to join existing master
####-##-##T##:##:##.###Z error vpxd[809162] [Originator@6876 sub=DAS opID=########-#######-####-#####-##] EnableDAS failed on host [vim.HostSystem:####-#####,xxxxxxxxxxxxxxxxxxxx]: N3Vim5Fault8Timedout9ExceptionE(Fault cause: vim.fault.Timedout
--> )
-->
####-##-##T##:##:##.###Z info vpxd[809162] [Originator@6876 sub=MoHost opID=########-#######-####-#####-##] VC state for host ####-##### (uninitialized -> retry), FDM state (UNKNOWN_FDM_HSTATE -> UNKNOWN_FDM_HSTATE), src of state (null -> null)
####-##-##T##:##:##.###Z info vpxd[809162] [Originator@6876 sub=vpxLro opID=########-#######-####-#####-##] [VpxLRO] -- FINISH lro-710546293
####-##-##T##:##:##.###Z error vpxd[809162] [Originator@6876 sub=Default opID=########-#######-####-#####-##] [VpxLRO] -- ERROR lro-710546293 -- -- -- DasConfig.ConfigureHost: :vim.fault.Timedout
--> Result:
--> (vim.fault.Timedout) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = <unset>
--> msg = ""
--> }
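To confirm that a given vpxd.log contains the same failure signature, the excerpt above can be searched for its three characteristic messages. The following is a minimal sketch, not an official tool: the function name `find_ha_timeout_lines` and the pattern list are hypothetical, and the log path is supplied by the caller rather than assumed.

```python
import re
import sys

# Signatures copied from the vpxd.log excerpt above (pattern list is illustrative).
HA_TIMEOUT_PATTERNS = [
    re.compile(r"Timed out waiting for election to complete"),
    re.compile(r"EnableDAS failed on host"),
    re.compile(r"DasConfig\.ConfigureHost.*vim\.fault\.Timedout"),
]


def find_ha_timeout_lines(lines):
    """Return (line_number, text) pairs for lines matching any signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in HA_TIMEOUT_PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is passed as the first argument; no default location is assumed.
    with open(sys.argv[1], errors="replace") as handle:
        for number, text in find_ha_timeout_lines(handle):
            print(f"{number}: {text}")
```

Run against a copy of the log, it prints each matching line with its line number, which makes it easier to correlate the timeout with the opID of the HA configuration task.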
- Restarting the management agents, and removing the host from the inventory and adding it back, did not fix the issue.
- MTU was configured properly.
- DNS records were created properly: forward and reverse lookup zones are configured correctly.
- Hosts are able to communicate on port 8182.
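The DNS and port checks in the list above can be repeated from any machine with Python and network access to the hosts. This is a rough sketch under stated assumptions: the helper names `forward_and_reverse_match` and `tcp_port_open` are hypothetical, the hostname is a placeholder, and the connect test covers TCP reachability only, so it does not replace the checks described in the KB.

```python
import socket

FDM_PORT = 8182  # port named in the checks above


def forward_and_reverse_match(hostname):
    """Resolve hostname to an IPv4 address and check that the reverse
    lookup of that address returns the same name."""
    try:
        address = socket.gethostbyname(hostname)
        reverse_name, _aliases, _addresses = socket.gethostbyaddr(address)
    except OSError:  # covers socket.gaierror and socket.herror
        return False
    return reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()


def tcp_port_open(host, port=FDM_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder name: replace with the FQDN of the ESXi host being added.
    esxi_host = "esxi01.example.com"
    print("DNS forward/reverse match:", forward_and_reverse_match(esxi_host))
    print(f"TCP {FDM_PORT} reachable:", tcp_port_open(esxi_host))
```

Both helpers return a plain boolean rather than raising, so they can be looped over every host in the cluster without extra error handling.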
vCenter 7.x
ESXi 7.x
The HA configuration on an ESXi host times out and the host's HA status changes to "retrying".
All the points mentioned in KB318936 were verified.
If the ESXi hosts are prepared for NSX, check the host status in NSX Manager. If the host status shows "validation error", resolve that error in NSX Manager first, then reconfigure HA on the host.