Deploying an Edge Cluster in a workload domain fails at the "Verify NSX BGP Peering" step. Further validation confirms that the BGP status is 'Active' and connected. Placing NSX in maintenance mode and then exiting it, and rebooting the edge nodes, do not resolve the issue.
This article provides steps to identify the status of the NSX BGP peer and to engage Broadcom Support so that the Edge deployment can complete successfully.
Error Message:
Description: Verify NSX BGP Peering
Progress Messages: Failed to validate BGP Neighbor Peering Status for edge node <IP address>
Error Message: Failed to validate BGP Neighbor Peering Status for edge node <IP address>
Remediation Message:
Reference Token: ####
Cause: Retriable operation 'Check BGP Neighbor Connection state for Edge Neighbors
[Tier0BgpNeighborInfo(nsxtId=null, neighborIp=<IP address>, remoteAsn=42####2, maximumHopLimit=0, sourceIps=[<IP address>]), Tier0BgpNeighborInfo(nsxtId=null, neighborIp=<IP address>, remoteAsn=42####2, maximumHopLimit=0, sourceIps=[<IP address>])]' failed to complete after 3 retries. InternalServerError (com.vmware.vapi.std.errors.internal_server_error) => { messages = [], data = struct => {error_message=Internal server error has occurred., details=Request processing failed; nested exception is com.vmware.nsx.management.policy.policyframework.exceptions.ObjectNotFoundException: Enforcement point connection info not found, error_code=99, module_name=common-services}, errorType = INTERNAL_SERVER_ERROR
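To know which BGP neighbors to validate, the neighbor entries can be pulled out of the Cause text. The following is a minimal sketch, assuming the entries follow the Tier0BgpNeighborInfo(...) layout seen in domainmanager.log; the function name `parse_neighbors` and the dictionary keys are illustrative and not part of any VMware tooling.

```python
import re

# One Tier0BgpNeighborInfo(...) entry, as printed in the Cause text.
# Field order is assumed to match the log excerpt in this article.
NEIGHBOR_RE = re.compile(
    r"Tier0BgpNeighborInfo\(\s*nsxtId=(?P<nsxt_id>[^,]*),\s*"
    r"neighborIp=(?P<neighbor_ip>[^,]*),\s*"
    r"remoteAsn=(?P<remote_asn>[^,]*),\s*"
    r"maximumHopLimit=(?P<max_hops>[^,]*),\s*"
    r"sourceIps=\[(?P<source_ips>[^\]]*)\]\s*\)"
)


def parse_neighbors(cause_text):
    """Return one dict per BGP neighbor entry found in the Cause text."""
    neighbors = []
    for match in NEIGHBOR_RE.finditer(cause_text):
        # Strip whitespace so values wrapped across log lines are cleaned up.
        entry = {key: value.strip() for key, value in match.groupdict().items()}
        entry["source_ips"] = [
            ip.strip() for ip in entry["source_ips"].split(",") if ip.strip()
        ]
        neighbors.append(entry)
    return neighbors
```

Each returned entry gives the neighbor IP, remote ASN, and source IPs that should be checked when confirming the BGP neighbor status.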
Error message as shown in domainmanager.log
/var/log/vmware/vcf/domainmanager/domainmanager.log
YYYY-MM-DDTHH:MIN DEBUG [vcf_dm,67###2,18b2] [c.v.v.c.n.s.c.c.ApiConnection,dm-exec-8] Closed ApiClient connection.
YYYY-MM-DDTHH:MIN ERROR [vcf_dm,67###2,18b2] [c.v.e.s.o.model.error.ErrorFactory,dm-exec-8] [M##5] FAILED_TO_VALIDATE_BGP_NEIGHBOR_PEERING_STATUS Failed to validate BGP Neighbor Peering Status for edge node <IP address>
com.vmware.evo.sddc.orchestrator.exceptions.OrchTaskException: Failed to validate BGP Neighbor Peering Status for edge node <IP address>
at com.vmware.vcf.common.fsm.plugins.nsxt.policy.action.VerifyBgpPeeringNsxApiAction.execute(VerifyBgpPeeringNsxApiAction.java:137)
at com.vmware.vcf.common.fsm.plugins.nsxt.policy.action.VerifyBgpPeeringNsxApiAction.execute(VerifyBgpPeeringNsxApiAction.java:39)
at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionState.invoke(FsmActionState.java:62)
at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionPlugin.invoke(FsmActionPlugin.java:159)
at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionPlugin.invoke(FsmActionPlugin.java:144)
at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.invokeMethod(ProcessingTaskSubscriber.java:400)
at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.processTask(ProcessingTaskSubscriber.java:520)
at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.accept(ProcessingTaskSubscriber.java:124)
at jdk.internal.reflect.GeneratedMethodAccessor910.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at com.google.common.eventbus.Subscriber.invokeSubscriberMethod(Subscriber.java:85)
at com.google.common.eventbus.Subscriber.lambda$dispatchEvent$0(Subscriber.java:71)
at com.vmware.vcf.common.tracing.TraceRunnable.run(TraceRunnable.java:59)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.RuntimeException: Retriable operation 'Check BGP Neighbor Connection state for Edge Neighbors[Tier0BgpNeighborInfo(nsxtId=null, neighborIp=<IP address>, remoteAsn=42###2, maximumHopLimit=0, sourceIps=[<IP address>]), Tier0BgpNeighborInfo(nsxtId=null, neighborIp=<IP address>, remoteAsn=42###2, maximumHopLimit=0, sourceIps=[<IP address>])]' failed to complete after 3 retries.
at com.vmware.evo.sddc.common.util.RetriableCallable.call(RetriableCallable.java:222)
at com.vmware.vcf.common.fsm.plugins.nsxt.policy.action.VerifyBgpPeeringNsxApiAction.execute(VerifyBgpPeeringNsxApiAction.java:132)
... 16 common frames omitted
Caused by: com.vmware.vapi.std.errors.InternalServerError: InternalServerError (com.vmware.vapi.std.errors.internal_server_error) => {
messages = [],
data = struct => {error_message=Internal server error has occurred., details=Request processing failed; nested exception is com.vmware.nsx.management.policy.policyframework.exceptions.ObjectNotFoundException: Enforcement point connection info not found, error_code=99, module_name=common-services},
errorType = INTERNAL_SERVER_ERROR
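To confirm that a failed deployment matches this article, domainmanager.log can be checked for both the failure code and the underlying exception text shown above. The following is a minimal sketch; the function name `matches_known_signature` is illustrative, and the two marker strings are taken verbatim from the log excerpt.

```python
import re

# Marker strings copied from the domainmanager.log excerpt in this article.
FAILURE_MARKER = "FAILED_TO_VALIDATE_BGP_NEIGHBOR_PEERING_STATUS"
ROOT_CAUSE_MARKER = "Enforcement point connection info not found"


def matches_known_signature(log_text):
    """Return True when both the failure code and the root-cause text appear."""
    # Collapse runs of whitespace so messages wrapped across lines still match.
    normalized = re.sub(r"\s+", " ", log_text)
    return FAILURE_MARKER in normalized and ROOT_CAUSE_MARKER in normalized
```

A typical use is to read /var/log/vmware/vcf/domainmanager/domainmanager.log on the SDDC Manager appliance into a string and pass it to the function; a result of True suggests the failure matches the signature described here.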
SDDC Manager 5.x
Stale entries for incomplete tasks remain in the Domain Manager table of the SDDC database.
After validating that the BGP neighbor status shows as 'Success', raise a support case with Broadcom Support for further troubleshooting.