NSX Edge deployment fails during BGP Connectivity Verification with an error 'Failed to validate BGP Neighbor Peering Status for Edge Node'

Article ID: 387110

Updated On:

Products

VMware SDDC Manager

Issue/Introduction

Deploying an Edge Cluster in a workload domain fails at the "Verify NSX BGP Peering" step. Further validation confirms that the BGP status is 'Active' and connected. Placing NSX in maintenance mode and then exiting it, or rebooting the Edge nodes, does not resolve the issue.

This article provides steps to check the status of the NSX BGP peer and to engage Broadcom Support so that the Edge deployment can complete successfully.

Error Message:

Description: Verify NSX BGP Peering
Progress Messages: Failed to validate BGP Neighbor Peering Status for edge node <IP address>
Error Message: Failed to validate BGP Neighbor Peering Status for edge node <IP address>
Remediation Message:
Reference Token: ####
Cause: Retriable operation 'Check BGP Neighbor Connection state for Edge Neighbors[Tier0BgpNeighborInfo(nsxtId=null, neighborIp=<IP address>, remoteAsn=42####2, maximumHopLimit=0, sourceIps=[<IP address>]), Tier0BgpNeighborInfo(nsxtId=null, neighborIp=<IP address>, remoteAsn=42####2, maximumHopLimit=0, sourceIps=[<IP address>])]' failed to complete after 3 retries. InternalServerError (com.vmware.vapi.std.errors.internal_server_error) => { messages = [], data = struct => {error_message=Internal server error has occurred., details=Request processing failed; nested exception is com.vmware.nsx.management.policy.policyframework.exceptions.ObjectNotFoundException: Enforcement point connection info not found, error_code=99, module_name=common-services}, errorType = INTERNAL_SERVER_ERROR

Error message as shown in domainmanager.log

/var/log/vmware/vcf/domainmanager/domainmanager.log

YYYY-MM-DDTHH:MIN DEBUG [vcf_dm,67###2,18b2] [c.v.v.c.n.s.c.c.ApiConnection,dm-exec-8]  Closed ApiClient connection.
YYYY-MM-DDTHH:MIN ERROR [vcf_dm,67###2,18b2] [c.v.e.s.o.model.error.ErrorFactory,dm-exec-8]  [M##5] FAILED_TO_VALIDATE_BGP_NEIGHBOR_PEERING_STATUS Failed to validate BGP Neighbor Peering Status for edge node <IP address>
com.vmware.evo.sddc.orchestrator.exceptions.OrchTaskException: Failed to validate BGP Neighbor Peering Status for edge node <IP address>
        at com.vmware.vcf.common.fsm.plugins.nsxt.policy.action.VerifyBgpPeeringNsxApiAction.execute(VerifyBgpPeeringNsxApiAction.java:137)
        at com.vmware.vcf.common.fsm.plugins.nsxt.policy.action.VerifyBgpPeeringNsxApiAction.execute(VerifyBgpPeeringNsxApiAction.java:39)
        at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionState.invoke(FsmActionState.java:62)
        at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionPlugin.invoke(FsmActionPlugin.java:159)
        at com.vmware.evo.sddc.orchestrator.platform.action.FsmActionPlugin.invoke(FsmActionPlugin.java:144)
        at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.invokeMethod(ProcessingTaskSubscriber.java:400)
        at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.processTask(ProcessingTaskSubscriber.java:520)
        at com.vmware.evo.sddc.orchestrator.core.ProcessingTaskSubscriber.accept(ProcessingTaskSubscriber.java:124)
        at jdk.internal.reflect.GeneratedMethodAccessor910.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at com.google.common.eventbus.Subscriber.invokeSubscriberMethod(Subscriber.java:85)
        at com.google.common.eventbus.Subscriber.lambda$dispatchEvent$0(Subscriber.java:71)
        at com.vmware.vcf.common.tracing.TraceRunnable.run(TraceRunnable.java:59)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.RuntimeException: Retriable operation 'Check BGP Neighbor Connection state for Edge Neighbors[Tier0BgpNeighborInfo(nsxtId=null, neighborIp=<IP address>, remoteAsn=42###2, maximumHopLimit=0, sourceIps=[<IP address>]), Tier0BgpNeighborInfo(nsxtId=null, neighborIp=<IP address>, remoteAsn=42###2, maximumHopLimit=0, sourceIps=[<IP address>])]' failed to complete after 3 retries.
        at com.vmware.evo.sddc.common.util.RetriableCallable.call(RetriableCallable.java:222)
        at com.vmware.vcf.common.fsm.plugins.nsxt.policy.action.VerifyBgpPeeringNsxApiAction.execute(VerifyBgpPeeringNsxApiAction.java:132)
        ... 16 common frames omitted
Caused by: com.vmware.vapi.std.errors.InternalServerError: InternalServerError (com.vmware.vapi.std.errors.internal_server_error) => {
    messages = [],
    data = struct => {error_message=Internal server error has occurred., details=Request processing failed; nested exception is com.vmware.nsx.management.policy.policyframework.exceptions.ObjectNotFoundException: Enforcement point connection info not found, error_code=99, module_name=common-services},
    errorType = INTERNAL_SERVER_ERROR
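
To confirm that a given SDDC Manager shows the same signature, the log can be scanned for the error code and the "Caused by" lines that follow it. The snippet below is a minimal sketch, not a supported tool: the function names are hypothetical, and it assumes that each new log entry begins with an ISO-style timestamp (for example 2025-01-01T10:00:00), which is used to detect where a stack trace ends.

```python
import re

# Error code and log path as they appear in this article.
ERROR_CODE = "FAILED_TO_VALIDATE_BGP_NEIGHBOR_PEERING_STATUS"
DEFAULT_LOG = "/var/log/vmware/vcf/domainmanager/domainmanager.log"

# Assumption: every new log entry starts with a timestamp such as 2025-01-01T10:00:00.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2}T")


def find_bgp_validation_failures(log_text):
    """Return one dict per matching ERROR line, with the 'Caused by' lines of its stack trace."""
    failures = []
    current = None
    for line in log_text.splitlines():
        if ERROR_CODE in line:
            # Start collecting the stack trace that follows this entry.
            current = {"error": line.strip(), "causes": []}
            failures.append(current)
        elif current is not None and line.startswith("Caused by:"):
            current["causes"].append(line.strip())
        elif current is not None and ENTRY_START.match(line):
            # The next timestamped entry ends the current stack trace.
            current = None
    return failures


def scan_log_file(path=DEFAULT_LOG):
    """Read the log file at 'path' and return the failures found in it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_bgp_validation_failures(handle.read())
```

Calling scan_log_file() on the SDDC Manager appliance returns a list of matches. A cause line that contains "Enforcement point connection info not found" corresponds to the trace shown above.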

Environment

SDDC Manager 5.x

Cause

Stale entries for incomplete tasks remain in the Domain Manager table of the SDDC Manager database.

Resolution

After validating that the BGP neighbor status shows as 'Success', open a support case with Broadcom Support for further troubleshooting.

Follow the steps below to validate:
  1. Log in to NSX-T Manager.
  2. Navigate to "Networking" > "Logical Routers".
  3. Select your Tier-0 Gateway.
  4. Go to the "BGP" section.
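
The UI check above can also be cross-checked from a script. The snippet below is a sketch under stated assumptions, not a documented procedure: it assumes the NSX Policy API returns BGP neighbor status from a path of the form /policy/api/v1/infra/tier-0s/<tier-0-id>/locale-services/<locale-service-id>/bgp/neighbors/<neighbor-id>/status, and that each entry in "results" carries "neighbor_address" and "connection_state" fields. Verify both against the API reference for your NSX version. The function name is hypothetical, and the state treated as healthy is a parameter because the label shown in the UI may differ from the API value.

```python
def neighbors_not_in_state(status_payload, expected_state="ESTABLISHED"):
    """Return (neighbor_address, connection_state) pairs whose state differs from expected_state.

    'status_payload' is the decoded JSON body of a BGP neighbor status response.
    The field names used here are assumptions; see the lead-in.
    """
    mismatches = []
    for entry in status_payload.get("results", []):
        # Normalize the state so the comparison is case-insensitive.
        state = str(entry.get("connection_state", "UNKNOWN")).upper()
        if state != expected_state.upper():
            mismatches.append((entry.get("neighbor_address"), state))
    return mismatches
```

An empty list means that every neighbor in the payload reports the expected state, which is the evidence to attach to the support case.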