Edge cluster creation through SDDC Manager fails at "Verify NSX-T Data Center BGP Peering"

Article ID: 316035

Updated On:

Products

VMware Cloud Foundation

Issue/Introduction

Symptoms:

  • The Edge cluster creation workflow fails, and the Task view in the SDDC Manager UI shows that it failed at the "Verify NSX-T Data Center BGP Peering" action.
  • The domain manager log at /var/log/vmware/vcf/domainmanager/domainmanager.log shows that 'get bgp neighbor' failed to return any BGP neighbors for an Edge node:
    Command timed out: false
    2020-04-15T00:24:18.393+0000 DEBUG [74125e06972b5c5d,ff90] [c.v.e.s.common.util.SSHCommandUtil,dm-exec-1]  Executed command "vrf 1; get bgp neighbor | json; exit" against x.y.x.xx:22. Exit status: 4, output: %
    2020-04-15T00:24:18.394+0000 ERROR [74125e06972b5c5d,ff90] [c.v.e.s.o.model.error.ErrorFactory,dm-exec-1]  [Reference Token] FAILED_TO_CONNECT_TO_NSXT_EDGE Failed to connect to NSX-T edge.
    com.vmware.evo.sddc.orchestrator.exceptions.OrchTaskException: Failed to connect to NSX-T edge.
            at com.vmware.vcf.common.fsm.plugins.nsxt.action.VerifyBgpPeeringAction.execute(VerifyBgpPeeringAction.java:95)
            at com.vmware.vcf.common.fsm.plugins.nsxt.action.VerifyBgpPeeringAction.execute(VerifyBgpPeeringAction.java:32)
  • SDDC Manager displays a message in the UI similar to the following:

    Check BGP Neighbor Connection state for Edge Neighbors[Tier0BgpNeighborInfo(nsxtId=x.x.xx.xx_12345, neighborIp=x.x.xx.sy, remoteAsn=12345, maximumHopLimit=1, sourceIps=[x.x.x.yy, x.x.x.yx]), Tier0BgpNeighborInfo(nsxtId=x.x.x.gh_12345, neighborIp=x.x.xx.yy, remoteAsn=12344, maximumHopLimit=1, sourceIps=[x.x.x.yy, x.x.x.yx])]' failed to complete after 3 retries. InternalServerError (com.vmware.vapi.std.errors.internal_server_error) => { messages = [], data = struct => {error_message=Internal server error has occurred., details=Request processing failed; nested exception is com.vmware.nsx.management.policy.policyframework.exceptions.ObjectNotFoundException: Enforcement point connection info not found, error_code=99, module_name=common-services}, errorType = INTERNAL_SERVER_ERROR }

    To most users this looks like an error returned by NSX-T to SDDC Manager, but the problem may lie within SDDC Manager itself.

  • Example of the domainmanager.log output:

    Caused by: java.lang.RuntimeException: Retriable operation 'Check BGP Neighbor Connection state for Edge Neighbors[Tier0BgpNeighborInfo(nsxtId=null, neighborIp=xx.xx.xx.xx, remoteAsn=15001, sourceIps=[xx.xx.xx.xx]), Tier0BgpNeighborInfo(nsxtId=null, neighborIp=xx.xx.xx.xx, remoteAsn=15001, sourceIps=[xx.xx.xx.xx])]' failed to complete after 3 retries.
            at com.vmware.evo.sddc.common.util.RetriableCallable.call(RetriableCallable.java:196)
            at com.vmware.vcf.common.fsm.plugins.nsxt.policy.action.VerifyBgpPeeringNsxApiAction.execute(VerifyBgpPeeringNsxApiAction.java:132)
            ... 17 common frames omitted
    TimeStamp DEBUG [vcf_dm,f9f2f70fc9174e70,4fb2] [c.v.e.s.o.c.ProcessingTaskSubscriber,dm-exec-20]  Collected the following errors for task with name VerifyBgpPeeringNsxApiAction and ID Example ID: [ExecutionError [errorCode=null,
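
As a rough aid for confirming these symptoms, the following Python sketch counts the signature strings quoted in the excerpts above and collects the neighbor IP and ASN pairs from any Tier0BgpNeighborInfo entries. The helper names scan_lines and scan_file are hypothetical and not part of any VMware tool, and the signature list covers only the excerpts shown in this article, so a clean result does not rule out the issue.

```python
import re
from pathlib import Path

# Log path as given in this article.
DEFAULT_LOG = Path("/var/log/vmware/vcf/domainmanager/domainmanager.log")

# Strings copied from the log excerpts in this article (not an exhaustive list).
SIGNATURES = (
    "FAILED_TO_CONNECT_TO_NSXT_EDGE",
    "VerifyBgpPeeringAction",
    "VerifyBgpPeeringNsxApiAction",
    "Check BGP Neighbor Connection state for Edge Neighbors",
)

# Captures neighborIp and remoteAsn from a Tier0BgpNeighborInfo(...) entry.
NEIGHBOR_RE = re.compile(
    r"Tier0BgpNeighborInfo\([^)]*?neighborIp=([^,\s)]+),\s*remoteAsn=(\d+)"
)


def scan_lines(lines):
    """Return (hits, neighbors) for an iterable of log lines.

    hits maps each signature to the number of lines containing it;
    neighbors is a set of (neighbor_ip, remote_asn) tuples.
    """
    hits = {sig: 0 for sig in SIGNATURES}
    neighbors = set()
    for line in lines:
        for sig in SIGNATURES:
            if sig in line:
                hits[sig] += 1
        for ip, asn in NEIGHBOR_RE.findall(line):
            neighbors.add((ip, int(asn)))
    return hits, neighbors


def scan_file(path=DEFAULT_LOG):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with Path(path).open(errors="replace") as fh:
        return scan_lines(fh)
```

Calling scan_file() on the SDDC Manager appliance (or scan_file("copy-of-domainmanager.log") on a copied log) returns the counts and the neighbor set, which can then be compared against the BGP configuration that was entered in the Edge cluster workflow.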

Environment

VMware Cloud Foundation 4.x

VMware Cloud Foundation 5.x

Resolution

This is a known issue with NSX-T 3.x, NSX-T 4.x, and VMware Cloud Foundation.

Workaround:
To work around this issue, do one of the following:

  • Enable and then disable maintenance mode on the Edge.
  • Reboot the Edge VM.

To enable and then disable maintenance mode, use the following paths in the NSX-T UI:

  • System > Fabric > Nodes > Edge Transport Nodes > select the Edge transport node > Actions > Enter NSX Maintenance mode
  • System > Fabric > Nodes > Edge Transport Nodes > select the Edge transport node > Actions > Exit NSX Maintenance mode
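
Where the UI is not convenient, the same enter/exit cycle can in principle be scripted. The sketch below only builds the two HTTP requests and does not send them. It assumes the NSX Manager API exposes maintenance mode as a POST to /api/v1/transport-nodes/<node-id> with an action query parameter; this article does not state that endpoint, so verify it against the NSX API reference for your version. The function name, host name, and node ID are hypothetical, and authentication and certificate handling are deliberately left out.

```python
import urllib.parse
import urllib.request

# Assumed action values; confirm them in the NSX API reference for your release.
ALLOWED_ACTIONS = ("enter_maintenance_mode", "exit_maintenance_mode")


def maintenance_mode_request(nsx_manager, node_id, action):
    """Build (but do not send) a POST request for a maintenance-mode action.

    nsx_manager: NSX Manager host name or IP (hypothetical value in examples).
    node_id:     ID of the Edge transport node.
    action:      one of ALLOWED_ACTIONS.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # Assumed endpoint layout: /api/v1/transport-nodes/<node-id>?action=<action>
    url = (
        f"https://{nsx_manager}/api/v1/transport-nodes/"
        f"{urllib.parse.quote(node_id, safe='')}?action={action}"
    )
    return urllib.request.Request(url, method="POST")


# Build the enter/exit pair in the same order as the UI steps above.
requests_to_send = [
    maintenance_mode_request("nsx.example.local", "edge-node-id", a)
    for a in ALLOWED_ACTIONS
]
```

Sending the requests (for example with urllib.request.urlopen plus credentials) is left to the operator, with a pause between the two calls so the Edge has time to enter maintenance mode before it exits.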

If the workaround does not resolve the issue, open a support ticket with GSS through the Broadcom portal and reference this KB article.