NSX-T Bare Metal Edge upgrade caused tunnels down

Article ID: 394670

Products

VMware NSX

Issue/Introduction

  • When upgrading an NSX-T environment with Bare Metal Edges that use a bond for the management interface, the upgrade coordinator may show the Bare Metal Edge as "Failed" after reaching the last step of the upgrade.

    The error message indicates that the tunnel status on the edge is down:
    Edge node <Edge UUID>, has errors Errors = [{"moduleName":"upgrade-coordinator", "errorCode":30212, "errorMessage":"Overall status of the edge transport node <Edge UUID> is DOWN."}, {"moduleName":"upgrade-coordinator", "errorCode":30209, "errorMessage":"Tunnel status of the edge transport node <Edge UUID> is DOWN."}, ] after state sync wait.
  • When the get upgrade progress-status command is run on the Bare Metal Edge to check upgrade progress, all upgrade steps are reported as successful.
  • The TEP tunnel status for this Bare Metal Edge node shows "Down" for all peers.
  • When pinging from the Bare Metal Edge's VRF 0 interface, none of the peers are reachable. 
  • When checking the bond status, there may be only one member listed:
    bme> get interface bond0
    **************************************************************************************************************************************************************
    Node Upgrade has been started. Please do not make any changes, until the upgrade operation is complete. Run "get upgrade progress-status"
    to show the progress of last upgrade step.
    **************************************************************************************************************************************************************
    Thu Apr 17 2025 EDT 12:25:49.154
    Interface: bond0
    Address: <edge-management-IP>
    MAC address: <interface-MAC address>
    MTU: 1500
    Default gateway: <default-gateway>
    Broadcast address: <broadcast-address>
    Plane: mgmt
    KNI: False
    Bond mode: ACTIVE_BACKUP
    Bond primary slave: None
    Bond currently active slave: eth#
    Bond slaves: eth#
    Link status: up
    Admin status: up
  • The missing bond slave may have a new name such as "eth#-mlx". 
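
The bond check above can be scripted when several edges need to be inspected. The following is a minimal sketch, not an official tool: it parses text captured from get interface bond0 and reports which expected members are absent from the "Bond slaves" line. The function names are hypothetical, the list of expected members must be supplied by the operator, and the sketch assumes that multiple members appear on the "Bond slaves" line separated by commas or spaces (the output above shows only a single member, so that separator is an assumption).

```python
import re
from typing import Dict, List

# Field names taken from the sample output above; other lines (banner,
# timestamp) are ignored.
_FIELD_RE = re.compile(r"^\s*(Bond [^:]+|Interface|Plane|Link status):\s*(.*)$")


def parse_bond_status(cli_output: str) -> Dict[str, str]:
    """Collect the 'Key: value' fields of interest from the captured CLI text."""
    fields: Dict[str, str] = {}
    for line in cli_output.splitlines():
        match = _FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def missing_bond_members(cli_output: str, expected: List[str]) -> List[str]:
    """Return the expected members that are not listed under 'Bond slaves'."""
    slaves_raw = parse_bond_status(cli_output).get("Bond slaves", "")
    # Assumption: members are separated by commas and/or whitespace.
    present = {name for name in re.split(r"[,\s]+", slaves_raw) if name}
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    sample = """\
Interface: bond0
Plane: mgmt
Bond mode: ACTIVE_BACKUP
Bond primary slave: None
Bond currently active slave: eth0
Bond slaves: eth0
Link status: up
"""
    print(missing_bond_members(sample, ["eth0", "eth1"]))  # ['eth1']
```

An empty result means every expected member is still part of the bond; a non-empty result matches the symptom described in this article.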


Environment

VMware NSX-T Data Center

Cause

During the upgrade of an NSX-T Bare Metal Edge, the bond slaves are not configured properly: one of the bond members is removed from the bond and incorrectly used for the dataplane service.

Resolution

This issue is resolved in VMware NSX-T Data Center 3.2.2, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

 

Workaround: 

  1. Stop the dataplane service on this edge. This may cause traffic disruption:
    bme>stop service dataplane
  2. Add all the members back to the bond:
    bme>set interface bond# ip <management-IP> gateway <gateway> plane mgmt mode active-backup members eth#,eth# primary eth# 

    For example: set interface bond0 ip 192.168.1.100/24 gateway 192.168.1.1 plane mgmt mode active-backup members eth0,eth1 primary eth0

  3. Start the dataplane service:
    bme>start service dataplane

  4. Retry the upgrade.
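
When the workaround has to be applied to several edges, the command in step 2 can be assembled from per-edge values to avoid typing mistakes. The following is a minimal sketch under stated assumptions: the helper name build_bond_restore_command is hypothetical, the command layout is copied from step 2 of this article, and the checks (the primary must be one of the members, the gateway must fall inside the management subnet) are illustrative sanity checks rather than rules enforced by NSX. The script only produces the command text; it does not connect to the edge.

```python
import ipaddress
from typing import List


def build_bond_restore_command(
    bond: str, ip_cidr: str, gateway: str, members: List[str], primary: str
) -> str:
    """Build the 'set interface' command from step 2 of the workaround."""
    if not members:
        raise ValueError("at least one bond member is required")
    if primary not in members:
        raise ValueError("primary must be one of the bond members")
    # Raises ValueError if the address or gateway is malformed.
    interface = ipaddress.ip_interface(ip_cidr)
    if ipaddress.ip_address(gateway) not in interface.network:
        raise ValueError("gateway is outside the management subnet")
    return (
        f"set interface {bond} ip {ip_cidr} gateway {gateway} "
        f"plane mgmt mode active-backup members {','.join(members)} "
        f"primary {primary}"
    )


if __name__ == "__main__":
    print(
        build_bond_restore_command(
            "bond0", "192.168.1.100/24", "192.168.1.1", ["eth0", "eth1"], "eth0"
        )
    )
```

With the example values from step 2, the script prints the same command shown there. Review the generated command before pasting it into the edge CLI.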