T1 status shows "Down" after NSX Edge upgrade is completed


Article ID: 401502


Products

VMware NSX

Issue/Introduction

  • The issue occurs during an NSX upgrade from 3.2.x to 4.2.x.
  • After the Edge nodes are upgraded successfully, the T1 status shows Down.
  • The status of both the Active and Standby Edges shows Success with no errors.
  • The NSX Manager syslog.log file shows the following l2AppUfo events, repeated multiple times:
2025-xx-xxTxx:xx:31.826Z XXXXNSXXXX03 NSX 1493 - [nsx@6876 comp="nsx-controller" level="INFO" subcomp="l2AppUfo"] Event DomainObjectRegistryChangedEventImpl(added: [], updated: [LogicalSwitchPortBindings(xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.bindings)], removed: []) updated fields: {LogicalSwitchPortBindings(xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.bindings)=[vmware.nsx.ufostore.ccp.ipdiscovery.LogSwitchPortBindingsMsg.lsp_binding Value ip_address {#012  ip_address {#012    ipv4: xxxxxxxx00#012  }#012  prefix_length: 32#012}#012mac_address {#012  mac: xxxxxxxxxx07#012}#012discovered_binding_type: ADDRESS_BINDING_TYPE_ARP_SNOOPING#012update_time: xxxxxxxxxx753#012vlan_id: xxx#012is_active: true#012 operation ADDED]} added Relations: {} removed Relations: {} failed to find or create logical entry LogicalSwitchId:[id=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, vni=-1]
  • These entries are followed by events related to the T1 status going down:
2025-xx-xxTxx:xx:15.968Z XXXXNSXXXX03 NSX 4418 - [nsx@6876 audit="true" comp="nsx-manager" level="INFO" subcomp="manager"] UserName="[email protected]", ModuleName="API", Operation="GET@/api/v1/infra/tier-1s/XXX-XXX-XXX-T1-XX-XX/status", Operation status="failure"
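
The two log signatures above can be searched for together. The following is a minimal sketch, not an official tool: the regular expressions are derived only from the sample lines in this article, and the function name `summarize` is hypothetical. Real log lines may differ in detail, so treat the patterns as a starting point.

```python
import re

# Pattern for the l2AppUfo event shown above (assumed from the sample line).
L2_PATTERN = re.compile(
    r'subcomp="l2AppUfo".*failed to find or create logical entry '
    r'LogicalSwitchId:\[id=(?P<ls_id>[^,\]]+), vni=(?P<vni>-?\d+)\]'
)

# Pattern for the failed T1 status call shown above (assumed from the sample line).
T1_PATTERN = re.compile(
    r'Operation="GET@/api/v1/infra/tier-1s/(?P<t1_id>[^/"]+)/status", '
    r'Operation status="failure"'
)


def summarize(lines):
    """Count l2AppUfo failures per logical switch ID and collect the
    Tier-1 IDs whose status call was logged as a failure."""
    switch_hits = {}
    t1_failures = set()
    for line in lines:
        match = L2_PATTERN.search(line)
        if match:
            ls_id = match.group("ls_id")
            switch_hits[ls_id] = switch_hits.get(ls_id, 0) + 1
            continue
        match = T1_PATTERN.search(line)
        if match:
            t1_failures.add(match.group("t1_id"))
    return switch_hits, t1_failures
```

To use it, pass the lines of a copy of the NSX Manager syslog.log file, for example `summarize(open("syslog.log", errors="replace"))`, where the file path is a placeholder. Repeated hits for the same switch ID followed by a Tier-1 ID in the second result match the pattern described in this article.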

Environment

VMware NSX-T Data Center
VMware NSX

Cause

The issue is caused by an incomplete or unsuccessful realization of Tier-1 gateways on one of the Edge nodes following the upgrade. During the upgrade process, certain Tier-1 gateways that had an active Edge node did not properly reinitialize their data plane components. As a result, these gateways failed to reach a realized state and entered a failure condition.

Resolution

Workaround:

  • Place the Active Edge node in Maintenance Mode, then exit Maintenance Mode.

This helps re-realize the T1, which brings the T1 status back to a Success state.
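
Where the workaround needs to be scripted rather than performed in the UI, the two steps can be expressed as NSX Manager API calls. The sketch below only builds the requests and does not send them. It assumes the transport-node maintenance-mode actions (`enter_maintenance_mode` and `exit_maintenance_mode` on `/api/v1/transport-nodes/<node-id>`) are available in your NSX version; confirm this against the NSX API guide before use. The function name and the example host and node ID are hypothetical.

```python
from urllib.parse import quote


def maintenance_mode_cycle(manager, node_id):
    """Return the (method, URL) pairs for entering and then exiting
    Maintenance Mode on one Edge transport node.

    Assumption: the endpoint and action names below match your NSX
    version; verify them in the NSX API guide.
    """
    base = f"https://{manager}/api/v1/transport-nodes/{quote(node_id, safe='')}"
    return [
        ("POST", f"{base}?action=enter_maintenance_mode"),
        ("POST", f"{base}?action=exit_maintenance_mode"),
    ]


# Example with placeholder values: print the calls instead of sending them.
for method, url in maintenance_mode_cycle("nsx-manager.example.com", "edge-node-id"):
    print(method, url)
```

Send the second request only after the Edge node reports that it is in Maintenance Mode, then recheck the T1 status.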