HCX - NE appliance In-Service upgrade workflow may create L2 Loop in the network

Article ID: 321582

Updated On:

Products

VMware HCX

Issue/Introduction

This article describes a known issue with the HCX Network Extension In-Service upgrade workflow.

Symptoms:
Customers may experience a Layer 2 loop on an extended segment during an HCX Network Extension (NE) appliance upgrade when the In-Service option is selected.
The following log entry can be seen on the HCX Manager:

2022-11-10 22:58:49.473 UTC [InterconnectService_SvcThread-48, IX:########-####-####-####-########09a9, J:b27948c0, , TxId: ########-####-####-####-########bd1a] WARN c.v.v.h.s.i.DeployAppliance- Unable to set the bridgeStateFlag to down for the appliance ########-####-####-####-########8277. Retrying the operation.


Location of App Engine log:

  • HCX Manager : /common/log/admin/app.log
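
To check whether an upgrade hit this condition, the App Engine log can be searched for the warning shown above. The following is a minimal sketch in Python, not a VMware-provided tool: the function names are hypothetical, the message text is copied from the sample log line in this article, and the pattern assumes the appliance ID contains no spaces or periods.

```python
import re
from typing import Iterable, List

# Message text copied from the sample log line in this article.
# Assumption: the appliance ID contains no whitespace or periods.
WARNING_PATTERN = re.compile(
    r"Unable to set the bridgeStateFlag to down for the appliance "
    r"(?P<appliance>[^\s.]+)"
)


def find_bridge_state_warnings(lines: Iterable[str]) -> List[str]:
    """Return the appliance ID from every line that contains the warning."""
    appliance_ids = []
    for line in lines:
        match = WARNING_PATTERN.search(line)
        if match:
            appliance_ids.append(match.group("appliance"))
    return appliance_ids


def scan_log(path: str = "/common/log/admin/app.log") -> List[str]:
    """Scan a log file (default: the App Engine log path given above)."""
    # errors="replace" keeps the scan going if the log has undecodable bytes.
    with open(path, errors="replace") as handle:
        return find_bridge_state_warnings(handle)
```

Calling `scan_log()` on the HCX Manager, or on a copy of `app.log`, returns one appliance ID per matching line. An empty list only means the warning text was not found in that file; rotated logs would need to be scanned separately.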



Cause

This is a timing issue in the HCX Network Extension appliance when the In-Service workflow is used for an upgrade.
Under certain infrastructure conditions, the NE appliance may take longer than expected to complete its boot-up process.
As a result, HCX Manager fails to create the "bridgeStateFlag" for the newly deployed NE appliance, so the new appliance comes up with its bridge data path in operation while the old NE appliance is still in service.
This causes a Layer 2 loop on the extended data path, which may last for a few seconds.

Resolution

This issue is fixed in the HCX 4.5.2 release.

Workaround:
The recommendation is to use the Standard Upgrade workflow as an alternative if the NE appliance needs to be upgraded.
This issue does not impact HCX NE appliances running in a High Availability (HA) pair.

IMPORTANT: The NE HA workflow does not depend on the In-Service mechanism. A failover is performed during the upgrade, so there is no regular downtime. The only downtime expected in the HA workflow is during the failover.

Additional Information

Impact/Risks:
  • All HCX versions prior to 4.5.2 are affected.
  • The Network Extension appliance may continue operating on its existing version without being upgraded.
  • The Network Extension service will remain active.
  • MON-enabled VMs will continue to operate as expected.
  • This behavior will NOT impact the upgrade workflow for HCX NE appliances running in an HA pair.
  • There will be NO impact on HCX migration services.