VM Network Ports Blocked after an HA failover of VMs on one side of a stretched cluster.

Article ID: 434615

Updated On:

Products

VMware NSX

VMware vSphere ESXi

Issue/Introduction

In a stretched cluster environment, following a complete power outage at one site (Site B), Virtual Machines (VMs) successfully failed over to the remaining site (Site A) via vSphere HA. However, several VMs experienced a total loss of network connectivity upon powering on.

Symptoms:

  • User VMs are powered on at the other site but inaccessible via the network. You see alarms similar to the following in the vCenter UI which confirm that they were restarted by HA:

    Alarm 'vSphere HA virtual machine failover failed' changed from Red to Green","02/10/2026, 9:26:26 AM","","<VM NAME>","vim.event.AlarmStatusChangedEvent"

    Virtual machine on <ESX HOST FQDN> has powered on","02/10/2026, 9:26:18 AM","","<VM NAME>","vim.event.VmPoweredOnEvent"

  • Around the same time, HA failover alarms are reported for two or more of the NSX Manager VMs, which are now powered on at the other site. The host's /var/run/log/vmkernel.log shows that the NSX management and control plane connections were temporarily down on the hosts while the Managers completed their system boot:

    Wa(180) nsx-proxy[XXXXXXX]: NSX XXXXXXX - [nsx@6876 comp="nsx-esx" subcomp="nsx-proxy" s2comp="nsx-monitoring" entId="<UUID>" tid="XXXXXXX" level="WARNING" eventState="On" eventFeatureName="communication" eventSev="warning" eventType="control_channel_to_manager_node_down"] The Transport node <TN UUID> control plane connection to Manager node <MANAGER IP> is down for at least 3 minutes from the Transport node's point of view.
  • On the affected host, /var/run/log/vmkernel.log reports a failure while trying to unblock the port for one of the affected user VMs. Eventually, the port is reported as blocked:

    In(182) vmkernel: cpu8:45690423)Calling get restore for portID <SWITCH PORT ID>
    In(182) vmkernel: cpu30:45690643)vswitch: VSwitchPortEthFRPUpdateInt:6314: [nsx@6876 comp="nsx-esx" subcomp="vswitch"]Unblock Port <SWITCH PORT ID>
    In(182) vmkernel: cpu30:45690643)swsec: SwSec_CreateFilter:333: [nsx@6876 comp="nsx-esx" subcomp="swsec-xxxxxxxx"]Failed to create DpCtrPage on port <SWITCH PORT ID>
    .
    .
    In(182) vmkernel: cpu57:2098577)kcp: KCP_DeletePort:958: [nsx@6876 comp="nsx-esx" subcomp="kcp"]Port <SWITCH PORT ID> is cleared and blocked
    In(182) vmkernel: cpu57:2098577)NetPort: 3130: blocking traffic on DV port <DV PORT ID>
  • In /var/run/log/nsx-syslog.log, just before the time of the blocked message, there is a message indicating that the host received a port deletion message from the CCP (Central Control Plane) for the DV port ID of the user VM:

    In(182) cfgAgent[XXXXXXX]: NSX XXXXXXX - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="XXXXXXX" level="info"] Decoder: Received LOG_SWITCH_PORT_CONFIG msg (Operation CLEAR): <DV PORT ID> 
  • On one of the NSX Managers in the /var/log/proton/nsxapi.log there will be a message indicating that there is a delayed deletion task scheduled for that DV PORT ID:

    INFO L2TaskExecutor3 LogicalPortServiceImpl 5264 SWITCHING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Schedule a delayed deletion task for logical port LogicalPort [id=<DV PORT ID>, intentPath=/infra/segments/<SEGMENT NAME>/ports/default:<DV PORT ID>, logicalPortState=UP, ephemeral=true, logicalSwitchId=LogicalSwitch/<LS UUID>, transportZoneId=TransportZone/<TZ UUID>, transportZoneType=VLAN, attachmentId=<ATTACHMENT UUID>, attachmentType=vif, internalPortAttachment=null, switchingProfileIds=[], switchMode=STANDARD, extraConfigs=null, systemExtraConfigs=null, internalId=, initState=null, tags=null, pendingConfigFromHostd=true, addressBindings=null, ignoreAddressBindings=null, isIndependentVifPort=false, context=null, isEsxVmk=false, isDvport=false, skipDefaultProfiles=[]], expected starting time = 1770712015754.

    INFO L2DelayedLogicalPortDeletionScheduler1 LogicalPortServiceImpl 5264 SWITCHING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Deleting delayed logical port InternalLogicalPort/<DV PORT ID> since it's ephemeral and no more attacher

  • Around the same time, /var/log/proton/nsxapi.log on one of the NSX Managers shows a conflicting message: a new attach/create port request for the affected user VM from its new host ID. This request fails because the port is marked for deletion:

    INFO L2TaskExecutor3 SegmentPortServiceImpl 5261 POLICY [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] CreateOrReplace SegmentPort: SegmentPort{portAttachment=PortAttachment{vifUUID='<VIF UUID>', parentVifUuid='null', portType=null, contextId='null', appId='null', allocateAddressType=null, evpnVlans=null, evpnParent=false, ccpVlanId=0, contextType=null, hyperbusMode=DISABLE, attachedInterfaceEntry=null}, internalPortAttachment=null, addressBindingEntryList=null, adminState=UP, hasAdminState=null, extraConfigs=null, hasExtraConfigs=null, systemExtraConfigs=null, ignoreAddressBindings=null, hasIgnoreAddressBindings=null, initState=null, hasInitState=null, previousTags=null, autoDiscovered=true, internalUpdate=false, ephemeral=true, addedTags=null, vtepGroupId=null, internalKey=<DV PORT ID>, policyPath= /infra/segments/<SEGMENT NAME>/ports/default:<DV PORT ID>, originId= null, isEsxVmk= false, isVifMovingToNewLs= false, logicalSwitchId= <LS SWITCH ID>, resourceOrigin= POLICY, transportZoneId= null} [policyPath=/infra/segments/<SEGMENT NAME>/ports/default:<DV PORT ID>, markedForDelete=false]
    .
    .
    WARN L2TaskExecutor3 PolicyServiceImpl 5261 POLICY [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="manager"] Invalid operation. Entity /infra/segments/<SEGMENT NAME>/ports/default:<DV PORT ID> marked for delete
    WARN L2TaskExecutor3 PolicyUfoUtils 5261 SWITCHING [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="manager"] Exception while calling createLogicalPort com.vmware.nsx.management.common.exceptions.InvalidArgumentException: An object with the same path=[/infra/segments/<SEGMENT NAME>/ports/default:<DV PORT ID>] is marked for deletion. Either use another path or wait for the purge cycle (max 5 minutes) for permanent removal of the object.
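
The log signatures listed above can be checked mechanically. The following is a minimal Python sketch, not a VMware-provided tool. It scans log lines for the message fragments quoted in this article and converts the `expected starting time` value to UTC so that it can be compared with vCenter event times. The function names and signature labels are hypothetical, and the sketch assumes that the `expected starting time` value is milliseconds since the Unix epoch.

```python
import re
from datetime import datetime, timedelta, timezone

# Message fragments quoted in this article, keyed by a hypothetical label.
SIGNATURES = {
    "dpctr_page_failure": "Failed to create DpCtrPage on port",
    "port_blocked": "is cleared and blocked",
    "ccp_port_clear": "Received LOG_SWITCH_PORT_CONFIG msg (Operation CLEAR)",
    "delayed_deletion": "Schedule a delayed deletion task for logical port",
    "marked_for_deletion": "is marked for deletion",
}

START_TIME_RE = re.compile(r"expected starting time = (\d+)")
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def scan_lines(lines):
    """Return {label: [matching lines]} for every signature that appears."""
    hits = {}
    for line in lines:
        for label, fragment in SIGNATURES.items():
            if fragment in line:
                hits.setdefault(label, []).append(line.rstrip("\n"))
    return hits


def deletion_start_utc(line):
    """Convert 'expected starting time = <ms>' to a UTC datetime, or None.

    Assumption: the value is milliseconds since the Unix epoch.
    """
    match = START_TIME_RE.search(line)
    if match is None:
        return None
    return EPOCH + timedelta(milliseconds=int(match.group(1)))
```

Under that assumption, the value 1770712015754 from the example log line corresponds to 2026-02-10 08:26:55.754 UTC. To scan a copied log file, pass an open file object to `scan_lines`, for example `scan_lines(open("vmkernel.log", errors="replace"))`.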

Environment

VMware NSX

VMware vSphere ESX

Cause

  • This issue occurs due to a race condition in the NSX Management Plane during the VM HA failover process.
  • When a site failure triggers an HA restart of both workload VMs and NSX Manager VMs at the surviving site, port attach calls fail to reach the NSX Manager while it is still initializing. These ports are marked as pending_attach and are scheduled for resync after a 60-minute interval.
  • If the original site recovers before this resync occurs, the recovered hosts send detach messages for those VMs to the NSX Manager. Because the manager does not yet see the active attachers (which are still in pending_attach state), it marks the logical ports for deletion. Subsequent attempts by the new host to synchronize and attach these ports are rejected by the NSX Manager because the object is already marked for deletion.

    WARN Exception: An object with the same path is marked for deletion
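
The ordering problem described above can be illustrated with a toy model. The following Python sketch is purely illustrative and is not NSX code: the class, its methods, and the host and port names are assumptions chosen to show why the order of events matters. It models a manager that only counts attachers it has actually registered, so an attach that is still pending on the new host is invisible when the detach from the recovered host arrives.

```python
class ToyManager:
    """Hypothetical stand-in for the manager's view of logical ports."""

    def __init__(self):
        self.attachers = {}             # port id -> set of host ids the manager knows
        self.marked_for_delete = set()  # port ids scheduled for deletion

    def attach(self, port_id, host):
        if port_id in self.marked_for_delete:
            raise ValueError(f"{port_id} is marked for deletion")
        self.attachers.setdefault(port_id, set()).add(host)

    def detach(self, port_id, host):
        remaining = self.attachers.get(port_id, set())
        remaining.discard(host)
        if not remaining:
            # No known attacher is left, so the port is scheduled for deletion.
            self.marked_for_delete.add(port_id)


def simulate(resync_before_detach):
    """Replay the failover; the flag decides which event reaches the manager first."""
    mgr = ToyManager()
    mgr.attach("port-1", "site-b-host")  # state before the outage

    # HA restarts the VM on site-a-host while the manager is still booting:
    # the attach call is not delivered and stays pending on the host side.
    pending = [("port-1", "site-a-host")]

    if resync_before_detach:
        for port_id, host in pending:
            mgr.attach(port_id, host)
        pending = []

    # The original site recovers and its host reports the VM as detached.
    mgr.detach("port-1", "site-b-host")

    try:
        for port_id, host in pending:
            mgr.attach(port_id, host)
    except ValueError as exc:
        return f"rejected: {exc}"
    return "attached"
```

In this model, `simulate(False)` returns a rejection because the detach is processed while the new attach is still pending, whereas `simulate(True)` returns `"attached"` because the manager already knows about the new attacher when the detach arrives.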

Resolution

If you encounter this issue following a site disaster recovery event, use one of the following workarounds to force a port refresh:

  1. Perform a manual vMotion: Migrate the affected VM to another host in the cluster. This triggers a new port creation request that bypasses the previous state conflict.

  2. Toggle the vNIC: In the vCenter UI, edit the VM settings to disconnect the vNIC, save the settings, and then reconnect the vNIC.

Note: There is currently no preventative configuration for this behavior in existing versions. A feature improvement is under development for a future 9.X release to allow Transport Nodes to operate independently of the Management Plane during these transitions, preventing such race conditions.

Additional Information

Subscribe to this knowledge article to get updates on this issue.