One uplink in a LAG is removed unexpectedly

Article ID: 368463

Products

VMware vSphere ESXi

Issue/Introduction

An LACP LAG configured on a VDS with multiple uplinks intermittently loses access to one or more of its uplinks.
The vmkernel.log shows messages such as the following:

/var/run/log/vmkernel.log

YYYY-MM-DDTHH:MM:SS.SS cpu29:2106412 opID=51c538cb)Team.vswitch: TeamVSLACPLAGEventCB:9083: [nsx@6876 comp="nsx-esx" subcomp="vswitch"]Received event UPLINK REMOVE, LAG LAG_OFF_DMZ/-406565621, link UNKNOWN, uplink vmnic12/0x8a000030, link UNKNOWN
YYYY-MM-DDTHH:MM:SS.SS cpu29:2106412 opID=51c538cb)KCPFreeUplink:1186:[nsx@6876 comp="nsx-esx" subcomp="kcp"]Free the uplink: portID[##########/###], client[vmnic12], dvsUUID[<UUID>]
YYYY-MM-DDTHH:MM:SS.SS cpu29:2106412 opID=51c538cb)KCPPortStateUpdateForUplink:1663:[nsx@6876 comp="nsx-esx" subcomp="kcp"]Adding uplink: port [##########/###] client vmnic12
YYYY-MM-DDTHH:MM:SS.SS cpu56:3685864)netschedHClk: NetSchedHClkWatchdogSysWorld:6522: vmnic12: link up event received, device running at 10000 Mbps so setting queue depth to 86460 bytes with expected 1310 bytes/us
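
To see how often each uplink is affected, the UPLINK REMOVE messages can be tallied per LAG and vmnic. The following is a minimal Python sketch, not a supported tool. The function name count_uplink_removals and the regular expression are assumptions modeled only on the sample line above, so the pattern may need adjusting for other builds.

```python
import re
from collections import Counter

# Assumption: pattern derived from the single sample line in this article;
# the message layout may differ between ESXi builds.
UPLINK_REMOVE = re.compile(
    r"Received event UPLINK REMOVE, LAG (?P<lag>[^/,\s]+)/\S*, "
    r".*?uplink (?P<vmnic>vmnic\d+)"
)


def count_uplink_removals(lines):
    """Count UPLINK REMOVE events per (LAG name, vmnic) pair.

    `lines` is any iterable of log lines, for example an open file handle.
    """
    counts = Counter()
    for line in lines:
        match = UPLINK_REMOVE.search(line)
        if match:
            counts[(match["lag"], match["vmnic"])] += 1
    return counts
```

The function accepts any iterable of strings, so it can be fed an open handle on a copy of vmkernel.log taken from a support bundle. A vmnic with a much higher count than its peers is a reasonable first candidate for the switch-side and layer 1 checks described under Resolution.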

lacp.log will also show the uplink being removed:

/var/run/log/lacp.log

YYYY-MM-DDTHH:MM:SS.SS No(29) lacp[2098552]: 2002, Detach uplink vmnic12 from aggregator 6, numPorts 1, agg ##:##:##:##:##:##
YYYY-MM-DDTHH:MM:SS.SS No(29) lacp[2098552]: 2002, Detach uplink vmnic12 from aggregator 6, numPorts 1, agg ##:##:##:##:##:##
YYYY-MM-DDTHH:MM:SS.SS No(29) lacp[2098552]: 2002, Detach uplink vmnic12 from aggregator 6, numPorts 1, agg ##:##:##:##:##:##
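
Repeated detach messages for the same vmnic, as in the excerpt above, can be collected with a short script. This is a hedged sketch that assumes the line layout shown in the sample. The names detach_events and repeatedly_detached, and the default threshold of 3, are hypothetical choices for illustration and are not values documented by VMware.

```python
import re
from collections import defaultdict

# Assumption: pattern derived from the sample lacp.log lines in this article.
DETACH = re.compile(
    r"Detach uplink (?P<vmnic>vmnic\d+) from aggregator (?P<agg>\d+)"
)


def detach_events(lines):
    """Return {vmnic: [timestamp, ...]} for every detach message found.

    The timestamp is kept as the raw first token of the line (unparsed).
    """
    events = defaultdict(list)
    for line in lines:
        match = DETACH.search(line)
        if match:
            timestamp = line.split(maxsplit=1)[0]
            events[match["vmnic"]].append(timestamp)
    return dict(events)


def repeatedly_detached(events, threshold=3):
    """List vmnics detached at least `threshold` times (hypothetical cutoff)."""
    return sorted(
        vmnic for vmnic, stamps in events.items() if len(stamps) >= threshold
    )
```

The collected timestamps can then be compared with the logs on the upstream switch for the same port to see whether each detach lines up with a port-channel or link event on that side.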

vCenter shows repeated events:

LACP info: uplink vmnic# on VDS DvsPortset-# got connected

Environment

VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x

Cause

The VDS receives an "UPLINK REMOVE" event, triggered by the upstream network switch, for one of the following reasons:

The vmnic fails to negotiate during LACP negotiation between ESXi and the switch.
The switch perceives an issue with the vmnic, or detects anomalies during LACP negotiation, and removes the vmnic from the LACP port-channel.
A layer 1 issue (bad cable, faulty port, etc.) is causing port connectivity problems.

Resolution

Review the logs on the upstream physical device where the interfaces are configured for bonding, and ensure that its configuration matches that of ESXi.
Identify and correct any issues logged for the ports associated with the LACP group.
Test and repair any layer 1 infrastructure issues.

Additional Information

For more information, refer to: Host requirements for link aggregation (etherchannel, port channel, or LACP) in ESXi
