A customer configured LACP on a VDS with two uplinks, but at some point saw only one uplink in the LAG. The vmkernel log shows the following entries:
2024-05-14T02:23:37.867Z cpu29:2106412 opID=51c538cb)Team.vswitch: TeamVSLACPLAGEventCB:9083: [nsx@6876 comp="nsx-esx" subcomp="vswitch"]Received event UPLINK REMOVE, LAG LAG_OFF_DMZ/-406565621, link UNKNOWN, uplink vmnic12/0x8a000030, link UNKNOWN
2024-05-14T02:23:37.867Z cpu29:2106412 opID=51c538cb)KCPFreeUplink:1186:[nsx@6876 comp="nsx-esx" subcomp="kcp"]Free the uplink: portID[2315255856/86], client[vmnic12], dvsUUID[50 23 1a 67 22 54 cb a9-a5 29 b8 39 d2 d5 9e b4]
2024-05-14T02:23:38.032Z cpu29:2106412 opID=51c538cb)KCPPortStateUpdateForUplink:1663:[nsx@6876 comp="nsx-esx" subcomp="kcp"]Adding uplink: port [2281701511/138] client vmnic12
2024-05-14T02:23:38.037Z cpu56:3685864)netschedHClk: NetSchedHClkWatchdogSysWorld:6522: vmnic12: link up event received, device running at 10000 Mbps so setting queue depth to 86460 bytes with expected 1310 bytes/us
The VDS received an "UPLINK REMOVE" event for vmnic12 in LAG LAG_OFF_DMZ from the network switch, so the source of the problem is the network switch. There are two possibilities:
1. When ESXi and the switch negotiate LACP, the negotiation fails for the vmnic.
2. The switch perceives an issue with the vmnic, or detects anomalies during the LACP negotiation, and therefore removes the vmnic from the LACP port-channel.
Check the status and logs of the corresponding port on the network switch. If the port-channel membership changes, the VDS adjusts accordingly.
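To find out when and how often an uplink was dropped from the LAG, the vmkernel log can be filtered for the TeamVSLACPLAGEventCB entries. The following is a minimal Python sketch, assuming the log lines have the same layout as the excerpt above; the function name find_lag_events and the regular expression are illustrative and not part of any VMware tool.

```python
import re

# Assumption: the line layout matches the TeamVSLACPLAGEventCB entry quoted
# in this note. Other ESXi/NSX builds may format the message differently.
LAG_EVENT = re.compile(
    r"^(?P<ts>\S+Z) .*TeamVSLACPLAGEventCB:\d+: .*?"
    r"Received event (?P<event>[A-Z ]+?), LAG (?P<lag>[^/,]+)/\S+, "
    r"link \S+, uplink (?P<uplink>[^/,]+)/"
)


def find_lag_events(lines):
    """Return (timestamp, event, lag, uplink) for every LAG event line.

    'lines' is any iterable of strings, for example an open log file.
    Lines that do not match the assumed layout are skipped.
    """
    events = []
    for line in lines:
        match = LAG_EVENT.search(line)
        if match:
            events.append(
                (
                    match.group("ts"),
                    match.group("event"),
                    match.group("lag"),
                    match.group("uplink"),
                )
            )
    return events
```

For example, calling find_lag_events on an open handle to a copy of the vmkernel log (typically /var/log/vmkernel.log on the host) returns one tuple per LAG event. The timestamps can then be compared with the port and LACP logs on the network switch.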