vCenter alarms do not trigger when LACP LAG member connections fail

Article ID: 410285

Products

VMware vCenter Server

Issue/Introduction

When you configure Link Aggregation Control Protocol (LACP) on your vSphere Distributed Switch, vCenter Server's standard uplink redundancy alarms do not alert you when individual LAG member connections fail. You only receive notifications when the entire LAG fails completely. This monitoring gap prevents you from identifying degraded network performance before total connectivity loss occurs.

Environment

vCenter Server 7.0 and newer managing ESXi hosts that use LACP Link Aggregation Groups (LAGs) on vSphere Distributed Switches.

Cause

vCenter Server's alarm framework treats Link Aggregation Groups as single logical uplinks. The standard uplink redundancy alarms evaluate only complete LAG failure, not partial member disconnections. The system design assumes redundancy remains intact when any LAG member stays active.
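
To make the gap concrete, the following shell sketch classifies a LAG from the link states of its members. It is an illustration only, not vCenter's actual alarm logic; the function name lag_health and the labels healthy, degraded, and failed are hypothetical. Under the behavior described above, the standard uplink redundancy alarms react only to the failed case.

```shell
#!/bin/sh
# Hypothetical illustration: classify a LAG from its members' link states.
# Each argument is the link state of one member, either "Up" or "Down".
lag_health() {
    up=0
    down=0
    for state in "$@"; do
        case "$state" in
            Up)   up=$((up + 1)) ;;
            Down) down=$((down + 1)) ;;
        esac
    done

    if [ "$up" -eq 0 ]; then
        echo "failed"      # no active member: the whole LAG is down
    elif [ "$down" -gt 0 ]; then
        echo "degraded"    # some members down, LAG still passes traffic
    else
        echo "healthy"     # every member is up
    fi
}
```

For example, lag_health Up Down prints degraded, which is the state the custom alarm in the Resolution section is meant to surface.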

Resolution

  1. Create a custom alarm to monitor LACP LAG member failures:

    1. Log in to the vSphere Client and navigate to your vCenter Server inventory.

    2. Select the Host, Cluster, or vCenter Server object where you want to configure the alarm.

    3. Click the Configure tab and select Alarm Definitions from the left panel.

    4. Click the ADD button at the top of the Alarm Definitions panel.

    5. Configure the basic alarm settings:

      • Name: Enter LACP LAG Member Down Alert
      • Description: Enter Monitors when a LAG member connection drops
      • Target type: Select Host
      • Enable this alarm: Ensure the checkbox is selected
    6. Configure the alarm trigger in the IF section:

      1. Click to add a trigger condition.
      2. Search for and select uplink transition down. This trigger appears in the "Others" category.
    7. Set the alarm severity in the THEN section:

      1. Click the select severity dropdown.
      2. Choose Show as Warning for initial monitoring, or Show as Critical if a member failure needs immediate attention.
    8. Configure alarm actions (optional):

      1. Toggle Send email notifications to ON if SMTP is configured
      2. Toggle Send SNMP traps to ON for network monitoring integration
      3. Toggle Run script to ON for automated remediation
    9. Click FINISH to create the alarm.

    10. Test the alarm configuration:

      1. Connect to an ESXi host over SSH as root.
      2. To simulate an outage of one of the LAG vmnics, run the following command, replacing # with the number of a LAG member vmnic:

        esxcli network nic down -n vmnic#

      3. Verify that the alarm triggers in vCenter.
      4. To restore the LAG vmnic, run the following command, using the same vmnic number:

        esxcli network nic up -n vmnic#
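
The test steps above can be wrapped in a small shell helper so that the restore command follows the outage automatically. This is a minimal sketch, assuming it runs in the ESXi shell. The function name bounce_lag_member, the ESXCLI override variable, and the default 60-second hold time are hypothetical; choose a hold long enough to confirm the alarm in vCenter.

```shell
#!/bin/sh
# Hypothetical helper: take one LAG member down, wait, then bring it back up.
# ESXCLI can be overridden (for example ESXCLI=echo) to dry-run the commands.
ESXCLI="${ESXCLI:-esxcli}"

bounce_lag_member() {
    nic="$1"
    hold="${2:-60}"   # seconds to keep the link down (assumed default)

    # Accept only names that start with "vmnic" followed by a digit.
    case "$nic" in
        vmnic[0-9]*) ;;
        *) echo "usage: bounce_lag_member vmnicN [seconds]" >&2; return 2 ;;
    esac

    # Simulate the member outage; stop here if the command fails.
    "$ESXCLI" network nic down -n "$nic" || return 1
    sleep "$hold"
    # Restore the link.
    "$ESXCLI" network nic up -n "$nic"
}
```

For example, bounce_lag_member vmnic2 120 holds vmnic2 down for two minutes. Setting ESXCLI=echo first prints the two commands instead of running them.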

Additional Information

Additional Monitoring Options:

  • Create a complementary alarm using the uplink transition up trigger to monitor LAG recovery
  • Add the uplink speed is different trigger to detect degraded connections
  • Configure the lag transition down trigger to alert on complete LAG failures

CLI Commands for LAG Status Verification:

# Check LACP configuration and status
esxcli network vswitch dvs vmware lacp status get

# View LAG configuration details
esxcli network vswitch dvs vmware lacp config get

# Check individual vmnic status
esxcli network nic list
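
The output of esxcli network nic list can be filtered to show only the LAG members whose link is down. The following is a minimal sketch, assuming Link Status is the fifth whitespace-separated field of each vmnic row, as in typical output of that command. The function name down_lag_members is hypothetical, and the member names passed to it are whichever vmnics belong to your LAG.

```shell
#!/bin/sh
# Hypothetical helper: print the LAG members whose link status is Down.
# stdin: output of `esxcli network nic list`
# args:  names of the vmnics that belong to the LAG
down_lag_members() {
    awk -v members="$*" '
        BEGIN {
            # Build a lookup table of the member names passed as arguments.
            n = split(members, m, " ")
            for (i = 1; i <= n; i++) want[m[i]] = 1
        }
        # Assumption: field 1 is the NIC name, field 5 is Link Status.
        ($1 in want) && $5 == "Down" { print $1 }
    '
}
```

A typical call would be esxcli network nic list | down_lag_members vmnic2 vmnic3, which prints nothing when every listed member has link.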