Edge NIC link status down alarm

Article ID: 330486

Updated On: 12-20-2024

Products

VMware NSX

Issue/Introduction

Title: Alarm for Edge NIC link status down
Event ID: edge_health.edge_nic_link_status_down
Alarm Description

  • Purpose: Indicates that an Edge NIC link is down.
  • Impact: Traffic on the affected interface is dropped.

Environment

VMware NSX-T Data Center
 
Edge Form factors:
  • Bare Metal Edge
  • VM Edge

Cause

The alarm is raised when the NIC link is physically down.

Resolution

Steps to Resolve
For 3.0.0 and higher

On the Edge node, confirm whether the NIC link is physically down by running the NSX CLI command 'get physical-port <port-name>'.

Sample output for get physical-port on fp-eth0 interface:

Recommended Action:

Depending on the form factor:

  • On a Bare Metal Edge, if the NIC is down, verify the physical cable connection.
  • On a VM Edge, use the vCenter UI to make sure that all network adapters of the Edge VM are connected to a virtual switch.
  • On both Bare Metal Edge and VM Edge, if the alarm is raised on an unused interface, the following steps can be used to remove the interface from Dataplane. This procedure requires an Edge reboot.
  • Note: If you do not want to remove the unused interface, skip the following steps. The alarm on an unused interface can be ignored.
  1. Manually resolve the NIC link status down alarm in the Manager UI.
  2. Run the following commands to get the PCI IDs of the interfaces: 
    1. get dataplane device list

      Note: This command returns the PCI IDs in the correct format, but it does not show which interface each PCI ID belongs to.

      Sample Output:

      0000:04:00.0 - VMXNET3 Ethernet Controller  | Vendor: VMware
      0000:0b:00.0 - VMXNET3 Ethernet Controller  | Vendor: VMware
      0000:13:00.0 - VMXNET3 Ethernet Controller  | Vendor: VMware
      0000:1b:00.0 - VMXNET3 Ethernet Controller  | Vendor: VMware

    2. get interface <interface name>

      Note:
      This command returns the PCI ID in the wrong format (0000:04:00:00 rather than 0000:04:00.0), but it shows which interface the PCI ID belongs to.

      Sample Output:

      get interface fp-eth3
      Mon Nov 11 2024 UTC 12:16:39.370
      Interface: fp-eth3
      ID: 3
      Link status: up
      MAC address: 00:50:56:##:##:##
      MTU: 1500
      PCI: 0000:04:00:00

  3. Set the device list. Include only the PCI IDs of the interfaces that Dataplane should use.

    The set command expects the PCI ID in a specific format.
    Use the PCI ID format returned by 'get dataplane device list' in Step 2: set dataplane device list <list of PCI ID of all used interfaces>.

    For example, if 'get dataplane device list' in Step 2 returned the following PCI IDs:

    0000:04:00.0, 0000:0b:00.0, 0000:13:00.0, 0000:1b:00.0

    The command that needs to be run to add the interfaces to the device list is: 

    set dataplane device list 0000:04:00.0,0000:0b:00.0,0000:13:00.0,0000:1b:00.0

  4. Reboot the edge node.

    Note: The removed interface can be added back to Dataplane by adding its PCI ID to the device list with the command in Step 3.

  5. Changing the device list changes the mapping between fast path interface names (fp-ethX) and MAC addresses. After the Edge reboots, update the uplink interface in the Edge Transport Node's Uplink Profile. To find the new fast path interface name, match the MAC address to the interface name in the output of the 'get interfaces' CLI command.

    Note: The removed interface can be added back to Dataplane by adding its PCI ID to the device list with the command in Step 3. The uplink in the Transport Node uplink profile must then be updated again.
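
The MAC address matching in Step 5 can be done by eye, but on an Edge with many interfaces a small offline helper may reduce mistakes. The following Python sketch is an illustration only and is not part of NSX. It assumes that the CLI output saved before and after the reboot contains 'Interface:' and 'MAC address:' lines in the same layout as the 'get interface' sample in Step 2. The function names and the MAC addresses in the usage example are hypothetical.

```python
def parse_interfaces(cli_output):
    """Return {mac_address: interface_name} from saved CLI text.

    Assumes each interface block has an 'Interface:' line followed by a
    'MAC address:' line, as in the 'get interface' sample output.
    """
    mac_to_name = {}
    current = None
    for raw in cli_output.splitlines():
        line = raw.strip()
        if line.startswith("Interface:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("MAC address:") and current is not None:
            mac = line.split(":", 1)[1].strip().lower()
            mac_to_name[mac] = current
    return mac_to_name


def renamed_interfaces(before, after):
    """Return {mac: (old_name, new_name)} for interfaces whose name changed."""
    old = parse_interfaces(before)
    new = parse_interfaces(after)
    return {
        mac: (old[mac], new[mac])
        for mac in old
        if mac in new and old[mac] != new[mac]
    }


if __name__ == "__main__":
    # Hypothetical captures taken before and after removing one interface.
    before = """
    Interface: fp-eth0
    MAC address: 00:50:56:aa:bb:01
    Interface: fp-eth1
    MAC address: 00:50:56:aa:bb:02
    Interface: fp-eth2
    MAC address: 00:50:56:aa:bb:03
    """
    after = """
    Interface: fp-eth0
    MAC address: 00:50:56:aa:bb:01
    Interface: fp-eth1
    MAC address: 00:50:56:aa:bb:03
    """
    for mac, (old_name, new_name) in renamed_interfaces(before, after).items():
        print(f"{mac}: {old_name} -> {new_name}")
```

Any interface reported by this helper is one whose uplink entry in the Uplink Profile likely needs to be updated to the new name.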

Maintenance window required for remediation? Yes

Additional Information

Set the device list. Include only the PCI IDs of the interfaces that Dataplane should use. The set command expects the PCI ID in a specific format. Use the PCI ID format returned by 'get dataplane device list' in Step 2: set dataplane device list <list of PCI ID of all used interfaces>
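
Because 'get interface' and 'get dataplane device list' print PCI IDs in different formats, assembling the set command by hand is error-prone. The following Python sketch shows one way to normalize the IDs and build the command string offline. It is an illustration under stated assumptions, not an NSX tool: it assumes that the last colon-separated field in the 'get interface' value is the PCI function number, and the function names are hypothetical. It only produces text to paste into the NSX CLI.

```python
import re

# Format printed by 'get dataplane device list': domain:bus:device.function
_CANONICAL = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")
# Colon-only format seen in the 'get interface' sample output
_COLON_ONLY = re.compile(
    r"^([0-9a-f]{4}):([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{1,2})$"
)


def normalize_pci_id(pci_id):
    """Return the PCI ID as domain:bus:device.function.

    Assumption: in the colon-only form the last field is the function number.
    """
    value = pci_id.strip().lower()
    if _CANONICAL.match(value):
        return value
    match = _COLON_ONLY.match(value)
    if match:
        domain, bus, device, function = match.groups()
        candidate = f"{domain}:{bus}:{device}.{int(function, 16):x}"
        if _CANONICAL.match(candidate):
            return candidate
    raise ValueError(f"unrecognized PCI ID: {pci_id!r}")


def build_set_command(device_list, unused):
    """Build the set command with every device except the unused ones."""
    drop = {normalize_pci_id(p) for p in unused}
    keep = [p for p in map(normalize_pci_id, device_list) if p not in drop]
    if not keep:
        raise ValueError("device list would be empty")
    return "set dataplane device list " + ",".join(keep)


if __name__ == "__main__":
    # PCI IDs from the 'get dataplane device list' sample output
    devices = ["0000:04:00.0", "0000:0b:00.0", "0000:13:00.0", "0000:1b:00.0"]
    # Hypothetical case: fp-eth3 (PCI 0000:04:00:00 in 'get interface') is unused
    print(build_set_command(devices, ["0000:04:00:00"]))
```

In this hypothetical case the script prints a command that lists the three remaining PCI IDs. Check the result against the 'get dataplane device list' output before running it on the Edge.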