The alarm 'Edge node NIC eth0 link is down' opens and resolves after 3 seconds.

Article ID: 380983

Products

VMware NSX

Issue/Introduction

- The alarm 'Edge node NIC eth0 link is down' opens.

- The alarm resolves 3 seconds after it opens.

- However, vSphere does not show the NSX Edge NIC as down.

- The following entries are written to syslog.

/var/log/syslog
yyyy-mm-ddThh:mm:ss.sssZ <edge-name> NSX 1162 - [nsx@6876 comp="nsx-edge" subcomp="nsx-sha" username="nsx-sha" level="CRITICAL" eventFeatureName="edge_health" eventType="edge_nic_link_status_down" eventSev="critical" eventState="On"] Edge node NIC eth0 link is down.
yyyy-mm-ddThh:mm:ss.sssZ <edge-name> NSX 1162 - [nsx@6876 comp="nsx-edge" subcomp="nsx-sha" username="nsx-sha" level="CRITICAL" eventFeatureName="edge_health" eventType="edge_nic_link_status_down" eventSev="critical" eventState="Off"] Edge node NIC eth0 link is up.
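
To check how long each alarm stayed open, the On and Off entries can be paired by timestamp. The following is a minimal Python sketch, not an NSX tool: the function name flap_durations and the regular expression are assumptions based only on the log format shown above, and it assumes the entries belong to a single NIC and appear in chronological order.

```python
import re
from datetime import datetime

# Pattern is an assumption derived from the two sample entries in this article:
# the timestamp is the first token and eventState is "On" (open) or "Off" (resolved).
EVENT_RE = re.compile(
    r'^(\S+) .*eventType="edge_nic_link_status_down".*eventState="(On|Off)"'
)


def flap_durations(lines):
    """Return the seconds between each alarm-open entry and the next resolve entry."""
    durations = []
    opened_at = None
    for line in lines:
        match = EVENT_RE.match(line)
        if match is None:
            continue  # not a link status event
        # Replace the trailing "Z" so older Python versions can parse the timestamp.
        timestamp = datetime.fromisoformat(match.group(1).replace("Z", "+00:00"))
        if match.group(2) == "On":
            opened_at = timestamp
        elif opened_at is not None:
            durations.append((timestamp - opened_at).total_seconds())
            opened_at = None
    return durations
```

Durations of only a few seconds, with no matching link event on vSphere, are consistent with the behavior described in this article.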

Environment

VMware NSX-T Data Center (3.x)

VMware NSX (4.x)

Cause

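/var/log/syslog shows that the command "sudo cat /sys/class/net/eth0/operstate" failed just before the alarm opened. In the example below, the command returned 'up' but took about 10.7 seconds to run, which exceeded its 4-second timeout, so a TimeoutExpired exception was raised.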

yyyy-mm-ddThh:mm:ss.sssZ <edge-name> NSX 1225 - [nsx@6876 comp="nsx-edge" subcomp="nsx-sha" username="nsx-sha" level="INFO" s2comp="fork-executor-1"] Exception caught when running cmd '{'cmd': ['sudo', 'cat', '/sys/class/net/eth0/operstate'], 'input': None, 'shell': False, 'timeout': 4, 'check_return': True, 'env': None, 'type': 0, 'proc_tree': True, 'timestamp': 392962.240507027, 'seq': 106666}': {'seq': 106666, 'type': 0, 'executor': 1, 'timestamp': 392962.24113893, 'execute_time': 10.7467318289564, 'stdout': b'up\n', 'stderr': b'', 'exception': TimeoutExpired(['sudo', 'cat', '/sys/class/net/eth0/operstate'], 4)}
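
To confirm that an entry like the one above is a timeout rather than a genuine 'down' result, the timeout and execute_time fields can be compared. The following is a minimal Python sketch, not an NSX tool: the function names and the regular expressions are assumptions based only on the log format shown in this article.

```python
import re

# Field patterns are assumptions derived from the sample log entry in this article.
TIMEOUT_RE = re.compile(r"'timeout': (\d+(?:\.\d+)?)")
EXECUTE_TIME_RE = re.compile(r"'execute_time': (\d+(?:\.\d+)?)")


def parse_cmd_timing(line):
    """Return (timeout, execute_time) in seconds, or None if either field is missing."""
    timeout = TIMEOUT_RE.search(line)
    execute_time = EXECUTE_TIME_RE.search(line)
    if timeout is None or execute_time is None:
        return None
    return float(timeout.group(1)), float(execute_time.group(1))


def exceeded_timeout(line):
    """True when the logged execution time is longer than the logged timeout."""
    timing = parse_cmd_timing(line)
    return timing is not None and timing[1] > timing[0]
```

For the entry above, the parsed values are a timeout of 4 seconds and an execution time of about 10.7 seconds, so exceeded_timeout returns True.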

Resolution

A link down alarm is triggered when 'sudo cat /sys/class/net/eth0/operstate' returns 'down' or when an exception is encountered while the command runs.
If the command fails to execute, it indicates that the NSX Edge was unstable at that time.
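
The decision described above can be sketched in Python as follows. This is an illustration only, not the nsx-sha implementation: the function name should_raise_link_down is hypothetical, and the default 4-second timeout and the return-code check are assumptions taken from the 'timeout': 4 and 'check_return': True fields in the log entry.

```python
import subprocess


def should_raise_link_down(cmd, timeout=4):
    """Return True if the link would be reported as down.

    A 'down' result and any exception while running the command
    both count as link down, as described in this article.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
    except (subprocess.SubprocessError, OSError):
        # Covers TimeoutExpired, a non-zero exit code, and a command that
        # cannot be started: the state is unknown, so it is treated as down.
        return True
    return result.stdout.strip() == "down"
```

On a Linux host, should_raise_link_down(["cat", "/sys/class/net/eth0/operstate"]) exercises the same path. The sketch shows why a slow but otherwise healthy system can open the alarm: a timeout is reported the same way as a real 'down' state, even if the command eventually printed 'up'.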

Check for infrastructure issues, such as temporary CPU contention or degraded storage performance.
Contact Broadcom support if you need help identifying the cause of the instability.