NSX Manager reports an "EAM Status Down" alarm that resolves itself after a few minutes

Article ID: 381091

Updated On:

Products

VMware NSX

Issue/Introduction

  • The "EAM Status Down" alarm is observed in the NSX UI and self-resolves after a few minutes.
  • Log entries similar to the following are observed on the NSX Manager node reporting the "EAM Status Down" alarm:

    /var/log/proton
     2024-09-24T18:26:32.086Z  WARN EamPollingThread VcUtils 70368 SYSTEM [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="manager"] ConnectException occurred
    java.net.ConnectException: Connection timed out (Connection timed out)
     at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_372]
     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_372]
     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_372]
     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_372]
     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_372]
    /var/log/syslog
    INFO EamPollingThread EamServiceStatusServiceImpl 78912 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Current EAM status UP, updated EAM status DOWN for cmID <CM UUID>
    INFO EamPollingThread EamServiceStatusServiceImpl 78912 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Current EAM status DOWN, updated EAM status UP for cmID <CM UUID>
    INFO EamPollingThread EamServiceStatusServiceImpl 78912 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Current EAM status UP, updated EAM status DOWN for cmID <CM UUID>
    INFO EamPollingThread EamServiceStatusServiceImpl 78912 FABRIC [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Current EAM status DOWN, updated EAM status UP for cmID <CM UUID>
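
To gauge how often the status is flapping, the UP/DOWN transitions shown above can be counted. The following is a minimal sketch, not an official tool: the function name count_eam_flaps is hypothetical, and it assumes the transition messages appear in the log file with exactly the wording shown above.

```shell
#!/bin/sh
# count_eam_flaps [LOG_FILE]
# Hypothetical helper: counts EAM status transitions in a log file.
# Defaults to /var/log/syslog, the file named in this article.
count_eam_flaps() {
    log_file="${1:-/var/log/syslog}"

    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi

    # grep -c prints the number of matching lines (0 when there are none).
    down=$(grep -c 'updated EAM status DOWN' "$log_file")
    up=$(grep -c 'updated EAM status UP' "$log_file")

    echo "DOWN transitions: $down, UP transitions: $up"
}
```

Running count_eam_flaps with no argument on the NSX Manager reads /var/log/syslog. Roughly equal, repeated DOWN and UP counts are consistent with the self-resolving alarm described here.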

 

Environment

VMware NSX-T Data Center
VMware NSX

Cause

On NSX Manager appliances running kernel version 5.15.92 with the ip_blackhole feature enabled, certain traffic flows may be rejected. This can cause the NSX EAM check to fail intermittently, which raises the "EAM Status Down" alarm until a later check succeeds.

Kernel version 5.15.92 has the ip_blackhole feature enabled by default.
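
The two conditions described above (kernel version and ip_blackhole state) can be checked together. This is an illustrative sketch only: the function name is_affected is hypothetical, and it assumes the kernel release string reported by uname -r begins with 5.15.92, possibly followed by a suffix.

```shell
#!/bin/sh
# is_affected KERNEL_RELEASE FLAG_VALUE
# Hypothetical helper: returns 0 (true) when the kernel release is 5.15.92
# (optionally followed by a non-digit suffix) and ip_blackhole reads 1.
is_affected() {
    kernel_release="$1"
    flag_value="$2"

    case "$kernel_release" in
        5.15.92|5.15.92[!0-9]*) ;;   # matching kernel version
        *) return 1 ;;               # any other kernel
    esac

    [ "$flag_value" = "1" ]
}

# Example on an NSX Manager node (as root):
#   is_affected "$(uname -r)" "$(cat /proc/sys/kernel/grsecurity/ip_blackhole)" \
#       && echo "conditions described in this article are present"
```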

 

Resolution

This issue is resolved in VMware NSX-T Data Center 3.2.4, available at Broadcom downloads.

This issue is resolved in VMware NSX 4.1.2.2, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

 

Workaround

Open an SSH session to the NSX Manager as root:

  • To confirm the kernel version:
    uname -a

  • To check whether ip_blackhole is enabled:
    cat /proc/sys/kernel/grsecurity/ip_blackhole

    NB: If the above command returns 1, the ip_blackhole feature is enabled.
  • To disable ip_blackhole live (no reboot required):
    echo 0 > /proc/sys/kernel/grsecurity/ip_blackhole

  • To ensure that the change persists after reboot (add the same command as above to the startup script):
    echo 'echo 0 > /proc/sys/kernel/grsecurity/ip_blackhole' >> /opt/vmware/nsx-node-api/bin/set_params.sh

  • To confirm that the ip_blackhole feature is disabled (the command should return 0):
    cat /proc/sys/kernel/grsecurity/ip_blackhole

  • Repeat the above steps on all NSX Manager nodes.
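
The disable, persist, and confirm steps above can be combined into one function. This is a minimal sketch under stated assumptions, not a supported script: the function name disable_ip_blackhole is hypothetical, and the check that skips the append when the line is already present is an addition not found in the steps above, intended to avoid duplicate entries if the function is run more than once.

```shell
#!/bin/sh
# disable_ip_blackhole [FLAG_FILE] [STARTUP_SCRIPT]
# Hypothetical helper. Defaults are the paths given in this article.
disable_ip_blackhole() {
    flag_file="${1:-/proc/sys/kernel/grsecurity/ip_blackhole}"
    startup_script="${2:-/opt/vmware/nsx-node-api/bin/set_params.sh}"
    persist_line="echo 0 > $flag_file"

    # Disable ip_blackhole live (no reboot required).
    echo 0 > "$flag_file" || return 1

    # Persist across reboots; append only if the exact line is not there yet
    # (this guard is an assumption added for idempotency).
    if ! grep -qxF -- "$persist_line" "$startup_script" 2>/dev/null; then
        echo "$persist_line" >> "$startup_script" || return 1
    fi

    # Confirm that the feature now reads as disabled.
    [ "$(cat "$flag_file")" = "0" ]
}
```

Calling disable_ip_blackhole with no arguments as root on each NSX Manager node applies the change to the default paths; a non-zero exit status indicates that a step failed.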

Additional Information