"Failure Domain Down" alarm is reported even though edge node is healthy

Article ID: 316121

Updated On:

Products

VMware NSX

Issue/Introduction

  • "Failure Domain Down" alarm is reported.

  • Manually resolving the alarm does not keep it cleared; the alarm is reported again.
  • No traffic impact is seen.
  • Errors similar to the following example can be seen in /var/log/syslog on the affected NSX Edge node:
    2023-08-16T07:51:41.958Z mynsxedge01 NSX 38252 - [nsx@6876 comp="nsx-edge" subcomp="python" username="root" level="ERROR" errorCode="('CLI110',)"] Failed to get Manager connection status: sudo: Account or password is expired, reset your password and try again#012sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper#012sudo: unable to change expired password: Authentication token manipulation error
    2023-08-16T07:53:47.876Z mynsxedge01 NSX 3315 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="CRITICAL" eventFeatureName="edge_health" eventType="failure_domain_down" eventSev="critical" eventState="On" entId="00000000-####-####-####-444444444444"] All members of failure domain 00000000-####-####-####-444444444444 are down.
  • The following message is also logged in /var/log/nvpapi/api_server.log:
    2023-08-16T07:51:37.824Z nsx_monitoring.clientlibrary.event_source CRITICAL All members of failure domain 00000000-####-####-####-444444444444 are down.

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
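
To check quickly whether a node shows this signature, the log lines can be scanned for the two messages shown above. The following is a minimal sketch, not part of any NSX tooling: the function name scan_syslog and the matched substrings are assumptions taken from the example excerpts, and the exact wording may differ in your environment.

```python
import re

# Substrings taken from the example log excerpts in this article (assumed
# signatures; the wording may vary between NSX versions).
EXPIRED_SUDO = "Account or password is expired"
FAILURE_DOMAIN_DOWN = 'eventType="failure_domain_down"'

# Extracts the failure domain ID from the structured-data part of the event.
ENT_ID = re.compile(r'entId="([^"]+)"')


def scan_syslog(lines):
    """Return (expired_password_lines, failure_domain_ids) found in the lines.

    `lines` is any iterable of log lines, for example an open file object
    for /var/log/syslog copied from the affected NSX Edge node.
    """
    expired = []
    domains = set()
    for line in lines:
        if EXPIRED_SUDO in line:
            expired.append(line.rstrip("\n"))
        if FAILURE_DOMAIN_DOWN in line:
            match = ENT_ID.search(line)
            if match:
                domains.add(match.group(1))
    return expired, domains
```

For example, calling scan_syslog(open("/var/log/syslog", errors="replace")) on the Edge node (root access assumed) returns the matching sudo errors and the set of failure domain IDs. If both are non-empty, the expired root password described in this article is a likely cause.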

Environment

VMware NSX-T Data Center 3.x

VMware NSX 4.0 (this issue is resolved in NSX 4.1.2.x)
 

Cause

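The root password on the NSX Edge node has expired. As a result, the NSX Manager cannot fetch the correct information from the Edge node: the sudo call used to get the Manager connection status fails, as shown in the syslog excerpt above, and the alarm is reported.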

Resolution

This behavior is resolved in NSX version 4.1.2.

Workaround:

Renew the root password on the affected NSX Edge node.
This can be done by one of the following methods:

  • Method A:
    1. Open the VM console of the NSX Edge node, or connect to it over SSH.
    2. Log in as the admin user.
    3. Update the root password:
      mynsxedge01> set user root password
      Current password: <old_password>
      New password: <new_password>
      Retype new password: <new_password>
  • Method B:
    1. Open the VM console of the NSX Edge node, or connect to it over SSH.
    2. Attempt to log in as the root user.
    3. The login prompt usually asks you to change the password, as follows:
      root@mynsxedge01's password: <old_password>
      You are required to change your password immediately (password expired)
      Changing password for root.
      Current password: <old_password>
      New password: <new_password>
      Retype new password: <new_password>

Additionally, if desired, you may change the password expiration:

  1. Connect to the NSX Edge node as admin via console or SSH.
  2. Clear the password expiration for user root:
    clear user root password-expiration
  3. Query the password expiration for user root:
    get user root password-expiration

Additional Information