"Failure Domain Down" and "Password Expiry" alarms are reported for nodes.

Article ID: 316121

Updated On:

Products

VMware NSX
VMware NSX-T Data Center

Issue/Introduction

  • A "Failure Domain Down" alarm is reported.
  • Manually resolving the alarm does not keep it cleared; the alarm is raised again.
  • No traffic impact is seen.

  • Errors similar to the following example can be seen in /var/log/syslog on the affected NSX Edge node:
    2023-08-16T07:51:41.958Z mynsxedge01 NSX 38252 - [nsx@6876 comp="nsx-edge" subcomp="python" username="root" level="ERROR" errorCode="('CLI110',)"] Failed to get Manager connection status: sudo: Account or password is expired, reset your password and try again#012sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper#012sudo: unable to change expired password: Authentication token manipulation error
    2023-08-16T07:53:47.876Z mynsxedge01 NSX 3315 - [nsx@6876 comp="nsx-edge" subcomp="node-mgmt" username="root" level="CRITICAL" eventFeatureName="edge_health" eventType="failure_domain_down" eventSev="critical" eventState="On" entId="00000000-####-####-####-444444444444"] All members of failure domain 00000000-####-####-####-444444444444 are down.
  • And in /var/log/nvpapi/api_server.log:
    2023-08-16T07:51:37.824Z nsx_monitoring.clientlibrary.event_source CRITICAL All members of failure domain 00000000-####-####-####-444444444444 are down.

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
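
To check a copy of the Edge node's syslog for both symptoms at once, a small script along the following lines may help. This is only a sketch: the function names scan_syslog and root_cause_likely are illustrative and not part of NSX, and the match patterns are taken from the example log lines above, so they may need adjusting if the messages differ in your environment.

```python
import re

# Patterns copied from the example log lines in this article.
# Assumption: real messages contain the same substrings.
PASSWORD_EXPIRED = re.compile(r"Account or password is expired")
FAILURE_DOMAIN_DOWN = re.compile(r'eventType="failure_domain_down"')
FIRST_FIELD = re.compile(r"^\s*(\S+)")  # leading timestamp field of a syslog line


def scan_syslog(lines):
    """Return (timestamp, kind) tuples for lines that match either symptom."""
    hits = []
    for line in lines:
        if PASSWORD_EXPIRED.search(line):
            kind = "password_expired"
        elif FAILURE_DOMAIN_DOWN.search(line):
            kind = "failure_domain_down"
        else:
            continue
        match = FIRST_FIELD.match(line)
        hits.append((match.group(1) if match else "", kind))
    return hits


def root_cause_likely(hits):
    """True when both the sudo password error and the alarm event were found."""
    kinds = {kind for _, kind in hits}
    return {"password_expired", "failure_domain_down"} <= kinds
```

For example, open a copy of /var/log/syslog from the affected Edge node, pass the file object to scan_syslog, and then pass the result to root_cause_likely. A True result suggests that the symptoms match this article, but it does not replace checking the alarms page in the NSX UI.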

Environment

VMware NSX-T Data Center 3.x
VMware NSX 4.x



Cause

The NSX Manager is unable to fetch the connection information from the NSX Edge node because the root password on the Edge node has expired.
The password expiration can also be verified on the alarms page of the NSX UI.

Resolution

- Starting with NSX 4.1.2, both the "Failure Domain Down" and "Password Expiry" alarms are expected in the alarms section when the password has expired.

- If the password is changed before it expires, the "Failure Domain Down" alarm is not raised.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

Workaround:

Renew the root password on the affected NSX Edge node.
This can be done using one of the following methods:

  • Method A:
    1. Open the VM console or an SSH session to the NSX Edge node.
    2. Log in as the admin user.
    3. Update the root password:
      mynsxedge01> set user root password
      Current password: <old_password>
      New password: <new_password>
      Retype new password: <new_password>
  • Method B:
    1. Open the VM console or an SSH session to the NSX Edge node.
    2. Attempt to log in as the root user.
    3. The login prompt usually asks you to change the password, as follows:
      root@mynsxedge01's password: <old_password>
      You are required to change your password immediately (password expired)
      Changing password for root.
      Current password: <old_password>
      New password: <new_password>
      Retype new password: <new_password>

Additionally, if desired, you may change the password expiration:

  1. Connect to the NSX Edge node as admin via console or SSH.
  2. Clear the password expiration for user root:
    clear user root password-expiration
  3. Query the password expiration for user root:
    get user root password-expiration

Additional Information