MPA Connectivity Down status showing for Edge nodes after DNS failures.


Article ID: 432101

Products

VMware NSX

Issue/Introduction

In VMware NSX, Edge nodes may show an "MPA Connectivity Down" status in the NSX Manager UI. The Edge CLI output and logs show the Edge attempting to connect to a manager using a DNS error message in place of a hostname.

  • The following Edge CLI commands report a DNS error string (containing the DNS server IP and port 53) instead of the expected manager and controller values.
    • edge01> get controllers
       Controller IP  Port  SSL      Status        Is Physical Master  Session State  Controller FQDN                                           Failure Reason
       ::             1235  enabled  disconnected  true                down           ;; communications error to <DNS Server IP>#53: timed out  OTHER_ERROR

    • edge01> get managers
      ;; communications error to <DNS Server IP>#53: timed out Unable to resolve fqdn *

  • The Edge syslog (/var/log/syslog) shows nsx-proxy warnings such as the following:

Error to ssl://;; communications error to <DNS Server IP>#53: timed out:1235 ... Error 1-Host not found.

  • Rebooting the Edges or Managers, or resyncing the Edges from the UI, does not clear the MPA Connectivity Down state.

 Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
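To confirm that a given Edge is affected, the syslog can be searched for the DNS error text. The following is a minimal sketch, assuming a POSIX shell with grep on the Edge; the function name count_dns_error_hosts is hypothetical, and the default log path is the one given above.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that contain the DNS timeout text
# which nsx-proxy is using in place of a manager hostname.
count_dns_error_hosts() {
    log_file="${1:-/var/log/syslog}"   # path taken from this article
    # -c prints the number of matching lines; -F matches a fixed string.
    grep -cF ';; communications error to' "$log_file"
}
```

Calling count_dns_error_hosts with no argument reads /var/log/syslog; a non-zero count suggests the Edge has picked up the error string as a hostname.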

Environment

VMware NSX

Cause

This issue occurs when the NSX Manager has "Publish FQDN" enabled but cannot reach its DNS servers. The Edge incorrectly ingests the resulting timeout error message as a valid Management Plane FQDN and saves it into its local configuration. Because the Edge then loses connectivity to the managers, it cannot recover the correct value on its own.
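To illustrate why the ingested value is unusable, the sketch below checks whether a string has the basic shape of a hostname. It is an illustration only, not the validation NSX performs; the function name looks_like_fqdn and the sample hostname are hypothetical, and length limits are not checked.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument is made of dot-separated
# labels containing letters, digits, and interior hyphens.
looks_like_fqdn() {
    printf '%s\n' "$1" | grep -Eq \
        '^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)*$'
}

# Example use (hypothetical hostname):
#   looks_like_fqdn "nsxmgr01.example.com" && echo "plausible hostname"
```

A value such as ";; communications error to <DNS Server IP>#53: timed out" fails this check because it contains spaces, semicolons, and other characters that cannot appear in a hostname.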

Resolution

This is a known issue impacting VMware NSX.


To manually restore connectivity, you must remove the corrupted entries from the Edge appliance.

  1. Log into the NSX Edge CLI as admin and then switch to root (or log in directly as root if enabled).

  2. Navigate to the configuration directory: cd /config/vmware/edge/

  3. Back up the current configuration: cp appliance-info.xml appliance-info.xml.bak

  4. Either open appliance-info.xml with a text editor (e.g., vi), remove the values between <fqdn></fqdn> and <fqdnv6></fqdnv6>, then save and close the file,

    or run the following sed command, which deletes every line that begins with <fqdn:

    sed -i '/^<fqdn/d' /config/vmware/edge/appliance-info.xml

     
  5. Restart the NSX Proxy service to apply the changes and repopulate the values correctly.

    /etc/init.d/nsx-proxy restart
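Steps 3 and 4 can be sketched as a pair of shell functions. This is a minimal sketch, assuming GNU sed (for -i) and that the corrupted value contains the text "communications error"; the function names are hypothetical, and the file path and sed expression are the ones given in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file still contains the DNS error text.
has_corrupt_fqdn() {
    file="${1:-/config/vmware/edge/appliance-info.xml}"
    grep -qF 'communications error' "$file"
}

# Hypothetical helper: back up the file (step 3), then delete the fqdn lines (step 4).
clean_fqdn_entries() {
    file="${1:-/config/vmware/edge/appliance-info.xml}"
    cp -p "$file" "$file.bak" || return 1   # stop if the backup cannot be written
    # Removes lines that start with <fqdn, which also covers <fqdnv6.
    # Lines with leading whitespace are not matched by this pattern.
    sed -i '/^<fqdn/d' "$file"
}
```

Run as root, has_corrupt_fqdn && clean_fqdn_entries edits the file only when the error text is present. The nsx-proxy restart in step 5 is still required afterwards.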

Additional Information

Ensure that your NSX Manager has a stable and reachable DNS configuration before re-enabling "Publish FQDN" to prevent a recurrence of this behavior.
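A quick resolution check before re-enabling "Publish FQDN" might look like the sketch below. It assumes getent is available on the system where it is run and uses the system resolver; the function name dns_resolves and the sample hostname are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the name resolves through the
# system resolver (hosts file and configured DNS servers).
dns_resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Example use (hypothetical hostname):
#   dns_resolves "nsxmgr01.example.com" || echo "DNS lookup failed; fix DNS first"
```

Because getent also consults the local hosts file, a successful result does not prove on its own that the DNS servers are reachable; a direct query against each configured DNS server is a stronger test.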