Edge showing MPA Connectivity Down after storage/vSAN issues

Article ID: 388538

Products

VMware NSX

Issue/Introduction

- If a storage-related issue occurs and the Edge status in NSX shows "MPA Connectivity Down", the MP (Management Plane) channel between the Edge and the NSX Manager is not working correctly in one or both directions. This is the RPC (Remote Procedure Call) channel on port 1234 that the NSX Manager uses to communicate with the Edge.

- Next, check MP channel connectivity between the Edge and the NSX Managers in both directions.

    From the Edge: nc -zv <NSX-Manager-IP> 1234  (checks the MP channel; repeat for every manager)

    In this case, the connectivity check was successful.

    Then, from the Edge CLI (admin): get managers

    No managers were listed, even though connectivity was good.
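The per-manager port check above can be repeated in a small loop from the Edge root shell. This is only a sketch: the function name check_mp_channel and the IP addresses in the usage line are placeholders, and the 5-second timeout (-w 5) is an added assumption rather than part of the original command.

```shell
#!/bin/bash
# Sketch: check the MP channel (TCP 1234) from the Edge to each NSX Manager.
# check_mp_channel is a hypothetical helper, not an NSX-provided command.

MP_PORT=1234

check_mp_channel() {
    # Arguments: one or more NSX Manager IPs or hostnames.
    local failed=0 mgr
    for mgr in "$@"; do
        # -z: connect without sending data; -w 5: give up after 5 seconds
        if nc -z -w 5 "$mgr" "$MP_PORT" >/dev/null 2>&1; then
            echo "OK   $mgr:$MP_PORT"
        else
            echo "FAIL $mgr:$MP_PORT"
            failed=1
        fi
    done
    # Non-zero exit status if any manager was unreachable.
    return "$failed"
}
```

Example call, with placeholder addresses: check_mp_channel 192.0.2.11 192.0.2.12 192.0.2.13. A result of OK for every manager only confirms that the port is reachable; as this article shows, get managers can still return nothing.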

The Edge logs show the following warning:

/var/log/syslog:

[WARNING] No aph found in appliance-info.xml

The following warning may also appear:

[WARNING] Could not read a valid uuid from /etc/vmware/nsx/host-cfg.xml
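To check quickly whether either warning is present, the log can be searched for the two message strings quoted above. This is a minimal sketch: scan_edge_log is a hypothetical helper name, and it assumes the messages appear in the log exactly as quoted in this article.

```shell
#!/bin/bash
# Sketch: look for the two warnings from this article in an Edge log file.
# scan_edge_log is a hypothetical helper, not an NSX-provided command.

scan_edge_log() {
    # Argument: log file to scan (defaults to /var/log/syslog).
    local log_file="${1:-/var/log/syslog}"
    local status=1
    # -F: match the text literally, -q: no output, exit status only
    if grep -F -q "No aph found in appliance-info.xml" "$log_file"; then
        echo "appliance-info.xml: no aph entries found"
        status=0
    fi
    if grep -F -q "Could not read a valid uuid from /etc/vmware/nsx/host-cfg.xml" "$log_file"; then
        echo "host-cfg.xml: Edge UUID missing or unreadable"
        status=0
    fi
    # Exit status 0 if at least one warning was found, 1 otherwise.
    return "$status"
}
```

Running scan_edge_log with no argument scans /var/log/syslog. Older, rotated log files can be passed as the argument if the warning was logged some time ago.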

Environment

VMware NSX

Cause

Storage/datastore issues can occasionally corrupt files on an Edge transport node. Even after the file system has been repaired (using this KB: https://knowledge.broadcom.com/external/article?articleNumber=320303), the file /etc/vmware/nsx/appliance-info.xml may remain empty on some Edges, and on some nodes the file /etc/vmware/nsx/host-cfg.xml may be missing the Edge's own UUID.

Resolution

Workaround:

1. Check the file on the affected Edge: cat /etc/vmware/nsx/appliance-info.xml. On the affected Edge the file is empty, whereas on a healthy Edge it contains the appliance proxy information for all three managers, that is, each manager IP with port 1234 and its certificate details. Copy the contents of appliance-info.xml from a healthy Edge into the file on the affected Edge.

2. Restart nsx-proxy so that the change takes effect: /etc/init.d/nsx-proxy restart

3. After the restart, the Edge should communicate with the NSX Managers in both directions, the Edge status in NSX Manager should show 'Success', and get managers on the admin CLI should list all managers with the status 'Connected'.

4. If get managers lists all managers but shows them in the Standby state, check the logs (/var/log/syslog) for the warning that the Edge's UUID is missing from host-cfg.xml.

Copy the Edge's UUID from the NSX Manager UI into the file /etc/vmware/nsx/host-cfg.xml and restart nsx-proxy. This should fix the MPA connectivity issue.
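Before editing either file, it can help to confirm that the file really is empty and to keep a copy of its current state. The helpers below are a sketch under assumptions: the function names and the .bak.<timestamp> suffix are illustrative choices, not part of NSX.

```shell
#!/bin/bash
# Sketch: pre-checks before editing the files named in the workaround.
# Both helpers are hypothetical, not NSX-provided commands.

is_empty_or_missing() {
    # Succeeds (exit 0) when the file does not exist or has zero size.
    [ ! -s "$1" ]
}

backup_file() {
    # Copy the file next to itself with a timestamp suffix,
    # preserving mode and timestamps, and print the backup path.
    local src="$1" dst
    dst="${src}.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$src" "$dst" && echo "$dst"
}
```

For example, is_empty_or_missing /etc/vmware/nsx/appliance-info.xml && echo "needs restore" confirms the condition described in step 1, and backup_file /etc/vmware/nsx/host-cfg.xml saves the current file before the UUID is added.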