NSXA down error on ESXi hosts causing vMotion to fail

Article ID: 406081

Products

VMware NSX

Issue/Introduction

  • An attempt to migrate (vMotion) a VM from one ESXi host to another fails with an error stating that NSXA is down.
  • Additionally, ESXi hosts on which NSXA is down may appear with "Unknown" status in the NSX UI under System -> Fabric -> Host.
  • The following log lines in /var/log/vmware/appl-proxy-rpc.log on the NSX Manager indicate the presence of this issue:

    <Timestamp> nsxmgr01 NSX 82410 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-net" tid="82448" level="WARNING"] StreamConnection[106###5135 Connecting to unix:///var/run/vmware/appl-proxy/aph.sock(pid:386085 uid:113 gid:117) sid:1069085135] Couldn't connect to 'unix:///var/run/vmware/appl-proxy/aph.sock(pid:386085 uid:113 gid:117)' (error: 2-No such file or directory)
    <Timestamp> nsxmgr01 NSX 82410 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-net" tid="82448" level="WARNING"] StreamConnection[106###5135 Error to unix:///var/run/vmware/appl-proxy/aph.sock(pid:386085 uid:113 gid:117) sid:-1] Error 2-No such file or directory
    <Timestamp> nsxmgr01 NSX 82410 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-rpc" tid="82448" level="WARNING"] RpcConnection[106###5135 Connecting to unix:///var/run/vmware/appl-proxy/aph.sock(pid:386085 uid:113 gid:117) 0] Couldn't connect to unix:///var/run/vmware/appl-proxy/aph.sock(pid:386085 uid:113 gid:117) (error: 2-No such file or directory)
    <Timestamp> nsxmgr01 NSX 82410 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-rpc" tid="82448" level="WARNING"] RpcTransport[1] Unable to connect to unix:///var/run/vmware/appl-proxy/aph.sock(pid:386085 uid:113 gid:117): 2-No such file or directory

    Note that the presence of these log lines alone does not guarantee that the NSXA down error will appear on the ESXi hosts.

  • The presence of the following lines in the output of the net-dvs -l command on the affected ESXi host additionally confirms the presence of this issue:
    com.vmware.common.opaqueDvs.status.component_list = nsxa,vswitch,lcp.ccpSession,lcp.liveness,lcp.kcpSyncStatus , propType = CONFIG
    com.vmware.common.opaqueDvs.status.component.nsxa = down ,      propType = CONFIG
    com.vmware.common.opaqueDvs.status.component_list = nsxa,vdl2,vswitch,lcp.ccpSession,lcp.liveness,lcp.vdl2SyncStatus,lcp.kcpSyncStatus , propType = CONFIG
    com.vmware.common.opaqueDvs.status.component.nsxa = down ,      propType = CONFIG
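
The net-dvs check above can be reduced to a single filter. The following is a minimal sketch, not a VMware-provided tool: the function name check_nsxa_status and its output messages are hypothetical, and it assumes the property line has the exact form shown in the output above.

```shell
# Hypothetical helper: reads `net-dvs -l` output on stdin and reports
# whether any nsxa component status line is "down".
check_nsxa_status() {
    # Count lines such as:
    #   com.vmware.common.opaqueDvs.status.component.nsxa = down , propType = CONFIG
    # grep -c exits non-zero when nothing matches, so the exit status is ignored.
    down_count=$(grep -c 'status\.component\.nsxa = down' || true)

    if [ "$down_count" -gt 0 ]; then
        echo "nsxa down entries: $down_count"
        return 1
    fi
    echo "nsxa not reported down"
    return 0
}
```

On an affected ESXi host this would be run as `net-dvs -l | check_nsxa_status`. A non-zero return status means at least one matching entry was found.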

     

Cause

A broken connection between the APH and the proton service on one or more NSX Manager appliances can cause the nsxa service on an ESXi host to go down, which in turn causes vMotion to fail.

 

Resolution

This issue is resolved in VMware NSX 4.2.0, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

Workaround:

Identify the NSX Manager on which the APH-proton connection is broken, then restart the proton service on that appliance:

  • Log in to each NSX Manager in turn over SSH and search for the following error using grep:
    • grep "Couldn't connect to unix:///var/run/vmware/appl-proxy/aph.sock" /var/log/vmware/appl-proxy-rpc.log
  • On the NSX Manager where the error is found, restart the proton service:
    • systemctl restart proton
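
The two workaround steps can be combined into a small check to run on each NSX Manager. This is only a sketch under stated assumptions: the helper names aph_error_count and needs_proton_restart are hypothetical, the search string and log path are the ones given above, and a match may come from an old log entry, so review the timestamps before restarting anything.

```shell
# Search string taken from the grep command in the workaround.
APH_PATTERN="Couldn't connect to unix:///var/run/vmware/appl-proxy/aph.sock"

# Hypothetical helper: print how many lines in the log contain the APH
# connection error. Defaults to the log path named in this article.
aph_error_count() {
    log_file=${1:-/var/log/vmware/appl-proxy-rpc.log}
    if [ ! -r "$log_file" ]; then
        echo 0
        return 0
    fi
    # -F matches the pattern as a fixed string; -c prints the match count.
    # grep exits non-zero on zero matches, so the exit status is ignored.
    grep -c -F -- "$APH_PATTERN" "$log_file" || true
}

# Hypothetical helper: succeeds when at least one matching line exists.
needs_proton_restart() {
    [ "$(aph_error_count "$1")" -gt 0 ]
}
```

A possible use on a manager appliance is `if needs_proton_restart; then systemctl restart proton; fi`. Keeping the check separate from the restart makes it possible to inspect the count on every manager before changing anything.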

Alternatively, in most cases it may be easier to reboot the affected NSX Manager appliance.