vMotion fails with Error "An error occurred during host configuration: Failed to attach all ports."

Article ID: 398885

Products

VMware NSX

Issue/Introduction

  • The ESXi host fails to enter maintenance mode because vMotion fails
  • The vMotion task fails with the error "An error occurred during host configuration: Failed to attach all ports."
  • The issue is seen with DRS-triggered vMotion tasks in the cluster
  • The issue is seen only with VMs connected to NSX port groups
  • The vCenter vpxd.log file reports the following error:

    2025-05-23T01:01:28.631+10:00 error vpxd[3914757] [Originator@6876 sub=VmProv opID=CdrsLoadBalancer-######-#####-01-01] Local-VC Host Migrate failed at vpx.vmprov.InitiateDestination for poweredOn VM '########' (vm-32273, ds:///vmfs/volumes/vsan:###########-###########/b37d2664-####-3be8-####-#########/########.vmx) on host-3####4 (#.#.#.#) in pool resgroup-#### with ds ds:///vmfs/volumes/vsan:###########-###########/ to host-## (#.#.#.#) in pool resgroup-##### with ds ds:///vmfs/volumes/vsan:###########-###########/ with migId ############### with fault vim.fault.PlatformConfigFault:  as Operation: Local-VC_DRS_NonMM_ComputevMotion

    2025-05-23T01:01:28.718+10:00 error vpxd[3914757] [Originator@6876 sub=Default opID=CdrsLoadBalancer-#######-######-01-01] [VpxLRO] -- ERROR lro-1194397910 --  --  -- VmprovWorkflow: :vmodl.fault.SystemError
    --> Result:
    --> (vmodl.fault.SystemError) {
    -->    faultCause = (vim.fault.PlatformConfigFault) {
    -->       faultCause = (vmodl.MethodFault) null,
    -->       faultMessage = <unset>,
    -->       text = "Failed to attach all ports"
    -->       msg = "An error occurred during host configuration: Failed to attach all ports."
    -->    },
    -->    faultMessage = <unset>,
    -->    reason = "Undeclared fault"
    -->    msg = "Received SOAP response fault from [<<io_obj p:##########, h:247, <UNIX ''>, <UNIX '/var/run/envoy-hgw/hgw-pipe'>>, /hgw/host-24/vpxa>]: initiateDestination
    --> Received SOAP response fault from [<<io_obj p:#######, h:47, <TCP '127.0.0.1 : #####'>, <TCP '127.0.0.1 : ####'>>, /sdk>]: initiateDestination
    --> Method vim.host.VMotionManager.initiateDestination threw undeclared fault of type vim.fault.PlatformConfigFault"

  • The destination ESXi transport node status is Unknown in NSX Manager
  • Manual vMotion of VMs works to every transport node except the one in Unknown status
  • On the destination ESXi host, /var/log/nsx-syslog.log reports that the opsagent-proxy-connection status is DOWN:

    2025-05-22T14:58:54Z In(182)[+] nsx-sha:       agent_status {
    2025-05-22T14:58:54Z In(182)[+] nsx-sha:         type: NSX_OPSAGENT
    2025-05-22T14:58:54Z In(182)[+] nsx-sha:         status: UP
    2025-05-22T14:58:54Z In(182)[+] nsx-sha:         connection_status {
    2025-05-22T14:58:54Z In(182)[+] nsx-sha:           name: "opsagent-proxy-connection"
    2025-05-22T14:58:54Z In(182)[+] nsx-sha:           status: DOWN

  • /var/log/nsx-syslog.log on the destination ESXi host reports errors when nsxaApp starts:

    2025-05-20T11:14:00.280Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="nsxa" tid="2105886" level="INFO"] [Run] Starting nsxaApp
    2025-05-20T11:14:00.281Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="ctxteng" tid="2105732" level="INFO"] CtxtCore: Context library initialized successfully
    2025-05-20T11:14:00.281Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="ctxteng" tid="2105732" level="INFO"] CtxtHealth: si Initializing for AppId: 7777, AppType: Context-Health with CtxtChannel
    2025-05-20T11:14:00.281Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="mpa-client" tid="2105732" level="INFO"] Context-Health Register mpa client
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="nsx-rpc" tid="2105732" level="INFO"] RpcServiceRegistry[2] Registering local service vmware.nsx.context.NxgiHostService
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="nsx-rpc" tid="2105732" level="INFO"] NsxRpcServer: starting
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="nsx-rpc" tid="2105732" level="INFO"] NsxRpcServer: started accepting connections
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="mpa-client" tid="2105732" level="INFO"] [Context-Health] CreateAppSpecificInstance End (MPA_CLIENT_PURGE_TIMER: 300000)
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="ctxteng" tid="2105732" level="INFO"] Host Registration for appId 7777, appType Context-Health MpaClientStub 0xcaf4dbf420
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="mpa-client" tid="2105732" level="INFO"] [Context-Health] Starting MPA client thread...
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="tnproxy-client" tid="2105732" level="INFO"] [Context-Health] TnProxyClientCommon SetTnProxyConnectionUp
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="tnproxy-client" tid="2105732" level="INFO"] [Context-Health] TnProxyClientCommon SetTnProxyConnectionDown
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="tnproxy-client" tid="2105732" level="INFO"] [Context-Health] TnProxyClientCommon SetUpdateStubs
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="ctxteng" tid="2105732" level="INFO"] MpaChannel: Connected to MPA, appID: 7777, appType: Context-Health client stub 0xcaf4dbf420
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="ctxteng" tid="2105732" level="INFO"] NxgiConfig: Registered with UPM for GI_SERVICE and ENTITY_VM. VmConfig started successfully.
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="ctxteng" tid="2105732" level="INFO"] EPSecUtil: Successfully configured Mux
    2025-05-20T11:14:00.282Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="ctxteng" tid="2105732" level="INFO"] NxgiConfig: Configured Mux with context engine config
    2025-05-20T11:14:00.283Z Wa(180) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="mpa-client" tid="2105732" level="WARNING"] [Context-Health] getMPStubs No stub present for APH ()
    2025-05-20T11:14:00.283Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="mpa-client" tid="2105732" level="INFO"] [Context-Health] RpcMsgHandler for APH () has invalid stub
    2025-05-20T11:14:00.283Z In(182) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="mpa-client" tid="2105732" level="INFO"] [Context-Health] SendRequest: Failed to send msg Master APH,  type (com.vmware.nsx.context.HealthStatusMsg), correlationId (), trackingIdStr (59####c-####-#####-####-d689####c48), ret (-1)
    2025-05-20T11:14:00.283Z Er(179) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="ctxteng" tid="2105732" level="ERROR" errorCode="CTX251"] MpaChannel: Failed to send msg to rmqClient, appType: Context-Health, ret: -1
    2025-05-20T11:14:00.283Z Er(179) nsx-opsagent[2105732]: NSX 2105732 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="ctxteng" tid="2105732" level="ERROR" errorCode="CTX251"] CtxtChannel: Failed to send message to NSX Manager, error: -1

  • /var/log/nsxaVim.log on the ESXi host reports that a domain user (domain\user_name) does not exist:

    2025-05-26T01:21:35Z In(14) nsxaVim: [2106312]: INFO Entered update lockdown exception to [add] user [nsx-user]^@
    2025-05-26T01:21:35Z In(14) nsxaVim: [2106312]: INFO Adding user nsx-user in lockdown exception list^@
    2025-05-26T01:21:35Z Wa(12) nsxaVim: [2106312]: WARNING User domain\user_name does not exist retrying updating exception list^@

  • You also see that the ESXi host is not joined to the domain
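
The nsx-sha excerpt above can be checked mechanically. The following is a minimal sketch, not a supported tool: it assumes the log layout is exactly as shown in the excerpt (a name line followed by a status line for each connection), and the function names connection_states and down_connections are hypothetical helpers introduced only for illustration.

```python
import re

# Patterns are based only on the nsx-sha excerpt in this article (assumption:
# each connection "name:" line is followed by its own "status:" line).
NAME_RE = re.compile(r'nsx-sha:\s+name:\s+"([^"]+)"')
STATUS_RE = re.compile(r'nsx-sha:\s+status:\s+(\w+)')

SAMPLE = [
    '2025-05-22T14:58:54Z In(182)[+] nsx-sha:       agent_status {',
    '2025-05-22T14:58:54Z In(182)[+] nsx-sha:         type: NSX_OPSAGENT',
    '2025-05-22T14:58:54Z In(182)[+] nsx-sha:         status: UP',
    '2025-05-22T14:58:54Z In(182)[+] nsx-sha:         connection_status {',
    '2025-05-22T14:58:54Z In(182)[+] nsx-sha:           name: "opsagent-proxy-connection"',
    '2025-05-22T14:58:54Z In(182)[+] nsx-sha:           status: DOWN',
]


def connection_states(lines):
    """Return (timestamp, connection name, status) for each connection entry."""
    results = []
    pending = None  # (timestamp, name) waiting for its status line
    for line in lines:
        name = NAME_RE.search(line)
        if name:
            pending = (line.split()[0], name.group(1))
            continue
        status = STATUS_RE.search(line)
        # Agent-level status lines appear before any name line and are skipped.
        if status and pending:
            results.append((pending[0], pending[1], status.group(1)))
            pending = None
    return results


def down_connections(lines):
    """Return the connections whose most recent reported status is DOWN."""
    latest = {}
    for _timestamp, name, status in connection_states(lines):
        latest[name] = status  # later entries overwrite earlier ones
    return [name for name, status in latest.items() if status == "DOWN"]


if __name__ == "__main__":
    for name in down_connections(SAMPLE):
        print("connection down:", name)
```

To run it against a real host log, replace SAMPLE with the lines read from a copy of /var/log/nsx-syslog.log, for example the result of open(path, errors="replace").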

Environment

VMware NSX
VMware vSphere 8.0

Cause

The ESXi host was removed from the domain and lost access to the domain user that had been added to the lockdown exception list.

When a host is removed from the domain, ESXi does not automatically remove the domain user from the lockdown exception list. The stale domain user can cause the nsxaApp service to go down on the ESXi host, which in turn prevents the host from connecting to NSX Manager as a transport node.

Resolution

To resolve the issue, either remove the domain user reported in /var/log/nsxaVim.log from the ESXi lockdown exception list or join the ESXi host back to the domain.
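
To find which account to remove, the "does not exist" warnings in nsxaVim.log can be filtered out of the rest of the log. The sketch below is illustrative only: it assumes the warning text matches the excerpt shown earlier in this article, and missing_exception_users is a hypothetical helper name.

```python
import re

# Pattern taken from the nsxaVim.log excerpt in this article (assumption:
# the warning wording is the same on the affected host).
MISSING_USER_RE = re.compile(r"WARNING User (\S+) does not exist")

# Raw strings keep the backslash in domain\user_name literal.
SAMPLE = [
    r"2025-05-26T01:21:35Z In(14) nsxaVim: [2106312]: INFO Adding user nsx-user in lockdown exception list",
    r"2025-05-26T01:21:35Z Wa(12) nsxaVim: [2106312]: WARNING User domain\user_name does not exist retrying updating exception list",
    r"2025-05-26T01:21:40Z Wa(12) nsxaVim: [2106312]: WARNING User domain\user_name does not exist retrying updating exception list",
]


def missing_exception_users(lines):
    """Return each user reported as missing, once, in order of first appearance."""
    users = []
    for line in lines:
        match = MISSING_USER_RE.search(line)
        if match and match.group(1) not in users:
            users.append(match.group(1))
    return users


if __name__ == "__main__":
    for user in missing_exception_users(SAMPLE):
        print("stale lockdown exception user:", user)
```

Repeated retries log the same warning many times, so the helper returns each account only once. Every account it reports is a candidate for removal in the steps that follow.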

Steps to remove user from lockdown exception list:

    1. Log in to the vCenter Server UI
    2. Remove the user account reported in /var/log/nsxaVim.log (see above) from the lockdown exception list:
      • Select the ESXi host in vCenter
      • Navigate to Configure > Security Profile > Lockdown Mode
      • Select Edit
      • Select Exception Users
      • Click the three dots next to the user reported with the "does not exist" warning in nsxaVim.log
      • Select Remove User
    3. After removing the user, restart the nsx-opsagent service on the ESXi host with the command: /etc/init.d/nsx-opsagent restart
    4. Confirm that the ESXi transport node status is Up in NSX and that VMs can be migrated to this host with vMotion

Note: Once the issue is resolved, if you want to add the domain user back to the lockdown exception list, first join the ESXi host to the domain, then add the user through the same Exception Users dialog used to remove it.

Additional Information

Unable to prepare ESXi hosts for NSX due to existing stale Lockdown Mode exception user(s).

If the above steps do not resolve the issue, raise a support ticket with Broadcom support selecting NSX as the product. 

Handling Log Bundles for offline review with Broadcom support