Unable to prepare ESXi hosts for NSX due to existing stale Lockdown Mode exception user(s). Removing NSX may also fail.

Article ID: 372410

Updated On:

Products

VMware NSX

Issue/Introduction

  • When upgrading NSX, the ESXi host transport nodes show in a Failed state for NSX Configuration and produce this error:

    Host Configuration: Failed to send the HostConfig message.

    [TN=TransportNode/<Transport_Node_UUID>]. Reason: Failed to send HostConfig RPC to MPA TN:<Transport_Node_UUID>. Error: Unable to reach client <Transport_Node_UUID>, application SwitchingVertical. LogicalSwitch full-sync: LogicalSwitch full-sync realization query skipped.

  • ESXi hosts being prepared as NSX transport nodes may fail installation. This may happen at 67% completion or later during the configuration process. 

  • On the ESXi host that fails NSX installation or configuration, /var/run/log/nsxaVim.log shows entries similar to the following:

nsxaVim: [2102356]: INFO Entered update lockdown exception to [add] user [nsx-user]^@
nsxaVim: [2102356]: INFO Adding user nsx-user in lockdown exception list^@
nsxaVim: [2102356]: WARNING User <user name> does not exist retrying updating exception list^@  

  • Ping and netcat connectivity from the host transport node to the NSX Managers continues to work.
  • The ESXi command 'del nsx' may also fail when used to remove NSX from a host in this state.
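
The log check described above can be scripted when several hosts need to be reviewed. The following is a minimal sketch, not a supported tool: the function name find_stale_users and the regular expression are assumptions derived only from the sample nsxaVim.log lines in this article, so the pattern may need adjusting if the log format differs between NSX versions.

```python
import re

# Pattern inferred from the sample WARNING line shown in this article
# (assumption: the user name sits between "User " and " does not exist").
STALE_USER_RE = re.compile(r"WARNING User (?P<user>.+?) does not exist")


def find_stale_users(log_text):
    """Return the unique user names reported as non-existent, in first-seen order."""
    users = []
    for line in log_text.splitlines():
        match = STALE_USER_RE.search(line)
        if match and match.group("user") not in users:
            users.append(match.group("user"))
    return users
```

To use it, read the contents of /var/run/log/nsxaVim.log (on the host or from a copy of the log) into a string and pass that string to find_stale_users. Each returned name is a candidate for removal from the lockdown exception list.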

Environment

VMware NSX 4.x
VMware NSX-T Data Center 3.x

Cause

If the Host Client or vCenter lockdown mode exception list includes Active Directory users that are later removed from the AD domain, the ESXi host does not automatically remove those users from the exception list.

This "stale" lockdown mode exception user can cause the nsxaApp service to go down on the ESXi host, which in turn prevents the host from being configured as an NSX transport node.

Resolution

  • Remove the user account named in /var/run/log/nsxaVim.log (see above) from the lockdown exception list.
    • Select the ESXi host in vCenter
    • Navigate to Configure > System > Security Profile > Lockdown Mode
    • Select Edit
    • Select Exception Users
    • Click the three dots next to the user reported with the "does not exist" warning in /var/run/log/nsxaVim.log
    • Remove User
  • After these steps, the Transport Node status in NSX should show Success within a few minutes. If it does not, restart the nsx-opsagent service with the command /etc/init.d/nsx-opsagent restart, then complete the upgrade
  • Add the user back to the exception list if required
  • Retry the ESXi host configuration

Note: The user may also exist in the Host Client UI. Review the exception list there and remove the user if it is present.
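
Where exception lists are managed by script rather than through the UI, the removal step amounts to filtering the stale accounts out of the current list. The sketch below shows only that filtering logic; the function name remaining_exception_users is hypothetical, and the case-insensitive comparison is an assumption based on Active Directory account names not being case-sensitive. Reading and writing the list on a real host (for example through the vSphere API HostAccessManager methods QueryLockdownExceptions and UpdateLockdownExceptions) is not shown and has not been tested here.

```python
def remaining_exception_users(current_users, stale_users):
    """Return current_users without any entry that appears in stale_users.

    Names are compared case-insensitively (assumption, see the note above).
    The original order and spelling of the kept entries are preserved.
    """
    stale = {name.lower() for name in stale_users}
    return [name for name in current_users if name.lower() not in stale]
```

The result is the list that should remain configured as lockdown exception users. Any account that is still required can be added back afterwards, as described in the steps above.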

Additional Information

The NSX installation workflow depends on successful local user authentication on the host. A missing user in the lockdown exception list, or another discrepancy in the Security Profiles applied to the host, can therefore cause the error "Failed to send HostConfig RPC to MPA TN:<Transport_Node_UUID>. Error: Unable to reach client <Transport_Node_UUID>, application SwitchingVertical."

Related KB: VMs unable to migrate on to NSX upgraded ESXi hosts by DRS/Manually