Hosts in NSX keep flapping between Success and Install Failed

Article ID: 425174

Products

VMware NSX

Issue/Introduction

The status of the hosts in the NSX UI keeps flapping between "Success" and "Install Failed".

  • The following error messages may be seen on the ESX hosts:
      • Transport Node Errors
        Host configuration: Failed to send the HostConfig message. [TN=TransportNode/#######-####-####-####-cb09512d80f9]. Reason: Failed to send HostConfig RPC to MPA TN:#######-####-####-####-cb09512d80f9. Error: Unable to reach client #######-####-####-####-cb09512d80f9, application SwitchingVertical. LogicalSwitch full-sync: LogicalSwitch full-sync realization query skipped.
      • “Failed to get response from NSX-SFHC component.”
      • Failed to install software on host. Solution apply failed because the vSphere Lifecycle image contains either a new ESXi version or a new addons version or new components. Please proceed to the vSphere Client Lifecycle Manager to update ESXi or addons or components along with the solution 'com.vmware.nsxt' Solution apply failed on host: '<HOST-NAME>'. The vSphere Lifecycle Manager image contains a new 'Cisco-UCS-Addon-ESXi' addon version '4.3.6-a'. Please proceed to the vSphere Client Lifecycle Manager to update the addon and the solution 'com.vmware.nsxt' Solution apply failed on host: '<HOST-NAME>'
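
To tell a genuine flap from a one-off install failure, it can help to sample the host status over time and count how often it changes. The following is a minimal Python sketch under stated assumptions: the status samples have already been collected by some means not covered in this article, the status strings are illustrative placeholders, and the threshold of 3 transitions is an arbitrary choice rather than a documented NSX value.

```python
def count_transitions(states):
    """Count how many times consecutive status samples differ."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def is_flapping(states, threshold=3):
    """Treat a host as flapping once its status has changed at least
    `threshold` times in the sampled window (threshold is an assumption)."""
    return count_transitions(states) >= threshold


# Hypothetical samples taken at regular intervals for one host.
samples = ["Success", "Install Failed", "Success", "Install Failed", "Success"]

if is_flapping(samples):
    print("Host status is flapping:", count_transitions(samples), "transitions")
else:
    print("Host status looks stable")
```

A host that failed once and stayed failed produces a single transition and is not reported as flapping, which separates this issue from an ordinary install failure.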

Cause

  • Certificates had expired on one Local NSX Manager.
  • As a result, the host kept alternating between the managers it synchronizes with.
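
Expired certificates can be spotted by comparing each certificate's expiry timestamp against the current time. The sketch below is illustrative only: it assumes a certificate listing has already been retrieved as JSON (for example from the NSX Manager trust-management certificates API) and that it has the shape `results[].details[].not_after` with the expiry in epoch milliseconds. The sample payload, IDs, and display names are hypothetical, and the field names should be verified against the API reference for your NSX version. The CARR script referenced in the Resolution remains the supported way to analyze and repair certificates.

```python
from datetime import datetime, timezone


def expired_certificates(cert_list, now=None):
    """Return (name, expiry) pairs for certificates whose not_after is in the past.

    `cert_list` is assumed to look like {"results": [{"display_name": ...,
    "details": [{"not_after": <epoch milliseconds>}]}]}.
    """
    now = now or datetime.now(timezone.utc)
    expired = []
    for cert in cert_list.get("results", []):
        for detail in cert.get("details", []):
            not_after_ms = detail.get("not_after")
            if not_after_ms is None:
                continue  # no expiry recorded for this entry
            expiry = datetime.fromtimestamp(not_after_ms / 1000, tz=timezone.utc)
            if expiry < now:
                name = cert.get("display_name") or cert.get("id")
                expired.append((name, expiry))
                break  # report each certificate once
    return expired


# Hypothetical payload: one certificate expired in 2023, one valid until 2100.
sample = {
    "results": [
        {"id": "cert-1", "display_name": "example-expired",
         "details": [{"not_after": 1700000000000}]},
        {"id": "cert-2", "display_name": "example-valid",
         "details": [{"not_after": 4102444800000}]},
    ]
}

for name, expiry in expired_certificates(sample):
    print(f"{name} expired on {expiry.isoformat()}")
```

Passing an explicit `now` makes the check repeatable, which is useful when comparing results from both sides of the federation.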

Resolution

  1. Run the CARR script as described in the KB article "Using Certificate Analyzer, Results and Recovery (CARR) Script to fix certificate related issues in NSX".
      • This must be done on both sides of the NSX Federation.
  2. Once the CARR script has finished, perform the steps listed in the "Resolution" section of the following KB article on the NSX Global Manager: "Error Code: 500016 when trying to view group members or checking the DFW rule status on the Global Manager (Error: I/O error)".

Once the certificates have been renewed, the host status should be stable. At this stage, only the vSphere Lifecycle Manager errors should remain.