NSX-T host upgrade fails with nsxt-vsip: Consumed resource count of module is not zero

Article ID: 324205

Products

VMware NSX

Issue/Introduction

Symptoms:

  • NSX-T ESXi Host upgrades fail while upgrading from NSX-T Data Center versions earlier than 3.0.0.
  • The Upgrade Coordinator displays a long error string that contains the following exception:
  "KernelModulesException: Failed to unload module nsxt-vsip: Cannot remove module nsxt-vsip: Consumed resource count of module is not zero"
  • The ESXi /var/log/vmkernel.log contains messages similar to the following example, in which the fqdn field is non-zero:
2019-12-12T15:49:33.510Z cpu30:22424799)ExportStateTLV total: 81423, tables: 11512/11496/32, rules: 65206/65190/3, states: 1551/1535/3, attrs: 74/58/1, algs: 804/788/20, algports: 136/120/2, fqdn: 250/234/2, miscs: 1874/1858/5
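
When checking several hosts, the fqdn counters can be pulled out of such a line with a short script. The following is a minimal sketch, not a VMware tool: the function names are hypothetical, and because the article does not say which of the three fqdn sub-values matters, the sketch assumes any non-zero sub-value is worth flagging.

```python
import re

# Matches "name: a/b/c" counters in an ExportStateTLV line from vmkernel.log.
# "total: 81423" has no slashes and is intentionally not matched.
COUNTER_RE = re.compile(r"(\w+): (\d+)/(\d+)/(\d+)")


def parse_export_state(line):
    """Return {counter_name: (a, b, c)} for an ExportStateTLV line, else None."""
    if "ExportStateTLV" not in line:
        return None
    return {
        m.group(1): tuple(int(x) for x in m.group(2, 3, 4))
        for m in COUNTER_RE.finditer(line)
    }


def fqdn_nonzero(line):
    """True if the line carries an fqdn counter with any non-zero sub-value.

    Assumption: the meaning of the three sub-values is not given in the
    article, so all three are checked.
    """
    counters = parse_export_state(line)
    if not counters or "fqdn" not in counters:
        return False
    return any(value != 0 for value in counters["fqdn"])
```

Feeding each line of /var/log/vmkernel.log to fqdn_nonzero() gives a quick indication of whether a host shows this symptom before the upgrade is attempted.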



Environment

VMware NSX-T Data Center 2.x
VMware NSX-T Data Center

Cause

This issue is caused by a memory leak that occurs during vMotion and is associated with the DFW fqdn data structure.
As a result, memory remains allocated to the vsip module even after the ESXi host has entered maintenance mode, which prevents the older version of the NSX-T software from being removed.

Resolution

This issue is resolved in NSX-T Data Center 3.0.
Upgrades from NSX-T Data Center 3.0 and higher will not experience this issue.

Workaround:

  • Upgrades from any NSX-T release that does not have the fix can be impacted by this issue.

ESXi hosts can be configured to reboot automatically as part of the upgrade process to avoid the failure:

  • Find the group ID for the cluster using GET api/v1/upgrade/upgrade-unit-groups?component_type=HOST or from the UI.
  • Retrieve the group with GET api/v1/upgrade/upgrade-unit-groups/<group id>
  • In the response, update the extended_configuration value so that it contains {"key" : "rebootless_upgrade", "value" : "false"}.
  • PUT the modified payload to api/v1/upgrade/upgrade-unit-groups/<group id>
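
The GET, modify, PUT sequence above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not an official client: the helper names, the base_url and auth_header parameters, and the treatment of extended_configuration as a list of key/value entries are assumptions, and the sketch expects the NSX Manager certificate to be trusted by the machine running it.

```python
import copy
import json
import urllib.request


def set_rebootless_upgrade(group, enabled=False):
    """Return a copy of an upgrade-unit-group payload with rebootless_upgrade set.

    Assumption: extended_configuration is a list of {"key", "value"} entries.
    """
    updated = copy.deepcopy(group)
    value = "true" if enabled else "false"
    config = updated.setdefault("extended_configuration", [])
    for entry in config:
        if entry.get("key") == "rebootless_upgrade":
            entry["value"] = value
            break
    else:
        # Key not present yet: add it.
        config.append({"key": "rebootless_upgrade", "value": value})
    return updated


def nsx_request(base_url, path, auth_header, method="GET", body=None):
    """Send one JSON request to the NSX Manager (hypothetical helper)."""
    data = json.dumps(body).encode() if body is not None else None
    request = urllib.request.Request(
        f"{base_url}/{path}",
        data=data,
        method=method,
        headers={"Authorization": auth_header, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def disable_rebootless_upgrade(base_url, group_id, auth_header):
    """GET the group, set rebootless_upgrade to false, and PUT it back."""
    path = f"api/v1/upgrade/upgrade-unit-groups/{group_id}"
    group = nsx_request(base_url, path, auth_header)
    return nsx_request(base_url, path, auth_header, "PUT", set_rebootless_upgrade(group))
```

Keeping the payload change in set_rebootless_upgrade(), separate from the HTTP calls, lets the modification be checked against a saved GET response before anything is sent to the manager.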

 
OR

For hosts that have already failed:

  • Confirm that the ESXi host is in vSphere maintenance mode, then reboot it to clear the stale dvfilters.
  • After the reboot, the ESXi host should remain in vSphere maintenance mode.
  • Check whether the ESXi host is in NSX Maintenance Mode under:
    • System > Fabric > Nodes > Host Transport Nodes
  • If it is, select the ESXi host and, from Actions, click "Exit Maintenance Mode".
  • On the Upgrade Coordinator, click Reset to clear the error.
  • Restart the ESXi host upgrade.
  • The Upgrade Coordinator now retries the problematic host and allows the upgrade to proceed.