NSX-T host upgrade fails with nsxt-vsip: Consumed resource count of module is not zero

Article ID: 324205

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • NSX-T ESXi host upgrade fails when upgrading from NSX-T Data Center versions earlier than 3.0.0.
  • Upgrade Coordinator displays a long error string that contains the following exception:
  "KernelModulesException: Failed to unload module nsxt-vsip: Cannot remove module nsxt-vsip: Consumed resource count of module is not zero"
  • The ESXi /var/log/vmkernel.log contains messages similar to the following example, in which the fqdn field is non-zero:
2019-12-12T15:49:33.510Z cpu30:22424799)ExportStateTLV total: 81423, tables: 11512/11496/32, rules: 65206/65190/3, states: 1551/1535/3, attrs: 74/58/1, algs: 804/788/20, algports: 136/120/2, fqdn: 250/234/2, miscs: 1874/1858/5
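
When checking several hosts, the fqdn field can be pulled out of such a line programmatically. The following is a minimal sketch in Python, not an official tool. The function names are hypothetical, and the article does not document what each of the three slash-separated numbers means, so the sketch only reports them and tests whether any is non-zero.

```python
import re

# Matches fields of the form "name: a/b/c"; the plain "total: N" field is skipped.
_FIELD = re.compile(r"(\w+): (\d+)/(\d+)/(\d+)")

# Example line taken from this article.
SAMPLE = (
    "2019-12-12T15:49:33.510Z cpu30:22424799)ExportStateTLV total: 81423, "
    "tables: 11512/11496/32, rules: 65206/65190/3, states: 1551/1535/3, "
    "attrs: 74/58/1, algs: 804/788/20, algports: 136/120/2, "
    "fqdn: 250/234/2, miscs: 1874/1858/5"
)


def parse_export_state(line: str) -> dict:
    """Return {field: (a, b, c)} for an ExportStateTLV line, or {} if it is not one."""
    _, marker, tail = line.partition("ExportStateTLV")
    if not marker:
        return {}
    return {
        name: (int(a), int(b), int(c))
        for name, a, b, c in _FIELD.findall(tail)
    }


def fqdn_is_nonzero(line: str) -> bool:
    """True if the line carries an fqdn field with any non-zero number."""
    return any(value != 0 for value in parse_export_state(line).get("fqdn", ()))


if __name__ == "__main__":
    print(parse_export_state(SAMPLE)["fqdn"])
    print(fqdn_is_nonzero(SAMPLE))
```

Running each ExportStateTLV line from vmkernel.log through fqdn_is_nonzero gives a quick indication of whether a host matches the symptom described above.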


Environment

VMware NSX-T Data Center 2.x
VMware NSX-T Data Center

Cause

This issue is caused by a memory leak that occurs during vMotion and is associated with the DFW FQDN data structure.
As a result, memory remains allocated to the vsip module even after the ESXi host has entered maintenance mode, which prevents the older version of the NSX-T software from being removed.

Resolution

This issue is resolved in NSX-T Data Center 3.0.
Upgrades from NSX-T Data Center 3.0 and higher will not experience this issue.

Workaround:
Upgrades from any NSX-T release that does not have the fix can be impacted by this issue.

ESXi hosts can be configured to reboot automatically as part of the upgrade process to avoid failures:
  • Find the group ID for the cluster using GET api/v1/upgrade/upgrade-unit-groups?component_type=HOST, or from the UI.
  • Retrieve the group with GET api/v1/upgrade/upgrade-unit-groups/<group id>
  • In the response, change the extended_configuration value by setting {"key" : "rebootless_upgrade", "value" : "false"}.
  • PUT the modified payload to api/v1/upgrade/upgrade-unit-groups/<group id>
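
The payload change in the steps above can be sketched in Python as follows. This is an illustration, not part of the NSX-T tooling. It assumes that extended_configuration is a list of key/value objects, as the snippet above suggests. The function name and the example group contents are hypothetical, and the HTTP GET and PUT calls are left out.

```python
import copy
import json


def disable_rebootless_upgrade(group: dict) -> dict:
    """Return a copy of an upgrade-unit-group payload with rebootless_upgrade set to "false".

    Assumes extended_configuration is a list of {"key": ..., "value": ...} objects.
    """
    updated = copy.deepcopy(group)  # leave the caller's GET response untouched
    config = updated.setdefault("extended_configuration", [])
    for entry in config:
        if entry.get("key") == "rebootless_upgrade":
            entry["value"] = "false"
            break
    else:
        # Key not present in the response: add it.
        config.append({"key": "rebootless_upgrade", "value": "false"})
    return updated


if __name__ == "__main__":
    # Hypothetical, trimmed-down response from the GET step.
    group = {
        "id": "example-group-id",
        "type": "HOST",
        "extended_configuration": [
            {"key": "rebootless_upgrade", "value": "true"},
        ],
    }
    # The printed JSON is what would be sent in the PUT step.
    print(json.dumps(disable_rebootless_upgrade(group), indent=2))
```

Every other field returned by the GET request is preserved, so the result can be sent back unchanged apart from the rebootless_upgrade entry.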
 
or

For hosts that have already failed:
  • Confirm the host is in vSphere maintenance mode and reboot it to clear the stale dvfilters.
  • After the reboot, the ESXi host should remain in vSphere maintenance mode.
  • Check whether the host is in NSX Maintenance Mode under
    • System > Fabric > Nodes > Host Transport Nodes
  • If it is, select the ESXi host and, from Actions, click "Exit Maintenance Mode".
  • On the Upgrade Coordinator, click Reset to clear the error.
  • Restart the host upgrade. Upgrade Coordinator retries the failed host, and the upgrade can then proceed.