NSX-T ESXi host upgrade fails when upgrading from 2.4.0/2.4.1 versions

Article ID: 324229

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • NSX-T ESXi host upgrade fails when upgrading from NSX-T Data Center 2.4.0 or 2.4.1
  • The NSX-T DFW is configured with a rule of Service Type TCP/UDP/ALG for any of the following ports:
   TCP 21   - ALG FTP
   TCP 1521 - ALG ORACLE_TNS
   TCP 111  - ALG SUN_RPC_TCP
   TCP 135  - ALG MS_RPC_TCP
   UDP 69   - ALG TFTP
  • Upgrade Coordinator displays a long error string that contains the following exception:
   "KernelModulesException: Failed to unload module nsxt-vsip: Cannot remove module nsxt-vsip: Consumed resource count of module is not zero"
  • The ESXi /var/log/vmkernel.log contains the following messages:
2020-01-25T12:22:27.020Z cpu28:98022942)Destroying solution lock.
2020-01-25T12:22:27.020Z cpu28:98022942)Unregistering char device
2020-01-25T12:22:27.024Z cpu28:98022942)WARNING: Heap: 2734: Non-empty heap (vsip-state) being destroyed (avail is 38203008, should be 38203216).
2020-01-25T12:22:27.106Z cpu28:98022942)ALERT: Mod: 5212: Failed to unload module nsxt-vsip, since its consumed resource count is 1. Waiting...
2020-01-25T12:22:32.124Z cpu28:98022942)ALERT: Mod: 5241: Failed to unload module nsxt-vsip, since its consumed resource count is 1. Giving up.
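
The log signature above can be checked for mechanically. The following is a minimal sketch, assuming the vmkernel.log lines have already been read into a list of strings; the regular expression and the function names are illustrative and are not part of any NSX-T or ESXi tooling.

```python
import re

# Pattern derived from the vmkernel.log excerpt above (illustrative only).
UNLOAD_FAILURE = re.compile(
    r"ALERT: Mod: \d+: Failed to unload module nsxt-vsip, "
    r"since its consumed resource count is (\d+)\. (Waiting\.\.\.|Giving up\.)"
)


def find_unload_failures(lines):
    """Return (timestamp, resource_count, outcome) for each matching log line."""
    hits = []
    for line in lines:
        match = UNLOAD_FAILURE.search(line)
        if match:
            # The timestamp is the first whitespace-delimited field of the line.
            timestamp = line.split(" ", 1)[0]
            hits.append((timestamp, int(match.group(1)), match.group(2)))
    return hits


def unload_gave_up(lines):
    """True if the kernel stopped retrying the nsxt-vsip unload."""
    return any(outcome == "Giving up." for _, _, outcome in find_unload_failures(lines))
```

Called with the lines of /var/log/vmkernel.log, `unload_gave_up` returning True suggests the host has hit the condition described in this article.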


Environment

VMware NSX-T Data Center
VMware NSX-T Data Center 2.x

Cause

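The upgrade fails because the nsxt-vsip module from 2.4.0/2.4.1 cannot be unloaded and uninstalled from the host.
This is caused by a memory heap issue in the ALG component of the DFW: the vsip-state heap is not empty when the module is destroyed, so the module's consumed resource count stays above zero.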
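
The two conditions described in this article (an affected source version and a DFW rule on one of the listed ALG ports) can be combined into a simple pre-upgrade check. This is a hedged sketch only: the input shape (a version string and a list of `(protocol, port)` pairs) is hypothetical and does not reflect the NSX-T API, and the function names are made up for illustration.

```python
# Affected source versions and ALG ports, as listed in this article.
AFFECTED_VERSIONS = {(2, 4, 0), (2, 4, 1)}
ALG_PORTS = {
    ("TCP", 21): "FTP",
    ("TCP", 1521): "ORACLE_TNS",
    ("TCP", 111): "SUN_RPC_TCP",
    ("TCP", 135): "MS_RPC_TCP",
    ("UDP", 69): "TFTP",
}


def is_affected_version(version):
    """Compare the first three dotted fields of the version against 2.4.0/2.4.1."""
    parts = tuple(int(p) for p in version.split(".")[:3])
    return parts in AFFECTED_VERSIONS


def matching_alg_services(rules):
    """Return the ALG names for any (protocol, port) pair in the list above.

    `rules` is a hypothetical, pre-extracted list of (protocol, port) tuples.
    """
    return [
        ALG_PORTS[(proto.upper(), port)]
        for proto, port in rules
        if (proto.upper(), port) in ALG_PORTS
    ]


def upgrade_at_risk(version, rules):
    """True when both conditions described in this article hold."""
    return is_affected_version(version) and bool(matching_alg_services(rules))
```

For example, `upgrade_at_risk("2.4.1", [("TCP", 1521)])` evaluates to True, while the same rule set on a 2.4.2 host does not.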

Resolution

This issue is resolved in:
  • VMware NSX-T Data Center 2.4.2, available at VMware Downloads.
  • VMware NSX-T Data Center 2.5, available at VMware Downloads.
  • VMware NSX-T Data Center 3.0, available at VMware Downloads.

Workaround:
To work around this issue after a host upgrade has failed:
  • Reboot the ESXi host. This resolves the error condition on the host.
  • In the NSX-T UI, go to System -> Fabric -> Nodes -> Host Transport Node and ensure the host is not in maintenance mode.
  • In the Upgrade Coordinator, reset the host upgrade error and restart the host upgrade.