Host ESXi networking missing after host upgraded to 7.0 when NSX-T VIBS were removed manually

Article ID: 337062

Products

VMware NSX
VMware vSphere ESXi

Issue/Introduction

After an ESXi host is upgraded to ESXi 7.0 and rebooted, all host networking is missing and the DCUI screen is blank.

In Customize System (F2), all host networking options are grayed out.

After enabling the ESXi Shell and logging in as root, networking commands fail:

  net-dvs -l                              returns  "failed to get config data: Not Initialized"
  esxcli network ip interface list        returns  "Unable to get node: Not found"
  esxcfg-vmknic -l                        returns nothing

NOTE: net-dvs -l does not fail every time; it sometimes runs successfully.
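
For scripted checks, the error strings above can be detected in captured command output. The following is a minimal sketch and not part of the documented procedure: the function name has_network_init_failure is hypothetical, and it only looks for the two error strings quoted above.

```shell
# has_network_init_failure (hypothetical helper):
# reads command output on stdin and succeeds (exit 0) if it contains
# either of the two error strings quoted in this article.
has_network_init_failure() {
    grep -q -e 'Not Initialized' -e 'Unable to get node: Not found'
}

# Example on the host (not executed here):
#   esxcli network ip interface list 2>&1 | has_network_init_failure \
#       && echo "host networking is not initialized"
```

Because net-dvs sometimes runs successfully on an affected host, a check against more than one of the commands is likely to be more reliable than a single check.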
 


If the N-VDS was removed manually, you may receive the error:

Configuration Upgrade Failure
Please reboot to rollback to the older version
Failed Modules:
/usr/lib/vmware/configmanager/upgrade/lib/libupgradeddvsconfig.so





Environment

VMware NSX-T
VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.0

Cause

The host was previously used for NSX-T, and the NSX-T VIBs were removed manually with esxcli software vib remove -n <vibs name>.
This is an incorrect uninstall method and leaves stale objects on the host, including the N-VDS.
The stale N-VDS prevents host networking from initializing and must be removed.

Resolution

This is a known issue. Use the workaround described below.

Workaround:

Either reinstall the ESXi host from scratch, or create a new vSwitch and management VMkernel interface.

On rare occasions, creating a new vSwitch does not work.

See the KB article "Configuring vSwitch or vNetwork Distributed Switch from the command line in ESXi/ESX" for the steps to create the vSwitch from the CLI.
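
As a rough illustration of the vSwitch workaround, the sketch below prints (or, with DRY_RUN=0, runs) a typical esxcli sequence for a standard vSwitch with a management VMkernel interface. It is an assumption-laden example, not the procedure from the referenced KB: the function names run and create_mgmt_network are hypothetical, and the switch name vSwitch0, port group name "Management Network", interface vmk0, uplink, and IP values are all placeholders to adapt to the host.

```shell
# run (hypothetical helper): print the command in dry-run mode (the default),
# execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# create_mgmt_network UPLINK IP NETMASK (hypothetical helper)
create_mgmt_network() {
    uplink="$1"; ip="$2"; mask="$3"
    vswitch="vSwitch0"            # assumed standard vSwitch name
    pg="Management Network"       # assumed port group name
    vmk="vmk0"                    # assumed VMkernel interface name

    # Create the standard vSwitch and attach the physical uplink
    run esxcli network vswitch standard add --vswitch-name="$vswitch"
    run esxcli network vswitch standard uplink add --uplink-name="$uplink" --vswitch-name="$vswitch"
    # Create the port group and the VMkernel interface on it
    run esxcli network vswitch standard portgroup add --portgroup-name="$pg" --vswitch-name="$vswitch"
    run esxcli network ip interface add --interface-name="$vmk" --portgroup-name="$pg"
    # Assign a static IPv4 address to the VMkernel interface
    run esxcli network ip interface ipv4 set --interface-name="$vmk" --ipv4="$ip" --netmask="$mask" --type=static
}

# Example (prints the commands only):
#   create_mgmt_network vmnic0 192.0.2.10 255.255.255.0
```

The dry-run default is a deliberate choice so the commands can be reviewed before anything changes on the host. The sketch does not configure a default gateway or VLAN, and it may fail on hosts where creating a new vSwitch does not work.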


Process to remove the stale N-VDS before upgrading

The process below has been confirmed in a Broadcom lab environment to be non-disruptive.

However, Broadcom Support always recommends that customers make changes of this nature during a maintenance window.

Additionally, Broadcom recommends that customers always take full backups before any maintenance window.

These steps should not be executed without Broadcom Support actively engaged. To engage Broadcom Support, follow "Creating and managing Broadcom cases".
 

1. SSH into each ESXi host

2.  Search for a stale N-VDS by checking for the name of the orphaned NSX DVS

net-dvs -l | less
 
Note: You may have to page down many times to reach the NSX DVS switch details.

Look for output like the following. This example shows the NSX DVS configuration as seen in support bundles.

-----SNIP----
switch cb cf 7b 58 37 50 43 b2-8b f1 95 xx xx xx xx xx (vswitch)
        max ports: 10752
        global properties:
                com.vmware.common.opaqueDvs = true,    propType = CONFIG
                com.vmware.common.alias = nvds-overlay-acdc,   propType = CONFIG
                com.vmware.common.uplinkPorts:
                        uplink-1, uplink-2
                        propType = CONFIG
                com.vmware.common.portset.mtu = 9000 ,  propType = CONFIG
                com.vmware.etherswitch.cdp = LLDP, listen
                        propType = CONFIG
                com.vmware.common.respools.version = version3 ,         propType = CONFIG
---END SNIP---
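
Rather than paging through the full listing, the alias can be filtered out of the output. The sketch below is an assumption based only on the sample output above: the function name find_opaque_dvs_aliases is hypothetical, and it assumes each switch block starts with a line beginning "switch " and that the opaqueDvs property appears before the alias property, as in the sample.

```shell
# find_opaque_dvs_aliases (hypothetical helper):
# reads "net-dvs -l" style output on stdin and prints the alias of every
# switch block that carries com.vmware.common.opaqueDvs = true.
find_opaque_dvs_aliases() {
    awk '
        /^switch / { opaque = 0 }                                  # new switch block
        /com\.vmware\.common\.opaqueDvs = true/ { opaque = 1 }     # NSX opaque switch
        opaque && /com\.vmware\.common\.alias = / {
            sub(/.*com\.vmware\.common\.alias = /, "")             # drop the key
            sub(/[ ,].*/, "")                                      # drop ", propType = ..."
            print
        }
    '
}

# Example on the host (not executed here):
#   net-dvs -l | find_opaque_dvs_aliases
```

Confirm the result against the full net-dvs -l output before using the alias in the removal step.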
 

3. Remove the orphaned NSX DVS, using the alias found in step 2 (nvds-overlay-acdc in this example)

net-dvs -d nvds-overlay-acdc
net-dvs --persist

Note: Do not skip net-dvs --persist. Without it, the stale N-VDS reappears after a reboot.
 

4.  Check that the orphaned NSX DVS is now removed

net-dvs -l | grep -i nsx
 

5. Remove stale entries from /etc/vmware/esx.conf
    a. Change to /etc/vmware
        cd /etc/vmware
    b. Back up the config file
        cp esx.conf esx.conf.bkup
    c. Edit the config file
        vi esx.conf
    d. Find the stale config lines and delete them by pressing "dd". Search for
        /net/dvswitch/child[xxxx]/dvsClassName = "vswitch"
       NOTE: xxxx will be a four-digit number.
        i. Remove all the lines that have this same number
        ii. Sample lines (xxxx is 0001 here). ALL of these lines need to be removed
            /net/dvswitch/child[0001]/uplinks/child[0000]/connectionId = "0"
            /net/dvswitch/child[0001]/uplinks/child[0000]/dvpId = "up1"
            /net/dvswitch/child[0001]/uplinks/child[0000]/pnic = "vmnic8"
            /net/dvswitch/child[0001]/uplinks/child[0001]/pnic = "vmnic9"
            /net/dvswitch/child[0001]/uplinks/child[0001]/dvpId = "up2"
            /net/dvswitch/child[0001]/uplinks/child[0001]/connectionId = "0"
            /net/dvswitch/child[0001]/name = "DvsPortset-1"
            /net/dvswitch/child[0001]/numPorts = "64"
            /net/dvswitch/child[0001]/dvsClassName = "vswitch"
        iii. Save the file. ":x" or ":wq" will save the file in vi
        iv. Run /sbin/auto-backup.sh to save the changes to the bootbank

6. Repeat steps 2 through 5 on all other hosts in the cluster
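
The esx.conf search in step 5 can be previewed from the shell before editing the file in vi. The sketch below is illustrative only and does not replace the manual edit: the three function names are hypothetical, nothing in it modifies esx.conf, and the index it reports should be confirmed as belonging to the stale N-VDS before any line is removed.

```shell
# find_dvs_child_indexes CONF (hypothetical helper):
# print the index xxxx of every /net/dvswitch/child[xxxx] entry whose
# dvsClassName is "vswitch".
find_dvs_child_indexes() {
    sed -n 's|.*/net/dvswitch/child\[\([0-9]*\)]/dvsClassName = "vswitch".*|\1|p' "$1"
}

# list_dvs_child_lines CONF INDEX (hypothetical helper):
# print every line that belongs to the given dvswitch child index.
list_dvs_child_lines() {
    grep -F "/net/dvswitch/child[$2]/" "$1"
}

# strip_dvs_child_lines CONF INDEX (hypothetical helper):
# print CONF without those lines. Output goes to stdout; CONF is not changed.
strip_dvs_child_lines() {
    grep -v -F "/net/dvswitch/child[$2]/" "$1"
}

# Example on the host, run against the backup copy (not executed here):
#   find_dvs_child_indexes /etc/vmware/esx.conf.bkup
#   list_dvs_child_lines   /etc/vmware/esx.conf.bkup 0001
```

Matching on the fixed string /net/dvswitch/child[xxxx]/ keeps nested uplinks/child[...] entries of other switches out of the result, even when they reuse the same number.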



Additional Information

Impact/Risks:

Host ESXi networking is missing after host upgrade