NSX VIBs were uninstalled using ESXCLI causing network outages

Article ID: 394809

Products

VMware vSphere ESXi

VMware NSX

Issue/Introduction

  • The NSX VIBs on an ESXi host were uninstalled with esxcli instead of the recommended "nsxcli -c del nsx".
  • Example of the esxcli command used:
    esxcli software vib remove -n <nsx-vib-name>
  • If the management interface is on a vDS, using this uninstallation command can cause:
    • Network disruption
    • Loss of connectivity to the ESXi management interface
  • Attempts to restore the management network on the ESXi host may fail because the VMkernel interface (vmk) cannot be assigned correctly.
  • The net-dvs command might show an error:
    "failed to get config data: not initialized"
  • As a result, the standard process to restore management network connectivity fails.
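The check for this error can be scripted. The following is a minimal sketch, assuming the error string appears in the net-dvs output exactly as quoted above; the function name dvs_not_initialized is hypothetical and not part of any VMware tooling.

```shell
#!/bin/sh
# Hypothetical helper: reads command output on stdin and succeeds (exit 0)
# if it contains the net-dvs error quoted in this article.
dvs_not_initialized() {
    grep -q "failed to get config data: not initialized"
}

# Example usage on the ESXi host (commented out so the sketch is safe to source).
# stderr is merged into stdout because it is an assumption which stream
# net-dvs uses for this message.
# if net-dvs 2>&1 | dvs_not_initialized; then
#     echo "DVS data is not initialized; the workaround in this article applies."
# fi
```

The function only inspects text, so it can be tried against saved output before being used on a host.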

Environment

VMware ESXi 7.0

VMware ESXi 8.0

VMware NSX

Cause

This issue is caused by using an incorrect method to remove the NSX VIBs.

Resolution

Attention

  • The following workaround requires a reboot of the ESXi host.

  • Best performed with the host in maintenance mode and when the cluster (vSAN or storage array) can tolerate one node being down.

Pre-steps:

  • Remove LACP configuration from the uplink switch ports that will be assigned to the standard switch (ESXi standard switch does not support LACP).

Workaround:

  1. Access the host console/DCUI.

  2. Switch to Shell: Press Alt+F2.

  3. Log in as the root user.

  4. Run the following command:
    rm -f /etc/vmware/dvsdata.db

  5. Reboot the host.

  6. Log in again to Shell as the root user through the host console.

  7. Run the net-dvs command.

    • If the error message no longer appears, continue with the steps below.
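As a cautious variant of step 4, the file could be copied aside before it is removed. This is a sketch only: the backup step is an assumption and not part of the original procedure, and the function name remove_dvsdata_with_backup is hypothetical. The backup directory is a required argument so that a persistent location (for example, a datastore folder) can be chosen, since a copy kept in a temporary location may not survive the reboot in step 5.

```shell
#!/bin/sh
# Hypothetical helper: copies the DVS data file aside, then removes it.
# The backup is an added precaution (an assumption), not a documented step.
# $1 = path to the DVS data file (the article uses /etc/vmware/dvsdata.db)
# $2 = existing directory that will receive the backup copy
remove_dvsdata_with_backup() {
    if [ "$#" -ne 2 ]; then
        echo "usage: remove_dvsdata_with_backup <db-path> <backup-dir>" >&2
        return 1
    fi
    db="$1"
    backup_dir="$2"

    if [ ! -f "$db" ]; then
        echo "No file at $db; nothing to remove."
        return 0
    fi

    # Keep a copy for later inspection; stop if the copy fails.
    cp "$db" "$backup_dir/dvsdata.db.bak" || return 1
    rm -f "$db"
    echo "Removed $db (backup: $backup_dir/dvsdata.db.bak)"
}

# Example usage (commented out; <datastore> is a placeholder):
# remove_dvsdata_with_backup /etc/vmware/dvsdata.db /vmfs/volumes/<datastore>
```

The file is removed only after the copy succeeds, so a failed backup leaves the host unchanged.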

Restoring the ESXi Management Network on a Standard Switch:

  1. Run the following command to add a new standard switch:
    esxcli network vswitch standard add --vswitch-name=vSwitch0

  2. Run esxcli network nic list to view the available vmnics.
    The examples below use vmnic0.

  3. Add the uplink to the standard switch:
    esxcli network vswitch standard uplink add --uplink-name=vmnic0 --vswitch-name=vSwitch0

  4. Create a portgroup for the management network:
    esxcli network vswitch standard portgroup add --portgroup-name=management --vswitch-name=vSwitch0

  5. If the management network is on a VLAN, set the VLAN ID:
    esxcli network vswitch standard portgroup set --portgroup-name=management --vlan-id=<VLAN-ID>

  6. Remove the vmk0 interface:
    esxcli network ip interface remove --interface-name=vmk0
    (Note: vmk0 is typically used for management.)

  7. Re-add the vmk0 interface to the management portgroup:
    esxcli network ip interface add --interface-name=vmk0 --portgroup-name=management

  8. Set the static IP for vmk0:
    esxcli network ip interface ipv4 set --interface-name=vmk0 --ipv4=<ipaddress> --netmask=<netmask> --gateway=<gatewayip> --type=static

  9. If necessary, add an interface tag:
    esxcli network ip interface tag add --interface-name=vmk0 --tagname=Management

  10. Test management interface connectivity by pinging from and to another ESXi host.

    • If pings from this host to the other ESXi host succeed but pings in the reverse direction fail, check the default route:

      • List existing routes: esxcli network ip route ipv4 list

      • Add default route if necessary: esxcfg-route -a default <default-gateway-ip>
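Because these commands are typed at the host console, it can help to review them as a complete sequence first. The following is a minimal dry-run sketch that only prints the commands from steps 1 and 3 through 9 with the values filled in; it does not execute anything. The function name print_mgmt_restore_cmds and the example addresses are hypothetical; the switch, portgroup, and interface names are the ones used in the steps above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the esxcli commands from the steps above
# without executing them, so they can be reviewed before being run.
# Usage: print_mgmt_restore_cmds <vmnic> <ip> <netmask> <gateway> [vlan-id]
print_mgmt_restore_cmds() {
    if [ "$#" -lt 4 ]; then
        echo "usage: print_mgmt_restore_cmds <vmnic> <ip> <netmask> <gateway> [vlan-id]" >&2
        return 1
    fi
    nic="$1"; ip="$2"; mask="$3"; gw="$4"; vlan="${5:-}"
    vs="vSwitch0"; pg="management"; vmk="vmk0"

    echo "esxcli network vswitch standard add --vswitch-name=$vs"
    echo "esxcli network vswitch standard uplink add --uplink-name=$nic --vswitch-name=$vs"
    echo "esxcli network vswitch standard portgroup add --portgroup-name=$pg --vswitch-name=$vs"
    # The VLAN step is printed only when a VLAN ID is supplied.
    if [ -n "$vlan" ]; then
        echo "esxcli network vswitch standard portgroup set --portgroup-name=$pg --vlan-id=$vlan"
    fi
    echo "esxcli network ip interface remove --interface-name=$vmk"
    echo "esxcli network ip interface add --interface-name=$vmk --portgroup-name=$pg"
    echo "esxcli network ip interface ipv4 set --interface-name=$vmk --ipv4=$ip --netmask=$mask --gateway=$gw --type=static"
    # The tag step is listed as "if necessary" above; it is always printed here.
    echo "esxcli network ip interface tag add --interface-name=$vmk --tagname=Management"
}

# Example with placeholder addresses (documentation range, not from this article):
# print_mgmt_restore_cmds vmnic0 192.0.2.10 255.255.255.0 192.0.2.1 100
```

Printing rather than executing is a deliberate choice in this sketch: removing and re-adding vmk0 is disruptive, so each line should be checked against the environment before it is run.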

Clean-up:

  1. Use the vCenter UI to locate the ESXi host.

  2. Go to Configure > Networking > Virtual Switches.

  3. Remove any stale distributed switches.

  4. Disconnect the host and remove it from inventory in vCenter.

  5. Re-add the host to the cluster.

  6. Go to Networking and add the required vDS to the host.

    • Keep at least one uplink on vSwitch0 to maintain management interface connectivity.

  7. If moving the management network back to a vDS, do the following:

    • Select the desired vDS from the host Virtual Switches view.

    • Use Migrate Networking for the selected vDS.

    • Assign the vDS portgroup used for management to vmk0.

  8. Verify management network connectivity.

  9. Re-assign the remaining uplinks from vSwitch0 to the desired vDS.
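To confirm from the host shell where vmk0 ended up after the migration, the output of esxcli network ip interface list can be inspected. The following is a minimal sketch, assuming that output is a block of indented "Key: value" lines per interface that includes "Name:", "Portgroup:", and "VDS Name:" fields; the exact layout may differ between ESXi versions, and the function name vmk_backing is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `esxcli network ip interface list` output on stdin
# and prints the "Portgroup" and "VDS Name" values for the given interface.
# Assumes a "Key: value" layout with one block of fields per interface.
# Usage: vmk_backing <interface-name>
vmk_backing() {
    awk -v want="$1" '
        # A "Name:" line starts the fields for one interface.
        $1 == "Name:" { current = $2 }
        current == want && $1 == "Portgroup:" {
            sub(/^[ \t]*Portgroup:[ \t]*/, ""); print "Portgroup=" $0
        }
        current == want && $1 == "VDS" && $2 == "Name:" {
            sub(/^[ \t]*VDS Name:[ \t]*/, ""); print "VDS=" $0
        }
    '
}

# Example usage on the ESXi host (commented out so the sketch is safe to source):
# esxcli network ip interface list | vmk_backing vmk0
```

Under these assumptions, a named portgroup with no vDS name would suggest that vmk0 is still on the standard switch, and a populated vDS name would suggest that the migration took effect.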

Post-workaround:

  • After completing the workaround, the ESXi host should be functional again.

  • Re-imaging the host is highly recommended to avoid any unexpected side effects of having removed the NSX VIBs with an incorrect method.