vCenter network connectivity lost - Recover vCenter network when connected to a Distributed Switch

Article ID: 318719

Products

VMware vCenter Server

Issue/Introduction

This guide explains how to build a temporary Standard Switch (vSS) to connect the vCenter VM for recovery from a port disconnection or network outage.
Once vCenter is back online, necessary changes can be made in the distributed switch (vDS) to reconnect vCenter. 
Most of the steps are only possible via command line.

  • Management Network only exists in a Distributed Switch
  • There are no Ephemeral Ports in the cluster
  • vCenter Server lost network connectivity after an unplanned or planned outage
  • vCenter cannot be reconnected to a Distributed Switch port group, whether it stays on the same host or is registered on a different host.
  • Unable to open the vCenter vSphere Client to make any changes to the network because vCenter is disconnected.
  • vCenter and vSAN are both on the vDS (vCenter VM on a vSAN datastore), and a network configuration change has taken the vSAN datastores offline. (In this case, standard switches have to be built for both the management and vSAN networks, with their VLANs.)
  • The following error appears when trying to modify vDS port adapter settings on any ESXi host. It can also appear when attempting to change network adapters for an ESXi host connected to a vDS with non-ephemeral ports:
Addition or reconfiguration of network adapters attached to non-ephemeral distributed virtual port groups is not supported.

Environment

VMware vSphere ESXi
VMware vCenter Server

Cause

If vCenter-to-host communication is lost, VMs cannot be reconfigured to use static (also known as non-ephemeral) port groups on the vDS, because vCenter is unavailable to give the VM a port binding.

VMware recommends configuring an ephemeral port binding dvportgroup for vCenter's management network to prevent this issue from recurring.
For more information, see: Static (non-ephemeral) or ephemeral port binding on a vSphere Distributed Switch.

Please note: If LACP is configured on the physical switch, the LACP configuration must be temporarily broken before a NIC can be removed from the vDS and made available to the vCenter VM on the standard switch (unless non-LACP NICs are available and configured to pass the traffic). Moving the NIC off the vDS without first breaking the LACP configuration can cause further connectivity issues in the environment.
This requires the network team that manages the upstream switches to change the port configuration.

Before proceeding, please also ensure access to the DCUI/iLO/iDRAC for the host where the below steps will be performed.

Resolution

  1. Remove a vmnic from the vDS that carries the vCenter VM's network/VLAN:

      1. Identify the vDS name, the vmnic to be removed, and the Port ID where that vmnic connects to the vDS.
        # esxcli network vswitch dvs vmware list

        Sample output:

           Name: vDSName
           VDS ID: ########
           Class: vswitch
           Num Ports: ####
           Used Ports: ##
           Configured Ports: ##
           MTU: 9000/1500
           CDP Status: listen
           Beacon Timeout: -/+#
           Uplinks: vmnicX, vmnicX
           VMware Branded: true
           DVPort:
                 Client: vmnicX
                 DVPortgroup ID: dvportgroup-###
                 In Use: true
                 Port ID: ##
      2. Remove the vmnic from the vDS:
        # esxcfg-vswitch -Q <vmnicX> -V <PortIDX> <vDSName>

        Example using vmnic1, Port ID 12, and vDS name ProdSwitchvDS:
        # esxcfg-vswitch -Q vmnic1 -V 12 ProdSwitchvDS
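When the vDS has many ports, picking the right Port ID out of the listing by eye is error-prone. The following is a minimal sketch of a helper that reads the `esxcli network vswitch dvs vmware list` output on standard input and prints the Port ID of the DVPort whose Client is the given vmnic. The function name `find_dvport_id` is hypothetical, and the sketch assumes the field layout shown in the sample output above (Client listed before Port ID within each DVPort block).

```shell
#!/bin/sh
# Hypothetical helper (not part of ESXi): prints the DVPort "Port ID"
# whose "Client" is the given vmnic.
# Input: output of `esxcli network vswitch dvs vmware list` on stdin.
# Assumes "Client:" appears before "Port ID:" inside each DVPort block.
find_dvport_id() {
    awk -v nic="$1" '
        $1 == "DVPort:" { client = "" }        # a new port block starts
        $1 == "Client:" { client = $2 }        # remember who owns this port
        $1 == "Port" && $2 == "ID:" && client == nic { print $3; exit }
    '
}

# Example (run on the ESXi host):
#   esxcli network vswitch dvs vmware list | find_dvport_id vmnic1
```

The printed value is what goes after `-V` in the `esxcfg-vswitch -Q` command. If nothing is printed, the vmnic is not an uplink of any vDS on this host.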
  2. Create a Standard Switch and a Portgroup, then add the vmnic to the Standard Switch:

      1. Create a Standard Switch
        # esxcli network vswitch standard add --vswitch-name=<vSwitchName>
      2. Create a Portgroup
        # esxcli network vswitch standard portgroup add --portgroup-name=<PortgroupName> --vswitch-name=<vSwitchName>
      3. Add a vmnic to the Standard Switch
        # esxcli network vswitch standard uplink add --uplink-name=<vmnic#> --vswitch-name=<vSwitchName>
      4. If the vmnics associated with the network are VLAN trunks on the physical switchport, a VLAN ID must also be applied to the corresponding standard portgroup. To set or correct the VLAN ID required for connectivity on a Standard vSwitch, run this command:
        # esxcli network vswitch standard portgroup set --portgroup-name=<PortgroupName> --vlan-id=<VLAN>
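Because these commands are typed on a console during an outage, it can help to generate them from a single set of values and review them before anything runs. The sketch below only prints the four commands; it does not execute them. The function name `print_recovery_switch_cmds` and the example names `vSwitchRecovery` and `vCenter-Recovery` are hypothetical, and the sketch assumes names without spaces.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the commands that build the
# temporary Standard Switch.
# Arguments: <vSwitchName> <PortgroupName> <vmnic> [VLAN]
# Assumes none of the names contain spaces.
print_recovery_switch_cmds() {
    rs_vswitch="$1"; rs_portgroup="$2"; rs_vmnic="$3"; rs_vlan="$4"
    echo "esxcli network vswitch standard add --vswitch-name=$rs_vswitch"
    echo "esxcli network vswitch standard portgroup add --portgroup-name=$rs_portgroup --vswitch-name=$rs_vswitch"
    echo "esxcli network vswitch standard uplink add --uplink-name=$rs_vmnic --vswitch-name=$rs_vswitch"
    # The VLAN ID is only needed when the physical switchport is a trunk.
    if [ -n "$rs_vlan" ]; then
        echo "esxcli network vswitch standard portgroup set --portgroup-name=$rs_portgroup --vlan-id=$rs_vlan"
    fi
}

# Example: review the commands first
#   print_recovery_switch_cmds vSwitchRecovery vCenter-Recovery vmnic1 100
```

Once the printed commands look right, they can be pasted into the ESXi shell one at a time so that a failure in any step is noticed before the next one runs.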


  3. Recover the vCenter virtual machine's network connectivity

      Connect the vCenter virtual machine to the new Standard Switch Portgroup. This restores network access to vCenter, lets the ESXi hosts reconnect to vCenter Server, and makes management of the infrastructure possible again.
      1. Log in to the ESXi Host Client with the admin credentials
      2. Go to "Virtual Machines"
      3. Select the vCenter virtual machine and edit its settings
      4. Assign the vCenter network adapter to the newly created Standard Switch Portgroup
      5. Click Save

        Note: At this point, vCenter's network connectivity should be restored, and its vSphere Client should be reachable. If it is still not reachable, verify that the Standard Switch Portgroup has the correct VLAN and MTU configuration.
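To check the VLAN from the ESXi shell, `esxcli network vswitch standard portgroup list` shows each standard portgroup with its VLAN ID, and `esxcli network vswitch standard list` shows the MTU of each Standard Switch. The following is a minimal sketch that extracts the VLAN ID for one portgroup from the first listing. The function name `portgroup_vlan` is hypothetical, and the sketch assumes the usual tabular layout (Name, Virtual Switch, Active Clients, VLAN ID) with a vSwitch name that contains no spaces.

```shell
#!/bin/sh
# Hypothetical helper: prints the VLAN ID of the named standard portgroup.
# Input: output of `esxcli network vswitch standard portgroup list` on stdin.
# Assumes columns: Name | Virtual Switch | Active Clients | VLAN ID,
# and that the Virtual Switch name has no spaces.
portgroup_vlan() {
    awk -v pg="$1" '
        NF >= 4 && $NF ~ /^[0-9]+$/ {          # data rows end in a number
            name = $1                          # the name may contain spaces:
            for (i = 2; i <= NF - 3; i++)      # rebuild it from all fields
                name = name " " $i             # before the last three
            if (name == pg) { print $NF; exit }
        }
    '
}

# Example (run on the ESXi host):
#   esxcli network vswitch standard portgroup list | portgroup_vlan vCenter-Recovery
```

A value of 0 means no VLAN tag is applied, which is only correct when the physical switchport is an access port or uses a native VLAN for this network.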

        Confirm that everything appears correct in the vCenter inventory, then navigate to the vDS and use the "Add and Manage Hosts" wizard to migrate vCenter and the uplink(s) back to the vDS, restoring the configuration to its state prior to the outage.

  4. Migrate the uplink/vmnic back to the original vDS.

      Restore the vmnic to the vDS by following these steps:

      1. If not already logged in to the vCenter vSphere Client, log in to vCenter with Administrator credentials.
      2. Go to the vCenter virtual machine, right-click it, and select "Edit Settings"
      3. Connect Network Adapter 1 to the Management Distributed Switch Portgroup
      4. Click OK
      5. Verify that network access is not lost again. If connectivity remains stable after a couple of minutes, continue to the next steps.
      6. Migrate the vmnic and vmk back to the vDS
        1. If not already logged in to the vCenter vSphere Client, log in to vCenter with Administrator credentials.
        2. Go to the "Networking" tab.
        3. Right-click the vDS and select "Add and Manage Hosts"
        4. Select "Manage host networking" and click Next
        5. Click "Attached hosts..."
        6. Check the ESXi host with the vmk and vmnic to be added back to the vDS, and click OK.
        7. Click Next
        8. On the "Manage physical adapters" list, select the vmnic and click "Assign uplink"
        9. Select an Uplink with empty "Assigned Adapter" and click OK
        10. Click Next
        11. Click Next in "Manage VMkernel adapters"
        12. Click Next in "Migrate VM networking"
        13. Click Finish

  5. Delete the Standard Switch.

      The temporary Standard Switch created to recover vCenter's network connectivity can now be deleted.

      1. If not already logged in to the vCenter vSphere Client, log in to vCenter with Administrator credentials.
      2. Go to the "Hosts and Clusters" tab
      3. Select the ESXi host used during this process
      4. Click the "Configure" tab
      5. Click on "Virtual Switches" under "Networking"
      6. Look for the temporary Standard Switch that was created and click the ellipsis "..."
      7. Click "Remove"
      8. Click "Yes" in the warning that pops up
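The temporary switch can also be removed from the ESXi shell instead of the vSphere Client. This is an alternative to the steps above, not part of the original procedure. The sketch below prints the two esxcli commands for review rather than running them; the function name `print_cleanup_cmds` and the example names are hypothetical, and it assumes the vmnic has already been migrated back to the vDS and the vCenter VM is no longer attached to the temporary portgroup.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the commands that remove the
# temporary portgroup and Standard Switch.
# Arguments: <vSwitchName> <PortgroupName>
# Assumes the uplink was already moved back to the vDS and no VM still
# uses the temporary portgroup.
print_cleanup_cmds() {
    echo "esxcli network vswitch standard portgroup remove --portgroup-name=$2 --vswitch-name=$1"
    echo "esxcli network vswitch standard remove --vswitch-name=$1"
}

# Example:
#   print_cleanup_cmds vSwitchRecovery vCenter-Recovery
```

Running `esxcli network vswitch standard list` afterwards should no longer show the temporary switch.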

Additional Information

Static (non-ephemeral) or ephemeral port binding on a vSphere Distributed Switch
Configuring vSwitch or vNetwork Distributed Switch from the command line in ESXi

Impact/Risks:
At least 2 vmnics should be used for the Management Network, because one of the steps removes a vmnic from the vDS Management Portgroup so that it can be used for the temporary Standard Switch.

NOTE: If the vmnics are in an LACP configuration, at least one uplink must be removed from the LACP port channel on the physical switch. The same vmnic can then be removed from the vDS using the CLI command. See the KB article "Enable EtherChannel / Link Aggregation Control Protocol (LACP) in ESXi/vCenter" for steps on working with an LACP configuration.

If there are fewer than 2 vmnics in the vDS, it is recommended to follow these steps via the DCUI Shell. Otherwise, access to SSH will be lost when running the remove vmnic command, preventing continuation of the process.
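The uplink count can be checked before removing anything. The following is a minimal sketch that counts the vmnics on the `Uplinks:` line for one vDS in the `esxcli network vswitch dvs vmware list` output. The function name `count_vds_uplinks` is hypothetical, and the sketch assumes the field layout shown in the sample output in the Resolution section.

```shell
#!/bin/sh
# Hypothetical helper: prints how many uplinks (vmnics) the named vDS has
# on this host.
# Input: output of `esxcli network vswitch dvs vmware list` on stdin.
# Assumes each vDS block has a "Name:" line followed later by "Uplinks:".
count_vds_uplinks() {
    awk -v vds="$1" '
        /^[ \t]*Name:/ {
            current = $0
            sub(/^[ \t]*Name:[ \t]*/, "", current)   # keep only the name
            sub(/[ \t]+$/, "", current)
        }
        /^[ \t]*Uplinks:/ && current == vds {
            list = $0
            sub(/^[ \t]*Uplinks:[ \t]*/, "", list)   # keep only the vmnic list
            n = split(list, parts, ",")
            count = 0
            for (i = 1; i <= n; i++) {
                gsub(/[ \t]/, "", parts[i])
                if (parts[i] != "") count++          # ignore empty entries
            }
            print count
            exit
        }
    '
}

# Example (run on the ESXi host):
#   n=$(esxcli network vswitch dvs vmware list | count_vds_uplinks ProdSwitchvDS)
#   [ "$n" -lt 2 ] && echo "Fewer than 2 uplinks: work from the DCUI shell, not SSH"
```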