Recovering a combined vCenter/vSAN Environment After Network Disconnection
Article ID: 395503

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms may include: 
  • vCenter Server is not reachable.
  • Virtual machines are offline.
  • Management networking runs over a vSphere Distributed Switch (VDS).
  • The vCenter Server VM resides on vSAN storage.
This article guides administrators through recovering a VMware vCenter and vSAN cluster after an upstream config change/outage or accidental misconfiguration on the vSphere Distributed Switch (VDS), which leads to host network isolation and vCenter unavailability.

Environment

  • vCenter becomes unresponsive and cannot be accessed via SSH or console. 
  • Full disk or swap errors observed in the vCenter VM. 
  • VMs become unreachable; the entire environment appears “down.” 
  • vSAN cluster health shows “Host Connectivity Issues,” “Primary Election Failures,” etc.
  • CLI shows network interfaces down or disconnected. 
  • No rollback occurred after distributed switch uplink changes.  
  • Unable to reconfigure VDS uplinks because the vCenter VM's vSAN datastore is offline.

Cause

Misconfiguration of VDS uplinks at the host level across all nodes in the cluster.
Uplinks mapped to critical services (vSAN, management) were re-assigned without confirming their availability on the replacement links.
Because vCenter resides on vSAN, and vSAN lost inter-node connectivity, vCenter failed to mount its disks.

Resolution

Prerequisites Before Recovery

  • At least one administrator with iDRAC/ILO access to each host.
  • CLI access to ESXi hosts via SSH or local console.
  • Knowledge of the management and vSAN VLAN IDs.
  • Available IPs in management and vSAN subnets for temporary use.
  • Consider starting a separate vCenter restore from backup in parallel, in case in-place recovery fails.
     

Resolution Steps

1. Assess Current Host Connectivity

Run the following from each host:
  • esxcli vsan cluster get
  • esxcli vsan health cluster list
  • esxcli vsan network list
 
Check for signs of segmentation and verify which VMkernel (VMK) interfaces are used for vSAN.
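A quick way to spot segmentation in the output of esxcli vsan cluster get (an illustration only; exact field layout varies by release):
  • Sub-Cluster Member Count: 1   (each isolated host reports only itself)
  • Local Node State: MASTER      (every partition elects its own primary)
A member count lower than the number of hosts in the cluster, with multiple hosts reporting MASTER, indicates a network partition.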

2. Confirm Uplink Availability

List the physical NICs and current vSwitch uplink assignments:
  • esxcfg-nics -l
  • esxcfg-vswitch -l
Identify the uplinks not currently used by the VDS.
 

3. Create a Temporary Standard vSwitch on Each Host
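The commands below are a sketch of this step. The vSwitch name (vSwitch-MGMT), port group names (MGMT, VSAN), and uplink (vmnicX) are examples and must be adjusted to your environment; use an uplink identified as unused in step 2, and substitute your actual VLAN IDs:
  • esxcli network vswitch standard add -v vSwitch-MGMT
  • esxcli network vswitch standard uplink add -u vmnicX -v vSwitch-MGMT
  • esxcli network vswitch standard portgroup add -p MGMT -v vSwitch-MGMT
  • esxcli network vswitch standard portgroup set -p MGMT --vlan-id <mgmt-vlan>
  • esxcli network vswitch standard portgroup add -p VSAN -v vSwitch-MGMT
  • esxcli network vswitch standard portgroup set -p VSAN --vlan-id <vsan-vlan>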


4. Add New VMkernel Interfaces (adjust the vmk/NIC numbering to suit your environment)

For Management:
  • esxcli network ip interface add -i vmk6 -p MGMT
  • esxcli network ip interface ipv4 set -i vmk6 -I <new-mgmt-IP> -N <subnet-mask> -t static
  • esxcli network ip interface tag add -i vmk6 -t Management
For vSAN:
  • esxcli network ip interface add -i vmk7 -p VSAN
  • esxcli network ip interface ipv4 set -i vmk7 -I <new-vsan-IP> -N <subnet-mask> -t static
  • esxcli vsan network ipv4 add -i vmk7
Validate with:
  • vmkping -I vmk7 <other-vsan-IP>
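If the vSAN network uses jumbo frames, also validate the full MTU path end to end; a sketch, assuming a 9000-byte MTU (8972 = 9000 minus IP/ICMP headers; -d disallows fragmentation):
  • vmkping -I vmk7 -d -s 8972 <other-vsan-IP>
If the large ping fails while the plain ping succeeds, an intermediate switch port is not passing jumbo frames.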

5. Verify vSAN Cluster Health

Once VMK pings succeed between all nodes:
  • esxcli vsan health cluster list
You should see the cluster recover from segmentation, and the primary election should complete.
 

6. Power on vCenter

Once vSAN is functioning and all required components are online, attempt to power on the vCenter VM from any host. After vCenter is back up:
  • Reconnect vCenter to the existing VDS.
  • Migrate the temporary standard switch VMkernel interfaces back to the VDS port groups.
  • Remove the temporary vSwitches once validated.
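The temporary interfaces and vSwitch can be cleaned up on each host along these lines (a sketch; vmk6/vmk7 and vSwitch-MGMT follow the example naming used in step 4, and the original management and vSAN VMkernel interfaces must be confirmed working on the VDS first):
  • esxcli vsan network remove -i vmk7
  • esxcli network ip interface remove -i vmk7
  • esxcli network ip interface remove -i vmk6
  • esxcli network vswitch standard remove -v vSwitch-MGMT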
     

Additional Information