vSphere High Availability (HA) fails to restart a virtual machine when VMware vSAN network is partitioned

Article ID: 319924

Updated On: 02-25-2025

Products

VMware vSAN

Issue/Introduction

Symptoms:
When a VMware vSAN network is partitioned, you may experience these symptoms:
  • vSphere High Availability (HA) fails to restart virtual machines that it protects
  • Virtual machines do not power on as expected


Environment

VMware vSAN 7.0.x

Cause

This issue occurs when the VMkernel adapters that vSAN uses for internode communication are on the same subnet/VLAN as other VMkernel adapters in the cluster. Such a configuration can cause a network failure that disrupts vSAN internode communication while vSphere HA internode communication remains unaffected.

vSAN uses its own logical network. When vSAN and vSphere HA are enabled for the same cluster, the HA interagent traffic flows over this storage network rather than the management network. vSphere HA uses the management network only when vSAN is disabled. For more details see Using vSAN and vSphere HA.

However, vSphere HA does not guarantee the restart of a VM even when quorum exists, because the restart depends on access to the most recent copy of the VM's objects.

In this situation, the HA master agent may detect the failure in a virtual machine but is unable to restart it. For example, this can occur when the host on which the master agent is running does not have access to the virtual machine objects.

Resolution

Ensure that the VMkernel adapters used by vSAN do not share a subnet/VLAN with VMkernel adapters used for other purposes, such as Management, vMotion, Replication, or NFS.
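
The subnet check described above can be sketched offline as follows. This is a minimal illustration, not a VMware tool: the Vmk record, the service tags, and the find_vsan_subnet_conflicts helper are hypothetical names, and the adapter list is assumed to have been exported from the cluster by other means. It compares IPv4 subnets only and does not inspect VLAN IDs.

```python
from collections import namedtuple
from ipaddress import ip_interface

# Hypothetical inventory record: host name, adapter name, address in CIDR
# notation, and the services enabled on the adapter.
Vmk = namedtuple("Vmk", "host name cidr services")


def find_vsan_subnet_conflicts(adapters):
    """Return pairs ((host, vsan_vmk), (host, other_vmk)) whose subnets overlap.

    Every vSAN-tagged adapter is compared with every non-vSAN adapter in the
    cluster, because the guidance applies cluster-wide, not per host.
    """
    conflicts = []
    for vsan in adapters:
        if "vsan" not in vsan.services:
            continue
        vsan_net = ip_interface(vsan.cidr).network
        for other in adapters:
            if "vsan" in other.services:
                continue  # only compare against adapters used for other purposes
            if vsan_net.overlaps(ip_interface(other.cidr).network):
                conflicts.append(((vsan.host, vsan.name), (other.host, other.name)))
    return conflicts


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        Vmk("esx01", "vmk0", "192.168.10.11/24", ("management",)),
        Vmk("esx01", "vmk1", "192.168.10.21/24", ("vsan",)),
    ]
    for vsan_vmk, other_vmk in find_vsan_subnet_conflicts(sample):
        print("vSAN adapter", vsan_vmk, "shares a subnet with", other_vmk)
```

An empty result means no vSAN adapter in the supplied list shares an IPv4 subnet with an adapter used for another purpose. VLAN separation still has to be verified on the port groups and physical switches.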

Note: It is recommended to set the HA isolation response to Power Off for virtual machines protected by HA on a vSAN datastore.