VMware vSphere High Availability host isolation response types

Article ID: 322784

Products

VMware vCenter Server

Issue/Introduction

This article describes the VMware vSphere High Availability (HA) host isolation response types and how virtual machines respond when their host has a heartbeat failure. It also describes the related VMware HA configuration settings.

Symptoms:

  • VMs may remain powered on, shut down, or reboot based on the HA isolation response.
  • Hosts are marked as isolated or failed in vCenter.
  • Loss of heartbeat communication between cluster hosts.
  • Cluster alarms related to HA host isolation are triggered.
  • Hosts fail to ping isolation addresses.

Environment

VMware vCenter Server 4.0.x
VMware vCenter Server 4.1.x
VMware vCenter Server 5.0.x
VMware vCenter Server 5.1.x
VMware vCenter Server 5.5.x
VMware vCenter Server 6.0.x
VMware vCenter Server 6.5.x
VMware vCenter Server 6.7.x
VMware vCenter Server 7.0.x

Resolution

Host network isolation occurs when a host is still running but it can no longer communicate with other hosts in the cluster and it cannot ping the configured isolation addresses. When the HA agent on a host loses contact with the other hosts, it will ping the isolation addresses. If the pings fail, the host will declare itself isolated.
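The isolation check described above can be sketched as follows. This is only an illustrative model, not the HA agent's actual code: the function name, the `ping` callback, and the rule that one reachable isolation address is enough to avoid isolation are assumptions made for the example.

```python
from typing import Callable, Iterable


def is_isolated(
    has_peer_contact: bool,
    isolation_addresses: Iterable[str],
    ping: Callable[[str], bool],
) -> bool:
    """Return True when a host would declare itself isolated.

    has_peer_contact: whether the host still hears from other cluster hosts.
    ping: hypothetical callback that returns True if the address answers.
    """
    # A host that still communicates with other hosts is not isolated.
    if has_peer_contact:
        return False
    # Assumption: a single reachable isolation address prevents isolation.
    # With no addresses configured, this model treats the host as isolated.
    return not any(ping(address) for address in isolation_addresses)
```

For example, `is_isolated(False, ["192.0.2.1"], lambda address: False)` models a host that has lost its peers and cannot reach its only isolation address.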

HA Response Time

In VMware vSphere 5.x and 6.x, if the HA agent on the host is the primary, isolation is declared in 5 seconds. If it is a secondary, isolation is declared in 30 seconds.
 
In vSphere 4.x, isolation is declared 12 seconds after heartbeats have ceased to arrive. 15 seconds after the start of the isolation event, the other hosts in the cluster consider the isolated host to have failed and initiate the isolation response workflow. You can change these default timeout values using VMware HA advanced options in VMware vCenter Server. In vSphere 4.x, the default isolation response is "Shut down".
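The timings above can be summarized in a small lookup. This is a sketch of the default values stated in this article only. The function and constant names are hypothetical, and versions not covered by the text (for example 7.x) are rejected rather than guessed.

```python
# Seconds after which other hosts treat an isolated vSphere 4.x host as failed.
V4_FAILURE_DECLARED_S = 15


def isolation_declaration_delay(major_version: int, role: str = "secondary") -> int:
    """Return the default delay, in seconds, before isolation is declared."""
    if major_version == 4:
        # vSphere 4.x: 12 seconds after heartbeats stop arriving.
        return 12
    if major_version in (5, 6):
        if role not in ("primary", "secondary"):
            raise ValueError(f"unknown HA agent role: {role!r}")
        # vSphere 5.x and 6.x: the delay depends on the HA agent role.
        return 5 if role == "primary" else 30
    raise ValueError(f"no default documented here for vSphere {major_version}.x")
```

These are defaults; as noted above, the actual values can be changed through VMware HA advanced options.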

HA Response Types

Leave powered on – When a network isolation occurs on the host, the state of the virtual machines remains unchanged, and the virtual machines on the isolated host continue to run even though the host can no longer communicate with other hosts in the cluster. This setting also reduces the impact of a false positive. A false positive in this case is a host whose heartbeat network is isolated while its virtual machine network and its iSCSI/NFS storage network are still working. If the host becomes unresponsive or fails and can no longer access or run the virtual machines, the virtual machines are registered and powered on by another running host in the cluster. In vSphere 5.0 and later, this is the default isolation response.

Power off – When a network isolation occurs, all virtual machines on the isolated host are powered off and restarted on another ESXi host. This is a hard stop, with no guest operating system shutdown. The power off response is initiated on the fourteenth second and a restart is initiated on the fifteenth second.

Shut down – When a network isolation occurs, all virtual machines running on that host are shut down via VMware Tools and restarted on another ESXi host. If the shutdown does not complete within 5 minutes, the Power off response is executed instead.
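The three response types can be modeled as a simple decision, shown below as a sketch under stated assumptions. The enum, the function, and the action strings are hypothetical names chosen for illustration and do not correspond to any vSphere API. Only the 5-minute shutdown timeout is taken from the text above.

```python
from enum import Enum
from typing import List, Optional


class IsolationResponse(Enum):
    LEAVE_POWERED_ON = "leave powered on"
    POWER_OFF = "power off"
    SHUT_DOWN = "shut down"


# A guest shutdown that exceeds this falls back to a hard power off.
SHUTDOWN_TIMEOUT_S = 5 * 60


def actions_for_vm(
    response: IsolationResponse,
    guest_shutdown_seconds: Optional[float] = None,
) -> List[str]:
    """Return the ordered actions taken for one VM on an isolated host.

    guest_shutdown_seconds: how long the guest shutdown took, or None if it
    never completed (for example, VMware Tools is not running).
    """
    if response is IsolationResponse.LEAVE_POWERED_ON:
        return ["keep running"]
    if response is IsolationResponse.POWER_OFF:
        return ["hard power off", "restart on another host"]
    # SHUT_DOWN: try a guest shutdown first, then fall back if it times out.
    actions = ["guest shutdown via VMware Tools"]
    finished = (
        guest_shutdown_seconds is not None
        and guest_shutdown_seconds <= SHUTDOWN_TIMEOUT_S
    )
    if not finished:
        actions.append("hard power off")
    actions.append("restart on another host")
    return actions
```

For instance, `actions_for_vm(IsolationResponse.SHUT_DOWN, None)` models a virtual machine whose guest shutdown never completes, so the hard power off is included before the restart.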


Additional Information

For more information on VMware vSphere HA, see the vSphere Availability Guide and Respond to Host Isolation.

VMware Blog by Duncan Epping
vSphere HA isolation response… which to use when?