This article answers frequently asked questions about VMware Fault Tolerance and can help resolve Fault Tolerance related issues.
Environment
VMware vSphere ESXi 7.0
VMware vSphere ESXi 8.0
Resolution
What is VMware Fault Tolerance? VMware Fault Tolerance is a feature that provides continuous availability for a virtual machine by keeping a second, synchronized copy of it running on another host. For more information regarding this feature, see the vSphere Availability Guide for your version of ESXi/ESX.
How do I turn it on? The feature is enabled on a per virtual machine basis. For more information on enabling Fault Tolerance, see the Turning on Fault Tolerance for Virtual Machines section in the vSphere Availability Guide for your version of ESXi/ESX.
What happens when I turn on Fault Tolerance? In general terms, a second virtual machine (the Secondary) is created to work in tandem with the virtual machine on which you have enabled Fault Tolerance (the Primary). The Secondary virtual machine resides on a different host in the cluster and runs in virtual lockstep with the Primary virtual machine. When a failure is detected, the Secondary virtual machine takes the place of the Primary with the least possible interruption of service.
Why can't I turn on Fault Tolerance? VMware Fault Tolerance can be enabled only on a virtual machine that resides in a cluster that meets the necessary requirements. If you cannot turn it on, check the cluster against the Fault Tolerance requirements in the vSphere Availability Guide for your version of ESXi/ESX.
How do I turn Fault Tolerance off? For instructions on disabling Fault Tolerance, see Turn Off Fault Tolerance.
What happens during a failure? When a host running the Primary virtual machine fails, a transparent failover occurs to the corresponding Secondary virtual machine. During this failover, there is no data loss or noticeable service interruption. In addition, VMware HA automatically restores redundancy by restarting a new Secondary virtual machine on another host. Similarly, if the host running the Secondary virtual machine fails, VMware HA starts a new Secondary virtual machine on a different host. In either case there is no noticeable outage.
What is the logging time delay between the Primary and Secondary Fault Tolerant virtual machines? The actual delay depends on the network latency between the Primary and Secondary and is typically less than 1 millisecond (ms). vLockstep executes the same instructions on the Primary and Secondary. Because this happens on different hosts, the Secondary can lag slightly behind the Primary, but there is no loss of state. Fault Tolerance also includes synchronization to keep the Primary and Secondary in step.
In a cluster with more than 3 hosts, can you tell Fault Tolerance where to put the Fault Tolerant virtual machine or does it choose on its own? You can place the original, or Primary, virtual machine yourself: with DRS or vMotion you have full control to assign it to any node. The Secondary is placed automatically when it is created, based on the available hosts. After the Secondary has been created and placed, you can use vMotion to move it to a preferred host.
What happens if the host containing the Primary virtual machine comes back online (after a node failure)? This node is put back in the pool of available hosts. There is no attempt to start or migrate the Primary to that host.
Is the failover from the Primary virtual machine to the Secondary virtual machine dynamic or does Fault Tolerance restart a virtual machine? The failover from the Primary to Secondary virtual machine is dynamic with the Secondary continuing execution from the exact point where the Primary left off. It happens automatically with no data loss, no downtime, and little delay. Clients see no interruption. After the dynamic failover to the Secondary virtual machine, it becomes the new Primary virtual machine. A new Secondary virtual machine is spawned automatically.
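The failover sequence described in this article can be sketched as a small state model. This is an illustrative toy in Python, not VMware code and not a vSphere API: the names FtPair and handle_host_failure and the host names are hypothetical, and choosing the first free host stands in for the automatic placement logic, which is not described here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FtPair:
    """Toy record of which host runs each role of one FT-protected VM."""
    primary_host: str
    secondary_host: Optional[str]
    events: List[str] = field(default_factory=list)


def handle_host_failure(pair: FtPair, failed_host: str, hosts: List[str]) -> FtPair:
    """Update the pair after failed_host goes down (hypothetical model)."""
    if failed_host == pair.primary_host:
        if pair.secondary_host is None:
            pair.events.append("primary lost while unprotected")
            return pair
        # The Secondary continues from where the Primary left off and
        # becomes the new Primary.
        pair.primary_host = pair.secondary_host
        pair.events.append(f"failover: {pair.primary_host} is now primary")
    elif failed_host != pair.secondary_host:
        # The failed host ran neither role, so nothing changes.
        return pair

    # Restore redundancy: start a new Secondary on another surviving host.
    # Picking the first candidate is an assumption made for illustration.
    candidates = [h for h in hosts if h not in (failed_host, pair.primary_host)]
    pair.secondary_host = candidates[0] if candidates else None
    if pair.secondary_host is None:
        pair.events.append("no host available for a new secondary")
    else:
        pair.events.append(f"new secondary started on {pair.secondary_host}")
    return pair


pair = FtPair(primary_host="esx-a", secondary_host="esx-b")
handle_host_failure(pair, "esx-a", ["esx-a", "esx-b", "esx-c"])
print(pair.primary_host, pair.secondary_host)
for event in pair.events:
    print(event)
```

In this model a recovered host simply rejoins the list of candidates for future Secondary placement; nothing moves the Primary back to it, which matches the behavior described above for a host that comes back online.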
Where are Fault Tolerance failover events logged? All failover events are logged by vCenter Server.
I encountered an error message that I can't find in the Knowledge Base. Where else should I check? For a list of known Fault Tolerance error messages, see Troubleshooting Fault Tolerant Virtual Machines.
Does Fault Tolerance support Intel Hyper-Threading Technology? Yes, Fault Tolerance does support Intel Hyper-Threading Technology on systems that have it enabled. Enabling or disabling Hyper-Threading has no impact on Fault Tolerance.
What happens if vCenter Server is offline when a failover event occurs? When Fault Tolerance is configured for a virtual machine, vCenter Server need not be online for Fault Tolerance to work. Even if vCenter Server is offline, failover still occurs from the Primary to the Secondary virtual machine. Additionally, the spawning of a new Secondary virtual machine also occurs without vCenter Server.
Will there be a performance impact if I enable Fault Tolerance on virtual machines? Enabling Fault Tolerance does not reduce performance, provided that enough resources are provisioned to run both the Primary and Secondary virtual machines.
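As a rough planning aid, the resource condition above can be expressed as a check that at least two different hosts each have room for a full copy of the virtual machine. The sketch below is a simplification under stated assumptions: the function names, the dictionary layout, and the example figures are hypothetical, and it considers only free CPU and memory, not other Fault Tolerance requirements.

```python
from typing import Dict, List


def hosts_with_capacity(vm_cpu_mhz: int, vm_mem_mb: int,
                        free: Dict[str, Dict[str, int]]) -> List[str]:
    """Return the hosts whose free CPU and memory can hold one copy of the VM."""
    return [
        name for name, res in free.items()
        if res["cpu_mhz"] >= vm_cpu_mhz and res["mem_mb"] >= vm_mem_mb
    ]


def can_run_ft_pair(vm_cpu_mhz: int, vm_mem_mb: int,
                    free: Dict[str, Dict[str, int]]) -> bool:
    """Primary and Secondary run on different hosts, so two hosts must fit."""
    return len(hosts_with_capacity(vm_cpu_mhz, vm_mem_mb, free)) >= 2


# Hypothetical free capacity per host, for illustration only.
free_capacity = {
    "esx-a": {"cpu_mhz": 8000, "mem_mb": 32768},
    "esx-b": {"cpu_mhz": 3000, "mem_mb": 16384},
    "esx-c": {"cpu_mhz": 6000, "mem_mb": 8192},
}

print(can_run_ft_pair(4000, 8192, free_capacity))
print(can_run_ft_pair(4000, 16384, free_capacity))
```

With these example figures, the first call finds two suitable hosts (esx-a and esx-c), while the second finds only esx-a, so the pair could not be placed without adding capacity.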
Additional Information
For more information on Features and Devices Incompatible with Fault Tolerance, see the Fault Tolerance Interoperability section in the: