Symptoms:
yyyy-mm-ddThh:mm:ss [4F1E1B70 info 'Vimsvc.ha-eventmgr'] Event 205 : Lost access to volume 54f89e21-########-####-##########98 (228.154.ds3) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
yyyy-mm-ddThh:mm:ss [4F480B70 info 'Vimsvc.ha-eventmgr'] Event 210 : Successfully restored access to volume 54f89e21-########-####-##########98 (example datastore) following connectivity issues
yyyy-mm-ddThh:mm:ss: [vmfsCorrelator] 115715089142us: [esx.problem.vmfs.heartbeat.timedout] 54f89e21-########-####-##########98 example datastore
yyyy-mm-ddThh:mm:ss: [vmfsCorrelator] 115740470730us: [esx.problem.vmfs.heartbeat.recovered] 54f89e21-########-####-##########98 example datastore
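To gauge how long a datastore went without a heartbeat, the paired timedout and recovered entries above can be compared. The following is a minimal sketch, not a supported tool: it assumes the number with the `us` suffix is a monotonically increasing microsecond counter, and the regular expression is inferred from the sample lines above rather than from a documented log grammar. The names `EVENT_RE` and `heartbeat_outages` are hypothetical.

```python
import re

# Pattern inferred from the sample vmfsCorrelator lines above (an assumption,
# not a documented format). The volume UUID is matched as any non-space token.
EVENT_RE = re.compile(
    r"\[vmfsCorrelator\]\s+(?P<us>\d+)us:\s+"
    r"\[esx\.problem\.vmfs\.heartbeat\.(?P<kind>timedout|recovered)\]\s+"
    r"(?P<uuid>\S+)\s+(?P<name>.+)$"
)

def heartbeat_outages(lines):
    """Pair each 'timedout' event with the next 'recovered' event per volume.

    Returns a list of (uuid, datastore_name, seconds) tuples.
    """
    pending = {}   # volume UUID -> microsecond value of the first timeout
    outages = []
    for line in lines:
        m = EVENT_RE.search(line.strip())
        if not m:
            continue
        uuid, us = m["uuid"], int(m["us"])
        if m["kind"] == "timedout":
            # Keep the earliest unmatched timeout for this volume.
            pending.setdefault(uuid, us)
        elif uuid in pending:
            outages.append((uuid, m["name"], (us - pending.pop(uuid)) / 1e6))
    return outages

# The two vmfsCorrelator lines shown above, verbatim.
sample = [
    "yyyy-mm-ddThh:mm:ss: [vmfsCorrelator] 115715089142us: "
    "[esx.problem.vmfs.heartbeat.timedout] "
    "54f89e21-########-####-##########98 example datastore",
    "yyyy-mm-ddThh:mm:ss: [vmfsCorrelator] 115740470730us: "
    "[esx.problem.vmfs.heartbeat.recovered] "
    "54f89e21-########-####-##########98 example datastore",
]

for uuid, name, seconds in heartbeat_outages(sample):
    print(f"{name}: {seconds:.2f} s without heartbeat")
# prints "example datastore: 25.38 s without heartbeat"
```

For the sample entries, the difference is 115740470730 - 115715089142 = 25381588 microseconds, or roughly 25.4 seconds between the timeout and the recovery.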
In the /var/log/vmkernel.log file, you see entries similar to:
yyyy-mm-ddThh:mm:ss cpu10:36273)HBX: 2832: Waiting for timed out [HB state abcdef02 offset 3444736 gen 549 stampUS 115704005679 uuid 5592d754-21d7d8a7-0a7e-##########98 jrnl <FB 779600> drv 14.60] on vol 'example datastore'
yyyy-mm-ddThh:mm:ss cpu26:32873)HBX: 258: Reclaimed heartbeat for volume 54f89e21-########-####-##########98 (example datastore): [Timeout] Offset 3444736
Lost access to volume 54f89e21-########-####-##########98 (example datastore) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.
VMware vSphere ESXi 6.x
VMware vSphere ESXi 7.x
VMware vSphere ESXi 8.x
If Lost connection to volume warnings are reported repeatedly at certain times or at regular intervals, also check whether there are any I/O-intensive scheduled tasks that may be degrading I/O performance to the point that datastore heartbeat I/Os time out, e.g.:
Changing the scheduling or distribution of such tasks may prevent loss of connection to volumes.
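One way to see whether the warnings cluster at particular times is to count the Lost access to volume events per hour of day. The sketch below is illustrative only: it assumes each log line begins with an ISO 8601 timestamp (the entries in this article show only the placeholder yyyy-mm-ddThh:mm:ss), and the sample lines and their timestamps are made up. The names `LOST_RE` and `lost_access_by_hour` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: lines start with an ISO 8601 timestamp and contain the event
# text "Lost access to volume" as in the entries shown in this article.
LOST_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}).*Lost access to volume"
)

def lost_access_by_hour(lines):
    """Count 'Lost access to volume' events per hour of day (0-23)."""
    counts = Counter()
    for line in lines:
        m = LOST_RE.match(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
            counts[stamp.hour] += 1
    return counts

# Synthetic lines for illustration only; the timestamps are invented.
sample = [
    "2024-01-01T02:00:14 Event 205 : Lost access to volume (example datastore)",
    "2024-01-02T02:01:03 Event 311 : Lost access to volume (example datastore)",
    "2024-01-02T14:30:00 Event 340 : Lost access to volume (example datastore)",
    "2024-01-02T14:30:25 Event 341 : Successfully restored access to volume",
]

for hour, count in lost_access_by_hour(sample).most_common():
    print(f"{hour:02d}:00 - {count} event(s)")
```

A pronounced peak in one hour suggests comparing that window against the schedules of I/O-intensive tasks on the same storage.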
VMware Skyline Health Diagnostics for vSphere - FAQ
When the volume is in the lost access to volume state, host I/O is blocked until the heartbeat I/O can be completed. After the first heartbeat timeout occurs, the host issues subsequent heartbeat reclaim operations to the datastore until the heartbeat is recovered. A reclaim is attempted approximately once every second. The guest operating system should remain online as long as it can sustain the long latency of these I/O operations to the VMDK. Until the heartbeat is reclaimed, VMFS fails all I/O operations from virtual machines residing on the impacted datastore with a DEVICE BUSY status. For more information, see: