I/O to virtual disk (VMDK) files that reside on the offline device might slow down or fail. This might cause virtual machines that have VMDKs residing on that device to become unresponsive or even fail.
The hostd service might become unresponsive.
Note: This problem does not occur when a non-head extent of the spanned VMFS datastore fails along with the head extent. In this case, the entire datastore becomes inaccessible and no longer allows I/Os.
In contrast, when only a non-head extent fails but the head extent remains accessible, the datastore heartbeat appears normal and I/O between the host and the datastore continues. However, any I/Os that depend on the failed non-head extent start failing. Other I/O transactions might accumulate while waiting for the failing I/Os to resolve, causing the host to enter a non-responding state.
See, for example:
VMware vCenter Server 7.0 Update 3p Release Notes
Storage adapter rescan fails with the error "An error occurred while communicating with remote host".
In the /var/run/log/vmkernel.log file, entries similar to the following are seen:
YYYY-MM-DDTHH:MM.SSSZ Wa(180) vmkwarning: cpu##:2100497 opID=55dbbda8)WARNING: LVM: 17711: An attached device went offline. naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 file system [VOLUME-NAME, VOLUME-UUID]
YYYY-MM-DDTHH:MM.SSSZ Wa(180) vmkwarning: cpu##:2100512 opID=b09639a4)WARNING: LVM: 17711: An attached device went offline. naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 file system [VOLUME-NAME, VOLUME-UUID]
YYYY-MM-DDTHH:MM.SSSZ Wa(180) vmkwarning: cpu##:2100491 opID=26701d75)WARNING: LVM: 17711: An attached device went offline. naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 file system [VOLUME-NAME, VOLUME-UUID]
YYYY-MM-DDTHH:MM.SSSZ Wa(180) vmkwarning: cpu##:2100491 opID=26701d75)WARNING: LVM: 17711: An attached device went offline. naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 file system [VOLUME-NAME, VOLUME-UUID]
YYYY-MM-DDTHH:MM.SSSZ Wa(180) vmkwarning: cpu##:2100509 opID=c2db1724)WARNING: LVM: 17711: An attached device went offline. naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 file system [VOLUME-NAME, VOLUME-UUID]
In the /var/run/log/vobd.log file, entries similar to the following are seen:
YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098027]: [vmfsCorrelator] 13900876943838us: [vob.vmfs.extent.offline] An attached device went offline. naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 file system [VOLUME-NAME, VOLUME-UUID]
YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098027]: [vmfsCorrelator] 13900800936747us: [esx.problem.vmfs.extent.offline] An attached device naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 may be offline. The file system [VOLUME-NAME, VOLUME-UUID] is now in a degraded state. While the datastore is still available, parts of data that reside on the extent that went offline might be inaccessible.
YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098027]: [vmfsCorrelator] 13900887878275us: [vob.vmfs.extent.offline] An attached device went offline. naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 file system [VOLUME-NAME, VOLUME-UUID]
YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098027]: [vmfsCorrelator] 13900811870939us: [esx.problem.vmfs.extent.offline] An attached device naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx may be offline. The file system [VOLUME-NAME, VOLUME-UUID] is now in a degraded state. While the datastore is still available, parts of data that reside on the extent that went offline might be inaccessible.
YYYY-MM-DDTHH:MM.SSSZ In(14) vobd[2098027]: [vmfsCorrelator] 13900966570083us: [esx.problem.vmfs.extent.offline] An attached device naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 may be offline. The file system [VOLUME-NAME, VOLUME-UUID] is now in a degraded state. While the datastore is still available, parts of data that reside on the extent that went offline might be inaccessible.
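When many of these entries accumulate, it can help to summarize which devices went offline for which volume. The following is a minimal Python sketch, not a VMware tool: the regular expression is inferred only from the sample vmkernel.log and vobd.log entries shown above, it assumes naa.* device identifiers, and the function name and sample values are hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the sample log entries in this article (an assumption,
# not an official log grammar). It covers both message variants:
#   "An attached device went offline. <device>:<part> file system [LABEL, UUID]"
#   "An attached device <device>[:<part>] may be offline. The file system [LABEL, UUID] ..."
OFFLINE_RE = re.compile(
    r"An attached device\s+(?:went offline\.\s+)?"
    r"(?P<device>naa\.[0-9A-Za-z]+)(?::(?P<partition>\d+))?"
    r"(?:\s+may be offline\.\s+The)?\s+file system\s+"
    r"\[(?P<label>[^,\]]+),\s*(?P<uuid>[^\]]+)\]"
)

def offline_extents_by_volume(log_text):
    """Map each volume label to the set of devices reported offline for it."""
    volumes = defaultdict(set)
    for match in OFFLINE_RE.finditer(log_text):
        volumes[match.group("label")].add(match.group("device"))
    return dict(volumes)

# Hypothetical, shortened log text with made-up device IDs and volume names.
sample_log = (
    "WARNING: LVM: 17711: An attached device went offline. "
    "naa.aaa1:1 file system [ds01, uuid-1]\n"
    "[esx.problem.vmfs.extent.offline] An attached device naa.aaa1:1 may be "
    "offline. The file system [ds01, uuid-1] is now in a degraded state.\n"
)
for label, devices in offline_extents_by_volume(sample_log).items():
    print(label, sorted(devices))
```

Because the function scans the whole text with finditer, it does not depend on each entry sitting on its own line, so the contents of either log file can be passed in as a single string.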
vmkfstools -Ph /vmfs/volumes/<datastore>

[root@xxxxx:~] vmkfstools -Ph /vmfs/volumes/<datastore>
VMFS-6.82 (Raw Major Version: 24) file system spanning 4 partitions.
File system label (if any): datastore
Mode: public
Capacity 399.8 GB, 156.2 GB available, file block size 1 MB, max supported file size 64 TB
Disk Block Size: 512/16384/0
UUID: ########-########-####-############
Partitions spanned (on "lvm"):
        naa.################################:1 ----------> Head extent.
        naa.################################:1
        naa.################################:1
        naa.################################:1
Is Native Snapshot Capable: NO

Multiple naa.* entries under "Partitions spanned (on 'lvm')" indicate that multiple devices back this datastore.
Verify whether each backing device is a local disk or a SAN device.
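The extent list in the vmkfstools -Ph output above can also be extracted with a short script. This is a minimal Python sketch under stated assumptions: it expects the raw command output (without the "Head extent" annotation added in this article), it treats the first listed partition as the head extent as that annotation suggests, and the function name and device IDs are hypothetical.

```python
import re

# One extent per line: device ID, a colon, then the partition number.
EXTENT_RE = re.compile(r"^(?P<device>\S+):(?P<partition>\d+)$")

def spanned_extents(vmkfstools_output):
    """Return [(device, partition), ...] in the order vmkfstools lists them."""
    extents = []
    in_section = False
    for line in vmkfstools_output.splitlines():
        text = line.strip()
        if text.startswith("Partitions spanned"):
            in_section = True
            continue
        if in_section:
            match = EXTENT_RE.match(text)
            if match is None:
                break  # the first non-extent line ends the list
            extents.append((match.group("device"), int(match.group("partition"))))
    return extents

# Hypothetical output with made-up device IDs.
sample = """\
VMFS-6.82 (Raw Major Version: 24) file system spanning 2 partitions.
File system label (if any): datastore
Partitions spanned (on "lvm"):
\tnaa.1111:1
\tnaa.2222:1
Is Native Snapshot Capable: NO
"""
extents = spanned_extents(sample)
# Assumption: the first listed partition is the head extent.
print("head extent:", extents[0])
print("non-head extents:", extents[1:])
```

Comparing the non-head extents against the devices reported offline in the logs shows whether the datastore is in the degraded state described earlier, where the head extent is reachable but part of the data is not.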
Run the following command and locate the device in the output by its NAA ID:
esxcli storage core device list
naa.xxxxxxxxxxxxxxxx:
   Display Name: Local Make Disk (naa.xxxxxxxxxxxxxxxxxx)
   Has Settable Display Name: true
   Size: 3662830
   Device Type: Direct-Access
   Multipath Plugin: HPP
   Devfs Path: /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx
   Vendor: "Vendor Name"
   Model: XXX5XVUG3T84
   Revision: B70C
   SCSI Level: 6
   Is Pseudo: false
   Status: on
   Is RDM Capable: true
   Is Local: true
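The "Is Local" field in this output distinguishes local disks from SAN devices. When a host has many devices, the listing can be parsed rather than read by eye. The sketch below is illustrative only: it assumes the layout shown above (an unindented device ID line followed by indented "Key: Value" lines), and the function name and sample device IDs are hypothetical.

```python
def parse_device_list(output):
    """Parse 'esxcli storage core device list' text into {device_id: {field: value}}."""
    devices = {}
    current = None
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        if not line[0].isspace():
            # Unindented line starts a new device; tolerate a trailing colon.
            current = line.strip().rstrip(":")
            devices[current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only, so values such as paths stay intact.
            key, _, value = line.strip().partition(":")
            devices[current][key.strip()] = value.strip()
    return devices

# Hypothetical listing with made-up device IDs.
sample = """\
naa.1111
   Display Name: Local Disk (naa.1111)
   Devfs Path: /vmfs/devices/disks/naa.1111
   Is Local: true
naa.2222
   Display Name: Fibre Channel Disk (naa.2222)
   Is Local: false
"""
devices = parse_device_list(sample)
local_ids = [dev for dev, fields in devices.items() if fields.get("Is Local") == "true"]
print("local devices:", local_ids)
```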
VMware vSphere ESXi 8.x
VMware vSphere ESXi 7.x
The esx.problem.vmfs.extent.offline message is received when an ESXi host loses connection to a storage device that backs an entire VMFS datastore or any of its extents.
This loss of connection can happen when a switch or cable that connects the device to the ESXi host is disconnected or when the device is reformatted to be used by another volume.
Identify the devices that are affected and restore connectivity. If the device has been reformatted and reassigned to another volume, the corresponding portion of the original volume will be permanently lost and cannot be recovered.
esxcli storage core device physical get -d <device Naaid>
[root@xxxxx:~] esxcli storage core device physical get -d naa.xxxxxxxxxxxxxxx
Physical Location: enclosure 12 slot 4
Engage the storage vendor to identify and fix the underlying storage or disk issue.
If the issue continues after resolving the underlying storage problem, restart the host.