Local Datastores show disconnected and vSAN disks report as Absent and storage pool unhealthy

Article ID: 434803

Products

VMware vSphere ESXi, VMware Cloud Foundation

Issue/Introduction

Symptoms:

  • In Skyline Health, an Operation Health warning reports all disks as Absent and marks the StoragePool status as unhealthy.

  • In disk management, the host reports more disks than are physically present in the environment.

  • Validation using vdq -iH shows that vSAN is unable to read disk metadata from multiple NVMe devices:

[root@esxihost:/] vdq -iH
VsanUtil::ReadFromDevice: Failed to open /vmfs/devices/disks/t10.NVMe____Dell_DC_NVMe_#####_RI_U.2_3.84TB________###########, errno (19)
VsanUtil::GetVsanDisks: Error occurred 'Failed to open device /vmfs/devices/disks/t10.NVMe____Dell_DC_NVMe_#####_RI_U.2_3.84TB________###########', create disk with null id

  • Property collection tasks fail, as logged in /var/log/hostd.log:

    2026-03-23T10:11:16.915Z In(166) Hostd[2100215]: [Originator@6876 sub=Vimsvc.TaskManager opID=867d34eb-e218 sid=52c32ab5 user=dcui:vsanmgmtd] Task Created : vmodlTask-ha-host-57179112
    2026-03-23T10:11:16.920Z In(166) Hostd[2100204]: [Originator@6876 sub=Http2ServerSession-6120] Starting Http2Session (server): <io_obj t:N7Vmacore6System19TCPSocketObjectAsioE, h:103, <TCP '127.0.0.1 : 8307'>, <TCP '127.0.0.1 : 60883'>>
    2026-03-23T10:11:16.923Z In(166) Hostd[2100206]: [Originator@6876 sub=Solo.Vmomi opID=e378e21e sid=52689d6a] Activation finished; <<52689d6a-214a-c8cc-f0c8-ad0b5eafcb0c, <TCP '127.0.0.1 : 8307'>, <TCP '127.0.0.1 : 60883'>>, ha-property-collector, vmodl.query.PropertyCollector.createPropertyCollector, <vim.version.v9_0_0_0, internal, 9.0.0.0>, [N11HostdCommon18VmomiAdapterServer19ActivationResponderE]>
    2026-03-23T10:11:16.923Z In(166) Hostd[2100206]: [Originator@6876 sub=Solo.Vmomi opID=e378e21e sid=52689d6a] Throw vim.fault.NotAuthenticated
    2026-03-23T10:11:16.923Z In(166) Hostd[2100206]: [Originator@6876 sub=Solo.Vmomi opID=e378e21e sid=52689d6a] Result:
    2026-03-23T10:11:16.923Z In(166) Hostd[2100171]: --> (vim.fault.NotAuthenticated) {
    2026-03-23T10:11:16.923Z In(166) Hostd[2100171]: -->    object = 'vmodl.query.PropertyCollector:ha-property-collector',
    2026-03-23T10:11:16.923Z In(166) Hostd[2100171]: -->    privilegeId = "System.View",
    2026-03-23T10:11:16.923Z In(166) Hostd[2100171]: -->    msg = "",
    2026-03-23T10:11:16.923Z In(166) Hostd[2100171]: --> }

  • The corresponding property collection tasks log the following errors in /var/log/vmkernel.log:

2026-03-23T10:11:16.919Z In(182) vmkernel: cpu115:2285966 opID=be3c44c0)World: 12868: VC opID 867d34eb maps to vmkernel opID be3c44c0
2026-03-23T10:11:16.920Z In(182) vmkernel: cpu115:2285966 opID=be3c44c0)Device: 435: Unable to allocate device ID memory on heap 0x430a7cc00000
2026-03-23T10:11:16.920Z In(182) vmkernel: cpu115:2285966 opID=be3c44c0)Device: 435: Unable to allocate device ID memory on heap 0x430a7cc00000

  • Scheduler queue allocation for NVMe FDs fails, and /var/log/vmkernel.log reports:

2026-03-23T10:10:12.648Z In(182) vmkernel: cpu85:2285967 opID=57ce6f15)World: 12868: VC opID e09e1465-3492 maps to vmkernel opID 57ce6f15
2026-03-23T10:10:12.648Z Wa(180) vmkwarning: cpu85:2285967 opID=57ce6f15)WARNING: StorageSched: 1739: StorSchedQAllocByType: could not allocate scheduler queue for world 0
2026-03-23T10:10:12.648Z In(182) vmkernel: cpu85:2285967 opID=57ce6f15)StorageSched: 2250: SchedQ creation failed for worldID 0 for t10.NVMe____Dell_BOSS2DN1____________________________#########
2026-03-23T10:10:12.664Z Wa(180) vmkwarning: cpu85:2285967 opID=57ce6f15)WARNING: NvmeFds: 1327: Memory allocation failed

  • The VSI node for claim rules reports "Out of memory" in /var/log/vmkernel.log:

2026-03-23T10:15:09.885Z In(182) vmkernel: cpu115:79005511)ScsiPathClaimVsi: 1534: Out of memory.
2026-03-23T10:15:45.026Z In(182) vmkernel: cpu76:79006379)ScsiPathClaimVsi: 1534: Out of memory.

  • There is no impact on the VM or ESXi I/O path.
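
The vmkernel.log symptoms above can be checked in one pass. The following is a minimal sketch, not an official tool: the function names `count_matches` and `scan_vmkernel_log` and the output labels are hypothetical, and the patterns are taken from the log excerpts in this article, with the source line numbers (for example 1327 and 1534) relaxed to any digits on the assumption that they may differ between builds.

```shell
#!/bin/sh
# Hypothetical helper: print how many lines in file $1 match pattern $2.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function usable under "set -e" while still printing "0".
count_matches() {
    grep -c -e "$2" "$1" || true
}

# Hypothetical helper: count the memory-exhaustion signatures described
# in this article in the vmkernel log passed as $1.
scan_vmkernel_log() {
    log_file="$1"
    printf 'device_id_alloc_failed=%s\n' \
        "$(count_matches "$log_file" 'Unable to allocate device ID memory on heap')"
    printf 'sched_queue_alloc_failed=%s\n' \
        "$(count_matches "$log_file" 'could not allocate scheduler queue')"
    printf 'nvme_fds_alloc_failed=%s\n' \
        "$(count_matches "$log_file" 'NvmeFds: [0-9]*: Memory allocation failed')"
    printf 'claim_vsi_out_of_memory=%s\n' \
        "$(count_matches "$log_file" 'ScsiPathClaimVsi: [0-9]*: Out of memory')"
}
```

After sourcing the functions, a typical call would be `scan_vmkernel_log /var/log/vmkernel.log`. Non-zero counts on several labels at once are consistent with the symptoms listed here, but they are not proof of this specific issue on their own.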

Environment

  • VMware vSphere ESX 9.x
  • VMware Cloud Foundation 9.0

 

Cause

  • This issue is caused by a memory leak in the get calls for the NVMe SMART statistics VSI node.
  • Over time, the leak causes the storageHeap to run out of memory.
  • Property collector calls on all VSI nodes then fail, which causes the health service to report erratic results.
  • /var/log/vsandevicemonitord.log shows repeated errors while retrieving SMART health status for the vSAN disks:

2026-03-22T12:40:36Z In(14) vsandevicemonitord[2287920]: [1087063224832]: Exception Out of memory getting SMART health status for vSAN disk t10.NVMe____Dell_DC_NVMe_PM9A3_RI_U.2_3.84TB________#########.
2026-03-22T12:40:36Z In(14) vsandevicemonitord[2287920]: [1087063224832]: Exception Out of memory getting SMART health status for vSAN disk t10.NVMe____Dell_DC_NVMe_PM9A3_RI_U.2_3.84TB________#########.

2026-03-19T06:49:27Z In(14) vsandevicemonitord[2287920]: [1087063224832]: Exception Out of memory retrieving SMART percentage used for vSAN disk t10.NVMe____Dell_DC_NVMe_PM9A3_RI_U.2_3.84TB________#########.
2026-03-19T06:49:27Z In(14) vsandevicemonitord[2287920]: [1087063224832]: Exception Out of memory getting SMART health status for vSAN disk t10.NVMe____Dell_DC_NVMe_PM9A3_RI_U.2_3.84TB________#########.

  • The storageHeap statistics show that the current allocation has reached the maximum heap size (only 96 bytes remain available) and that the number of failed allocations is high:

vsish -e get /system/heaps/storageHeap-0x430a7cc00000/stats
Heap stats {
   Name:storageHeap
   current heap size:430141864
   initial heap size:11534336
   current bytes allocated:430141768
   percent free of current size:0
   percent releasable of current size:0
   maximum heap size:430141864
   maximum bytes available:96
   
   # of failure messages:125282383
   number of succeeded allocations:482400875
   number of failed allocations:62610593
   number of freed allocations:481530200
   average size of an allocation:3221225472

  • Querying the NVMe SMART stats for the affected devices returns "Out of memory":

# vsish -e get /storage/scsifw/devices/t10.NVMe____Dell_DC_NVMe_#####_RI_U.2_3.84TB________###########/nvmesmartstats
VSISHCmdGetInt():Get failed: Out of memory
Error: Error in command cat: Out of memory
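
The storageHeap check described above can be reduced to a one-line verdict. This is a minimal sketch under stated assumptions: the function name `heap_summary` and the `EXHAUSTED`/`OK` labels are hypothetical, the field names are those shown in the heap stats output in this article, and "exhausted" is assumed to mean that the current heap size has reached the maximum and that the reported percent free is 0.

```shell
#!/bin/sh
# Hypothetical helper: read "vsish ... /stats" heap output on stdin and
# print a one-line summary. Field names are taken from the output shown
# in this article; values follow the first ":" on each line.
heap_summary() {
    awk -F: '
        /current heap size/            { cur = $2 + 0 }
        /maximum heap size/            { max = $2 + 0; seen = 1 }
        /percent free of current size/ { free = $2 + 0 }
        /number of failed allocations/ { failed = $2 + 0 }
        END {
            if (!seen) { print "no heap stats found"; exit 1 }
            # Assumption: the heap counts as exhausted when it has grown
            # to its maximum size and no free percentage is reported.
            state = (cur >= max && free == 0) ? "EXHAUSTED" : "OK"
            printf "%s current=%.0f max=%.0f percent_free=%.0f failed_allocations=%.0f\n", \
                state, cur, max, free, failed
        }'
}
```

A typical call would pipe the command from this article into the function, for example `vsish -e get /system/heaps/storageHeap-0x430a7cc00000/stats | heap_summary`. The heap address in that path comes from the example host and is likely to differ on other hosts.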

Resolution

  • This issue will be fixed in VMware vSphere ESX 9.1 and VMware Cloud Foundation 9.1.
  • As a workaround, reboot the affected ESXi hosts to temporarily restore functionality.
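
Because the reboot restores functionality only temporarily, it may help to note when the SMART out-of-memory errors first reappear on a host. The following is a minimal sketch, assuming the vsandevicemonitord.log line format shown in the Cause section; the function name `smart_oom_summary` and the output layout are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: summarize SMART "Out of memory" exceptions per
# vSAN disk from vsandevicemonitord log files (or stdin if no file is
# given). Prints the disk name, the earliest timestamp seen, and the
# number of matching lines.
smart_oom_summary() {
    awk '
        /Exception Out of memory/ && /for vSAN disk / {
            disk = $0
            sub(/.*for vSAN disk /, "", disk)   # keep only the device name
            sub(/\.$/, "", disk)                # drop the trailing period
            count[disk]++
            # ISO 8601 timestamps in field 1 sort correctly as strings.
            if (!(disk in first) || $1 < first[disk]) first[disk] = $1
        }
        END {
            for (disk in count)
                printf "%s first_seen=%s errors=%d\n", disk, first[disk], count[disk]
        }' "$@" | sort
}
```

A typical call would be `smart_oom_summary /var/log/vsandevicemonitord.log`. An empty result after a reboot suggests the heap has not yet been exhausted again; it does not indicate that the leak is fixed.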