2025-11-24T12:22:44.197Z In(14) vobd[2097954]: [vmfsCorrelator] 5027159999907us: [esx.problem.vmfs.heartbeat.
2025-11-24T12:23:31.128Z In(14) vobd[2097954]: [vmfsCorrelator] 5027206931363us: [esx.problem.vmfs.heartbeat.
VMware vSAN [All Versions]
The unreliable, transient condition of the device increased the time taken to declare PDL (Permanent Device Loss). As a result, the VMs reached a threshold after which they either became unresponsive or crashed.
Sequence of events:
2025-11-24T12:22:11+00:00 vmkernel: cpu72:2098044)NVMEIO:4776 cmd2Abort 0x45de9093b200, opcode 0x2, nsid 1, lba 1877732352, lbc 127
2025-11-24T12:22:17+00:00 vmkernel: cpu49:2098044)NVMEDEV:8260 Controller 257 state changed from 5 to 8(INRESET)
2025-11-24T12:22:17+00:00 vmkernel: cpu49:2098044)NVMEDEV:8245 Resetting controller 257 (nqn.1994-11.com.samsung:nvme:#####:2.5-inch:############)
2025-11-24T12:22:17+00:00 vmkernel: cpu49:2098044)NVMEIO:4623 Ctlr 257, abort commands stuck, escalate to controller reset
2025-11-24T12:22:32.237Z Wa(180) vmkwarning: cpu49:6775399)WARNING: NVMEIO:4011 Controller 256 in state 8 or in recovery mode, bail out.
2025-11-24T12:22:41.074Z In(14) vobd[2097955]: [vSANCorrelator] 4075554822606us: [esx.problem.vob.vsan.lsom.devicerepair] Device ########-####-####-####-############ is in offline state and is getting repaired.
2025-11-24T12:23:33.619Z No(00) Upcall-38797af - UNUSUAL: Successful write to '/vmfs/volumes/vsan:##############-###############/########-####-###-####-#########/vmware.log' took 41.631960 seconds.
2025-11-24T12:23:39.105Z In(05) vcpu-0 - Chipset: The guest has requested that the virtual machine be hard reset.
2025-11-24T12:24:24.586Z Wa(180) vmkwarning: cpu4:2097848)WARNING: StorageDevice: 11908: PDL set on device path vmhba7:C0:T0:L0
2025-11-24T12:24:30.757Z Wa(180) vmkwarning: cpu4:2097848)WARNING: StorageDevice: 11908: PDL set on device path vmhba6:C0:T0:L0
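To quantify how long the device stayed in the transient state before PDL was declared, the timestamps in the sequence above can be compared. The following is a minimal Python sketch, not a VMware tool: the function names `parse_ts` and `pdl_delay_seconds` are hypothetical, and the marker strings `cmd2Abort` and `PDL set on device path` are simply taken from the log excerpts in this article. It assumes each log line starts with an ISO-8601 timestamp in one of the two formats shown above.

```python
import re
from datetime import datetime

# Leading timestamp in either style seen in this article:
#   2025-11-24T12:22:11+00:00   or   2025-11-24T12:24:30.757Z
TS_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(\.\d+)?(Z|[+-]\d{2}:\d{2})"
)


def parse_ts(line):
    """Return a timezone-aware datetime for the line, or None if it has no timestamp."""
    m = TS_RE.match(line)
    if not m:
        return None
    base, frac, tz = m.groups()
    frac = (frac or ".0")[:7]                      # keep at most microseconds
    tz = "+0000" if tz == "Z" else tz.replace(":", "")
    return datetime.strptime(base + frac + tz, "%Y-%m-%dT%H:%M:%S.%f%z")


def pdl_delay_seconds(lines):
    """Seconds from the earliest NVMe abort to the earliest 'PDL set' message.

    Returns None if either marker is missing from the input.
    """
    first_abort = None
    first_pdl = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if "cmd2Abort" in line and (first_abort is None or ts < first_abort):
            first_abort = ts
        if "PDL set on device path" in line and (first_pdl is None or ts < first_pdl):
            first_pdl = ts
    if first_abort is None or first_pdl is None:
        return None
    return (first_pdl - first_abort).total_seconds()


# Three lines copied from the sequence of events above.
SAMPLE_LINES = [
    "2025-11-24T12:22:11+00:00 vmkernel: cpu72:2098044)NVMEIO:4776 cmd2Abort 0x45de9093b200, opcode 0x2, nsid 1, lba 1877732352, lbc 127",
    "2025-11-24T12:24:30.757Z Wa(180) vmkwarning: cpu4:2097848)WARNING: StorageDevice: 11908: PDL set on device path vmhba6:C0:T0:L0",
    "2025-11-24T12:24:24.586Z Wa(180) vmkwarning: cpu4:2097848)WARNING: StorageDevice: 11908: PDL set on device path vmhba7:C0:T0:L0",
]

if __name__ == "__main__":
    print(pdl_delay_seconds(SAMPLE_LINES))  # 133.586
```

For the excerpts in this article, the gap between the first command abort and the first PDL message is a little over two minutes, which is consistent with the 41-second vmware.log write and the guest-requested hard reset seen in between.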
Engage the hardware vendor to investigate the transient issue with the NVMe drives for potential hardware or firmware-related causes, as such errors often originate from an underlying hardware issue.