One or more virtual machines became unresponsive or were automatically powered off by vCenter. The following behaviors were observed:
The lock protecting virtualdisk.vmdk has been lost
The iSCSI adapter vmhba64 repeatedly toggled between ONLINE and OFFLINE states.
The /var/run/log/vmkwarning.log file recorded iSCSI connection events, such as:
2025-07-19T00:01:26.262Z Wa(180) vmkwarning: cpu0:2097901)WARNING: iscsi_vmk: iscsivmk_StopConnection:736: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "OFFLINE" (Event:6)
2025-07-19T00:01:29.554Z Wa(180) vmkwarning: cpu14:2098541)WARNING: iscsi_vmk: iscsivmk_StartConnection:918: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "ONLINE"
2025-07-19T00:01:40.178Z Wa(180) vmkwarning: cpu14:2098541)WARNING: iscsi_vmk: iscsivmk_StopConnection:736: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "OFFLINE" (Event:4)
2025-07-19T00:01:43.223Z Wa(180) vmkwarning: cpu14:2098541)WARNING: iscsi_vmk: iscsivmk_StartConnection:918: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "ONLINE"
The /var/run/log/vmkernel.log file recorded “state in doubt” messages for LUNs:
2025-07-19T00:01:29.406Z Wa(180) vmkwarning: cpu8:2097923)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:235: NMP device "naa.600507################000000000" state in doubt; requested fast path state update...
2025-07-19T00:01:40.183Z Wa(180) vmkwarning: cpu0:2097248)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:235: NMP device "naa.600507################000000000" state in doubt; requested fast path state update...
2025-07-19T00:01:40.185Z Wa(180) vmkwarning: cpu14:2097929)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:235: NMP device "naa.60050################000000002" state in doubt; requested fast path state update...
2025-07-19T00:01:40.398Z Wa(180) vmkwarning: cpu49:2097964)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:235: NMP device "naa.60050################000000002" state in doubt; requested fast path state update...
Multiple SCSI command failures were logged with host statuses H:0x2, H:0x5, and H:0x8, along with sense code 0x6/0x29/0x00:
2025-07-19T00:00:06.394Z In(182) vmkernel: cpu30:2097301)ScsiDeviceIO: 13394: Task mgmt request issued to device naa.600507################00000000 is stuck (WorldID 2097224, Cmd 0x89, CmdSN c5dcc4). Issuing yellow notification to the application
2025-07-19T00:00:12.833Z In(182) vmkernel: cpu6:16065429)NMP: nmp_ThrottleLogForDevice:3898: H:0x5 D:0x0 P:0x0 . Act:NONE. cmdId.initiator=0x4538f3c9bb58 CmdSN 0x0
2025-07-19T00:00:12.833Z In(182) vmkernel: cpu2:2097262)ScsiDeviceIO: 4633: Cmd(0x45d94f039140) 0x88, CmdSN 0xffffdf8a41eb9d80 from world 2102628 to dev "naa.600507################00000000" failed H:0x2 D:0x0 P:0x0
2025-07-19T00:00:12.833Z In(182) vmkernel: cpu5:2097242)ScsiDeviceIO: 4605: Cmd(0x45d94f09a940) 0x88, cmdId.initiator=0x430ba92b0900 CmdSN 0xffffdf8a2ceb77c0 from world 2102628 to dev "naa.600507################00000000" failed H:0x8 D:0x0 P:0x0 Cancelled from driver layer
2025-07-19T00:00:12.834Z In(182) vmkernel: cpu32:2097947)NMP: nmp_ThrottleLogForDevice:3893: Cmd 0xa3 (0x45b951f296c0, 0) to dev "naa.600507################000000000" on path "vmhba64:C1:T1:L0" Failed:
2025-07-19T00:00:12.834Z In(182) vmkernel: cpu32:2097947)NMP: nmp_ThrottleLogForDevice:3898: H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0. Act:NONE. cmdId.initiator=0x4538f3c9bbc8 CmdSN 0x0
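The H:/D:/P: fields in the entries above are the host, device, and plugin status of each failed command. The following is a minimal Python sketch for decoding them when triaging a log bundle. It is not a VMware tool: the function name `decode` and the lookup tables are illustrative assumptions, and the tables cover only the codes that appear in these excerpts, using the commonly documented labels (0x2 BUS_BUSY, 0x5 ABORT, 0x8 RESET; sense 0x6/0x29/0x00 is a Unit Attention indicating a power-on or reset). Verify the labels against current Broadcom documentation before relying on them.

```python
import re

# Partial lookup tables: only the codes seen in the log excerpts above.
# Labels follow the commonly documented names; treat them as assumptions.
HOST_STATUS = {0x0: "OK", 0x2: "BUS_BUSY", 0x5: "ABORT", 0x8: "RESET"}
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION"}
SENSE = {
    (0x6, 0x29, 0x0): "UNIT ATTENTION: power on, reset, or bus device reset occurred",
}

STATUS_RE = re.compile(r"H:(0x[0-9a-f]+) D:(0x[0-9a-f]+) P:(0x[0-9a-f]+)")
SENSE_RE = re.compile(
    r"Valid sense data: (0x[0-9a-f]+) (0x[0-9a-f]+) (0x[0-9a-f]+)"
)


def decode(line):
    """Decode the H:/D:/P: status (and sense data, if present) in a log line.

    Returns None when the line carries no status triplet.
    """
    status = STATUS_RE.search(line)
    if status is None:
        return None
    host, device, plugin = (int(value, 16) for value in status.groups())
    result = {
        "host": HOST_STATUS.get(host, hex(host)),
        "device": DEVICE_STATUS.get(device, hex(device)),
        "plugin": hex(plugin),
    }
    sense = SENSE_RE.search(line)
    if sense is not None:
        key = tuple(int(value, 16) for value in sense.groups())
        result["sense"] = SENSE.get(key, "not in table")
    return result


# Fragments copied from the excerpts above.
samples = [
    "H:0x5 D:0x0 P:0x0 . Act:NONE.",
    "failed H:0x2 D:0x0 P:0x0",
    "failed H:0x8 D:0x0 P:0x0 Cancelled from driver layer",
    "H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0. Act:NONE.",
]
for sample in samples:
    print(decode(sample))
```

Read this way, the host-side aborts, resets, and busy conditions followed by a Unit Attention from the device are consistent with the connection resets described in this article.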
VMware ESXi 7.x
VMware ESXi 8.x
The issue occurred due to a transient loss of connectivity between the ESXi host and its iSCSI datastore, caused by path flapping. This led to:
I/O command timeouts, aborts, or resets
Loss of VMDK file lock ownership by the ESXi host
Guest VMs becoming unresponsive or being powered off by vCenter to prevent potential data corruption
To resolve and prevent recurrence of this issue, follow these steps:
Review Storage Array Connectivity
Engage your storage vendor to investigate the cause of repeated iSCSI path flapping.
Review array-side logs for signs of port errors, resets, or failovers.
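When engaging the storage vendor, a short summary of how often each iSCSI connection flapped on the host side can help correlate with array-side events. The sketch below is one possible way to produce that summary from vmkwarning.log lines. The helper name `summarize_flaps` is hypothetical, and the regular expression assumes the iscsi_vmk message format shown in this article; other ESXi builds may log a different format.

```python
import re
from collections import Counter

# Matches the iscsi_vmk connection-state warnings in the format shown
# in this article (an assumption; the format may differ across builds).
STATE_RE = re.compile(
    r"iscsi_vmk: iscsivmk_(?:Stop|Start)Connection:\d+: "
    r"(?P<conn>vmhba\d+:CH:\d+ T:\d+ CN:\d+): "
    r'iSCSI connection is being marked "(?P<state>ONLINE|OFFLINE)"'
)


def summarize_flaps(lines):
    """Count ONLINE/OFFLINE transitions per iSCSI connection.

    Returns a dict mapping the connection identifier to a Counter of states.
    """
    summary = {}
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            counts = summary.setdefault(match.group("conn"), Counter())
            counts[match.group("state")] += 1
    return summary


# Sample lines copied from the vmkwarning.log excerpt in this article.
sample = [
    '2025-07-19T00:01:26.262Z Wa(180) vmkwarning: cpu0:2097901)WARNING: iscsi_vmk: iscsivmk_StopConnection:736: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "OFFLINE" (Event:6)',
    '2025-07-19T00:01:29.554Z Wa(180) vmkwarning: cpu14:2098541)WARNING: iscsi_vmk: iscsivmk_StartConnection:918: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "ONLINE"',
    '2025-07-19T00:01:40.178Z Wa(180) vmkwarning: cpu14:2098541)WARNING: iscsi_vmk: iscsivmk_StopConnection:736: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "OFFLINE" (Event:4)',
    '2025-07-19T00:01:43.223Z Wa(180) vmkwarning: cpu14:2098541)WARNING: iscsi_vmk: iscsivmk_StartConnection:918: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "ONLINE"',
]

flaps = summarize_flaps(sample)
for conn, counts in flaps.items():
    print(conn, dict(counts))
```

To run it against a real log, pass an open file object for a copy of /var/run/log/vmkwarning.log in place of `sample`.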
Validate VMkernel Port Configuration
Ensure the number of VMkernel ports used for iSCSI does not exceed the number of physical NICs.
Avoid overprovisioning software iSCSI paths without sufficient physical NICs.
Check MTU Consistency
Ensure MTU (e.g., 1500 or 9000) is uniformly configured across all iSCSI network components (VMkernel ports, physical switches, storage ports).
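One common way to confirm that an MTU is honored end to end is a do-not-fragment ping from the iSCSI VMkernel port, sized to the largest payload the MTU allows. The Python sketch below works out that payload size and flags mismatched components. It assumes IPv4 (20-byte IP header plus 8-byte ICMP header); the interface name `vmk1`, the address 192.0.2.10, and the component names are placeholders, and the helper functions are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # assumes IPv4 with no IP options
ICMP_HEADER_BYTES = 8


def max_ping_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def vmkping_command(vmk, target_ip, mtu):
    """Build a vmkping command line with the don't-fragment flag (-d)."""
    return f"vmkping -I {vmk} -d -s {max_ping_payload(mtu)} {target_ip}"


def mtu_mismatches(configured, expected):
    """Return the components whose configured MTU differs from the expected one."""
    return {name: mtu for name, mtu in configured.items() if mtu != expected}


print(max_ping_payload(9000))  # 8972
print(max_ping_payload(1500))  # 1472
print(vmkping_command("vmk1", "192.0.2.10", 9000))

# Placeholder inventory of MTU values gathered from each component.
inventory = {"vmk1": 9000, "switch-port": 1500, "array-port": 9000}
print(mtu_mismatches(inventory, expected=9000))  # {'switch-port': 1500}
```

If the generated vmkping command fails at the full payload size but succeeds at a smaller one, some component on the path is likely configured with a smaller MTU.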
Follow Best Practices
Refer to the Broadcom/VMware networking best practices for software iSCSI in the Broadcom KB article "Networking Best Practices for iSCSI".
Collect a TCP dump during the failure window.
Share the capture with your storage vendor for packet-level diagnosis.