Virtual machine becomes unresponsive with error: The lock protecting virtualdisk.vmdk has been lost due to iSCSI connectivity issues

Article ID: 404898

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms

One or more virtual machines became unresponsive or were automatically powered off by vCenter. The following behaviors were observed:

  • Error message observed:

The lock protecting virtualdisk.vmdk has been lost

  • Affected VMs were automatically powered off by vCenter.
  • iSCSI path flapping was observed on vmhba64, repeatedly toggling between ONLINE and OFFLINE states.
The /var/run/log/vmkwarning.log file recorded iSCSI connection events, such as:

2025-07-19T00:01:26.262Z Wa(180) vmkwarning: cpu0:2097901)WARNING: iscsi_vmk: iscsivmk_StopConnection:736: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "OFFLINE" (Event:6)
2025-07-19T00:01:29.554Z Wa(180) vmkwarning: cpu14:2098541)WARNING: iscsi_vmk: iscsivmk_StartConnection:918: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "ONLINE"
2025-07-19T00:01:40.178Z Wa(180) vmkwarning: cpu14:2098541)WARNING: iscsi_vmk: iscsivmk_StopConnection:736: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "OFFLINE" (Event:4)
2025-07-19T00:01:43.223Z Wa(180) vmkwarning: cpu14:2098541)WARNING: iscsi_vmk: iscsivmk_StartConnection:918: vmhba64:CH:1 T:1 CN:0: iSCSI connection is being marked "ONLINE"
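The connection events above can be summarized programmatically when the log is large. The following is a minimal Python sketch, assuming the log lines keep the format shown in this article. The function names and the flapping heuristic (two or more OFFLINE events within 60 seconds) are hypothetical and are not a VMware-defined threshold.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the iscsi_vmk connection-state lines in the format shown above.
STATE_RE = re.compile(
    r'^(?P<ts>\S+)\s.*?iscsi_vmk: iscsivmk_(?:Stop|Start)Connection:\d+: '
    r'(?P<conn>vmhba\d+:CH:\d+ T:\d+ CN:\d+): '
    r'iSCSI connection is being marked "(?P<state>ONLINE|OFFLINE)"'
)


def connection_transitions(lines):
    """Return {connection: [(timestamp, state), ...]} in log order."""
    events = defaultdict(list)
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S.%fZ")
            events[match.group("conn")].append((ts, match.group("state")))
    return dict(events)


def is_flapping(transitions, min_offline=2, window_seconds=60):
    """Hypothetical heuristic: at least min_offline OFFLINE events
    occurring within window_seconds of each other."""
    offline = [ts for ts, state in transitions if state == "OFFLINE"]
    for i in range(len(offline) - min_offline + 1):
        span = offline[i + min_offline - 1] - offline[i]
        if span.total_seconds() <= window_seconds:
            return True
    return False
```

To use it, pass the lines of a copy of vmkwarning.log to connection_transitions() and call is_flapping() on the list returned for each connection.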

  • State-in-doubt messages were recorded for LUNs.
The /var/run/log/vmkernel.log file recorded “state in doubt” messages for the affected LUNs, such as:

2025-07-19T00:01:29.406Z Wa(180) vmkwarning: cpu8:2097923)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:235: NMP device "naa.600507################000000000" state in doubt; requested fast path state update...
2025-07-19T00:01:40.183Z Wa(180) vmkwarning: cpu0:2097248)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:235: NMP device "naa.600507################000000000" state in doubt; requested fast path state update...
2025-07-19T00:01:40.185Z Wa(180) vmkwarning: cpu14:2097929)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:235: NMP device "naa.60050################000000002" state in doubt; requested fast path state update...
2025-07-19T00:01:40.398Z Wa(180) vmkwarning: cpu49:2097964)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:235: NMP device "naa.60050################000000002" state in doubt; requested fast path state update...
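To see which devices are affected most often, the "state in doubt" warnings can be counted per device. This is a small Python sketch that assumes the message format shown above. The function name state_in_doubt_counts is hypothetical.

```python
import re
from collections import Counter

# Captures the NAA identifier from NMP "state in doubt" warnings.
DOUBT_RE = re.compile(r'NMP device "(?P<dev>naa\.[^"]+)" state in doubt')


def state_in_doubt_counts(lines):
    """Count "state in doubt" warnings per NMP device identifier."""
    counts = Counter()
    for line in lines:
        match = DOUBT_RE.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts
```

Calling state_in_doubt_counts(lines).most_common() lists the devices ordered by number of warnings.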

  • Multiple SCSI command failures were logged with host statuses:

    • [0x5] ABORT: Command timeout or parity errors
    • [0x2] BUS_BUSY: HBA unable to issue commands due to dropped frames
    • [0x8] RESET: I/O aborted or HBA reset occurred
  • Sense code 0x6/0x29/0x00:

    • Unit Attention – Power On, Reset, or Bus Device Reset Occurred

2025-07-19T00:00:06.394Z In(182) vmkernel: cpu30:2097301)ScsiDeviceIO: 13394: Task mgmt request issued to device naa.600507################00000000 is stuck (WorldID 2097224, Cmd 0x89, CmdSN c5dcc4). Issuing yellow notification to the application

2025-07-19T00:00:12.833Z In(182) vmkernel: cpu6:16065429)NMP: nmp_ThrottleLogForDevice:3898: H:0x5 D:0x0 P:0x0 . Act:NONE. cmdId.initiator=0x4538f3c9bb58 CmdSN 0x0

2025-07-19T00:00:12.833Z In(182) vmkernel: cpu2:2097262)ScsiDeviceIO: 4633: Cmd(0x45d94f039140) 0x88, CmdSN 0xffffdf8a41eb9d80 from world 2102628 to dev "naa.600507################00000000" failed H:0x2 D:0x0 P:0x0

2025-07-19T00:00:12.833Z In(182) vmkernel: cpu5:2097242)ScsiDeviceIO: 4605: Cmd(0x45d94f09a940) 0x88, cmdId.initiator=0x430ba92b0900 CmdSN 0xffffdf8a2ceb77c0 from world 2102628 to dev "naa.600507################00000000" failed H:0x8 D:0x0 P:0x0 Cancelled from driver layer

2025-07-19T00:00:12.834Z In(182) vmkernel: cpu32:2097947)NMP: nmp_ThrottleLogForDevice:3893: Cmd 0xa3 (0x45b951f296c0, 0) to dev "naa.600507################000000000" on path "vmhba64:C1:T1:L0" Failed:
2025-07-19T00:00:12.834Z In(182) vmkernel: cpu32:2097947)NMP: nmp_ThrottleLogForDevice:3898: H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x29 0x0. Act:NONE. cmdId.initiator=0x4538f3c9bbc8 CmdSN 0x0
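The H:/D:/P: fields and the sense data in the entries above can be decoded with a small lookup table. The Python sketch below is illustrative only: it covers just the host statuses and the sense code listed in this article, plus H:0x0 (no host-side error), which is added here as an assumption for completeness. The names decode_status, HOST_STATUS, and SENSE are hypothetical.

```python
import re

# Host status meanings as listed in this article; 0x0 (OK) is added here.
HOST_STATUS = {0x0: "OK", 0x2: "BUS_BUSY", 0x5: "ABORT", 0x8: "RESET"}

# (sense key, ASC, ASCQ) -> meaning, limited to the code in this article.
SENSE = {
    (0x6, 0x29, 0x0):
        "UNIT ATTENTION: power on, reset, or bus device reset occurred",
}

HEX = r"0x[0-9a-fA-F]+"
STATUS_RE = re.compile(
    rf"H:({HEX}) D:({HEX}) P:({HEX})"
    rf"(?:\s+Valid sense data: ({HEX}) ({HEX}) ({HEX}))?"
)


def decode_status(line):
    """Decode host/device/plugin status and optional sense data from a
    vmkernel log line. Returns None if the line has no H:/D:/P: fields."""
    match = STATUS_RE.search(line)
    if not match:
        return None
    host, device, plugin = (int(v, 16) for v in match.group(1, 2, 3))
    result = {
        "host": HOST_STATUS.get(host, hex(host)),  # unknown codes stay as hex
        "device": hex(device),
        "plugin": hex(plugin),
    }
    if match.group(4):
        sense = tuple(int(v, 16) for v in match.group(4, 5, 6))
        result["sense"] = SENSE.get(sense, "not listed in this sketch")
    return result
```

Statuses that are not in the tables are returned as raw hexadecimal values so they can be looked up separately.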

 

Environment

VMware ESXi 7.x

VMware ESXi 8.x

Cause

The issue occurred because iSCSI path flapping caused a transient loss of connectivity between the ESXi host and its iSCSI datastore. This led to:

  • I/O command timeouts, aborts, or resets

  • Loss of VMDK file lock ownership by the ESXi host

  • Guest VMs becoming unresponsive or powered off by vCenter to prevent potential data corruption

Resolution

To resolve and prevent recurrence of this issue, follow these steps:

  • Review Storage Array Connectivity

    • Engage your storage vendor to investigate the cause of repeated iSCSI path flapping.

    • Review array-side logs for signs of port errors, resets, or failovers.

  • Validate VMkernel Port Configuration

    • Ensure the number of VMkernel ports used for iSCSI does not exceed the number of physical NICs.

    • Avoid overprovisioning software iSCSI paths without sufficient physical NICs.

  • Check MTU Consistency

    • Ensure the same MTU (for example, 1500 or 9000) is configured end to end across all iSCSI network components (VMkernel ports, physical switches, storage ports).

  • Follow Best Practices

  • Perform Packet Capture (If Issue Persists)
    • Collect a packet capture (for example, with pktcap-uw or tcpdump-uw on the ESXi host) during the failure window.

    • Share the capture with your storage vendor for packet-level diagnosis.
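For the MTU consistency check above, a common way to verify the end-to-end MTU is a vmkping with the don't-fragment flag and a payload sized to the MTU. The Python sketch below only computes that payload size and builds the command string. It assumes IPv4 with a 20-byte IP header and an 8-byte ICMP header. The interface name vmk1 and the address 192.0.2.10 in the usage note are placeholders, not values from this article.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def vmkping_command(vmk_interface, target_ip, mtu):
    """Build a vmkping command that tests the path MTU:
    -I selects the VMkernel interface, -d sets don't-fragment,
    -s sets the ICMP payload size."""
    size = icmp_payload_for_mtu(mtu)
    return f"vmkping -I {vmk_interface} -d -s {size} {target_ip}"
```

For example, vmkping_command("vmk1", "192.0.2.10", 9000) produces a command with a payload size of 8972 bytes. If that ping fails while a standard-size ping succeeds, a component in the path is likely configured with a smaller MTU.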

Additional Information