VMs are down or sluggish, and operations such as vMotion and snapshot consolidation get stuck

Article ID: 393490

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Symptoms:

  • VM-related tasks such as power-on and consolidation get stuck. You may observe a consolidation task failure under VM > Monitor > Tasks.
  • VMs are extremely slow, or appear down with a blank screen on the console.
  • I/O aborts with host status H:0x5 and "state in doubt" messages are observed in /var/run/log/vmkernel.log

2025-03-26T21:47:48.484Z cpu14:2098229)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237: NMP device "naa.####################" state in doubt; requested fast path state update...
2025-03-26T21:47:48.484Z cpu14:2098229)ScsiDeviceIO: 4154: Cmd(0x45b8d873abc8) 0x89, cmdId.initiator=0x430676466740 CmdSN 0x210dc from world 2097225 to dev "naa.###################" failed H:0x5 D:0x0 P:0x0 Cancelled from driver layer.
2025-03-26T21:47:48.484Z cpu14:2098229)Cmd count Active:3 Queued:4

  • Virtual resets are observed in /var/run/log/vmkernel.log

2025-03-26T21:47:48.484Z cpu1:2097604)lsi_msgpt3: _scsih_task_mgmt:962: lsi_msgpt3_0:C0:T1:L1 handle(0x0009): TM(abort) request end: status=Success
2025-03-26T21:47:48.484Z cpu3:2098261)lsi_msgpt3: _scsih_virtual_reset:773: lsi_msgpt3_0: Virtual reset request start on the path C0:T1:L1, handle(0x0009). initiator(0x430ad7a15800):worldId(0x201cf9)

  • You may see driver aborts with host status H:0x8, along with SCSI aborts, in /var/run/log/vmkernel.log

2025-03-26T21:47:48.479Z cpu1:2097604)lsi_msgpt3: _scsih_abort:735: lsi_msgpt3_0:C0:T1:L1, handle(0x0009), smid(40), io_in_time_ms(12866783), abort_in_time_ms(12867000), delta(217 ms)
2025-03-26T21:47:48.479Z cpu1:2097604)lsi_msgpt3: _scsih_abort:738: lsi_msgpt3_0:C0:T1:L1: opcode(0x89), cmdInitiator(0x430676466740):cmdSN(0x210dc):worldId(0x200049)
2025-03-26T21:47:48.479Z cpu1:2097604)lsi_msgpt3: msgpt_scsih_issue_tm:5168: lsi_msgpt3_0: sending tm: handle(0x0009), C0:T1:L1 task_type(0x01), smid(40)
2025-03-26T21:47:48.479Z cpu3:2098261)ScsiDeviceIO: 4087: Cmd(0x45b8db223f48) 0x28, cmdId.initiator=0x430ad7a15800 CmdSN 0x80000008 from world 2104569 to dev "naa.###################" failed H:0x8 D:0x0 P:0x0 Cancelled from NMP layer

  • Dropped frames with host status H:0x2 are reported in /var/run/log/vmkernel.log

2025-03-26T22:03:10.525Z cpu57:2098273)NMP: nmp_ThrottleLogForDevice:3875: H:0x2 D:0x0 P:0x0 . Act:EVAL. cmdId.initiator=0x430ad7a12340 CmdSN 0x8000005b
2025-03-26T22:03:10.525Z cpu57:2098273)ScsiDeviceIO: 4115: Cmd(0x45d8d1c5d3c8) 0x8a, CmdSN 0x8000005b from world 2103003 to dev "naa.###################" failed H:0x2 D:0x0 P:0x0

  • "Lost access to volume" messages are seen in /var/run/log/hostd.log

2025-03-26T18:22:48.480Z info hostd[2100677] [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 119 : Lost access to volume ########### #### #### ########## **-DS01 due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly.

  • You may see "Power-on Reset" events together with VMFS heartbeat timeouts in /var/run/log/vobd.log

2025-03-26T19:13:51.263Z: [scsiCorrelator] 3629783840us: [vob.scsi.scsipath.por] Power-on Reset occurred on naa.#########################
2025-03-26T19:13:51.479Z: [vmfsCorrelator] 3630000259us: [vob.vmfs.heartbeat.timedout] ########### #### #### ########## **-DS01
2025-03-26T19:13:51.479Z: [vmfsCorrelator] 3630000497us: [esx.problem.vmfs.heartbeat.timedout]  ########### #### #### ########## **-DS01

  • "Long VMFS rsv time" messages for the datastore, together with virtual resets, are reported in /var/run/log/vmkernel.log

2025-07-14T21:18:36.140Z cpu38:2098119)FS3Misc: 1755: Long VMFS rsv time on 'DS#' (held for 5557 msecs). # R: 1, # W: 1 bytesXfer: 16 sectors
2025-07-14T21:18:42.586Z cpu38:2098119)lsi_mr3: mfi_TaskMgmt:690: Processing taskMgmt virt reset for device: vmhba1:C2:T0:L0
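
To gauge how often the symptoms above occur on a host, the logs can be scanned for the signature strings shown in the excerpts. The following is a minimal Python sketch, not a VMware tool. The function name count_symptoms and the label names are hypothetical, and the sketch assumes the log files have been copied off the host as plain text.

```python
import re
from collections import Counter

# Signature strings are copied from the log excerpts in this article.
# The labels on the left are hypothetical names chosen for this sketch.
SIGNATURES = {
    "state_in_doubt": re.compile(r"state in doubt"),
    "virtual_reset": re.compile(r"Virtual reset request start|taskMgmt virt reset"),
    "lost_access": re.compile(r"Lost access to volume"),
    "power_on_reset": re.compile(r"Power-on Reset occurred"),
    # vobd.log records each timeout twice (vob.* and esx.problem.*);
    # matching only the vob.* entry avoids counting it twice.
    "heartbeat_timeout": re.compile(r"vob\.vmfs\.heartbeat\.timedout"),
    "long_vmfs_rsv": re.compile(r"Long VMFS rsv time"),
}


def count_symptoms(lines):
    """Return a Counter of how many lines match each symptom signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

For example, passing the lines of copied vmkernel.log, hostd.log and vobd.log files to count_symptoms gives a per-symptom tally that can be included when engaging the storage vendor.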

Environment

VMware vSphere ESXi 7.0.x
VMware vSphere ESXi 8.0.x

Cause

This issue is caused by a faulty drive on the storage array.

Disk failures can cause I/O congestion, which in turn can cause operations such as migration or consolidation to fail or take longer to complete.

Resolution

Engage the storage vendor to further investigate the issue.

Additional Information

Interpreting SCSI sense codes in VMware ESXi

"state in doubt; requested fast path state update" error in vmkernel.log

Lost access to volume due to connectivity issues OR Path redundancy to storage device degraded

Frequent Power On Reset Unit Attentions occur on path

  • SCSI host status H:0x5 ("Abort"): the driver had to abort commands in flight to the target. This can occur due to a command timeout or a parity error in the frame.
  • SCSI host status H:0x8 ("Reset"): the HBA driver has aborted the I/O. This can also occur if the HBA resets the target.
  • SCSI host status H:0x2: returned when the HBA driver is unable to issue a command to the device.
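
The host status values above can be pulled out of vmkernel.log lines automatically. The following is a minimal Python sketch that assumes the "H:0x.. D:0x.. P:0x.." layout seen in the excerpts in this article, where H is the host status, D the device status and P the plugin status. The function name parse_status and the dictionary keys are hypothetical, and only the three host status values described above are decoded.

```python
import re

# Meanings paraphrased from the list above; any other host status value
# is reported as not covered rather than guessed.
HOST_STATUS = {
    0x2: "HBA driver unable to issue the command to the device",
    0x5: "Abort: driver aborted commands in flight to the target",
    0x8: "Reset: HBA driver aborted the I/O or reset the target",
}

STATUS_RE = re.compile(r"H:0x([0-9a-fA-F]+) D:0x([0-9a-fA-F]+) P:0x([0-9a-fA-F]+)")
DEVICE_RE = re.compile(r'dev "([^"]+)"')


def parse_status(line):
    """Extract host/device/plugin status from one log line, or return None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    host, device, plugin = (int(group, 16) for group in match.groups())
    dev = DEVICE_RE.search(line)
    return {
        "host": host,
        "device": device,
        "plugin": plugin,
        # Some lines (for example the NMP throttle log) carry no device name.
        "dev": dev.group(1) if dev else None,
        "meaning": HOST_STATUS.get(host, "not covered in this article"),
    }
```

Grouping the parsed results by device name and host status shows which devices are affected and how frequently, which is useful detail to pass to the storage vendor.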