The guest VM reports high I/O wait time in guest performance monitoring tools (e.g. top).
I/O latency is reported on the SAN LUNs, but it remains below 10 milliseconds. Events similar to the following are observed in /var/run/log/vmkwarning.log:
2025-05-12T20:52:42.242Z Wa(180) vmkwarning: cpu111:2098313)WARNING: ScsiDeviceIO: 1772: Device naa.################################ performance has deteriorated. I/O latency increased from average value of 331 microseconds to 7318 microseconds.
/var/run/log/vmkernel.log contains failing SCSI commands with sense data: 0x2 0x4 0x3:
2025-05-13T19:24:45.797Z In(182) vmkernel: cpu58:2098312)ScsiDeviceIO: 4633: Cmd(0x45b9da7555c0) 0x1a, CmdSN 0x1f6667d from world 0 to dev "naa.################################" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x2 0x4 0x3
Messages similar to the following, reporting degraded storage path redundancy, are seen in /var/run/log/vobd.log:
2025-09-03T23:29:36.954Z: [scsiCorrelator] 5662461916856us: [esx.problem.storage.redundancy.degraded] Path redundancy to storage device naa.############################ degraded. Path vmhba3:C0:T1:L36 is down. Affected datastores: Unknown.
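The latency figures in the vmkwarning.log entries are given in microseconds, so it can help to convert them to milliseconds before comparing them with the 10 millisecond figure mentioned above. The following is a minimal Python sketch that assumes the warning format matches the sample line in this article; the function name `parse_latency_warning` and the regular expression are illustrative, not part of any ESXi tooling.

```python
import re

# Assumption: the message layout matches the vmkwarning.log sample in this article.
LATENCY_RE = re.compile(
    r"Device (?P<device>\S+) performance has deteriorated\. "
    r"I/O latency increased from average value of (?P<avg_us>\d+) microseconds "
    r"to (?P<new_us>\d+) microseconds"
)


def parse_latency_warning(line):
    """Return (device, average_ms, new_ms) for a matching line, else None."""
    match = LATENCY_RE.search(line)
    if match is None:
        return None
    # Convert microseconds to milliseconds.
    average_ms = int(match.group("avg_us")) / 1000
    new_ms = int(match.group("new_us")) / 1000
    return match.group("device"), average_ms, new_ms


if __name__ == "__main__":
    sample = (
        "2025-05-12T20:52:42.242Z Wa(180) vmkwarning: cpu111:2098313)WARNING: "
        "ScsiDeviceIO: 1772: Device naa.################################ "
        "performance has deteriorated. I/O latency increased from average "
        "value of 331 microseconds to 7318 microseconds."
    )
    device, average_ms, new_ms = parse_latency_warning(sample)
    print(f"{device}: {average_ms:.3f} ms -> {new_ms:.3f} ms")
```

For the sample entry, the new latency works out to 7.318 ms, which is still under 10 milliseconds even though it is many times the previous average.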
Environment
vSphere ESXi (all versions)
Cause
This is a SAN-side issue.
These errors can degrade overall storage responsiveness, because commands to the affected LUNs fail and have to be requeued, and they will continue to fail repeatedly until the issue is resolved. The severity of the impact increases as the number of affected LUNs increases.
NOTE: The LUNs reporting the 0x2 0x4 0x3 sense data may not be the ones where the problem VM resides.
Resolution
Please work with your SAN team and/or SAN vendor to identify the issues with the LUNs that are reporting the above sense data.
If needed, please also open a case with Broadcom for further assistance.
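When handing the problem over to the SAN team, a list of the devices that report this sense data is usually the first thing they ask for. The sketch below is one possible way to tally them in Python. It assumes the vmkernel.log line format shown in the sample in this article; `count_not_ready_devices` and `report` are hypothetical helper names.

```python
import re
from collections import Counter

# Assumption: failed-command lines look like the vmkernel.log sample in this article.
# The trailing \b prevents a match on longer values such as 0x3f.
SENSE_RE = re.compile(
    r'to dev "(?P<device>[^"]+)" failed .*Valid sense data: 0x2 0x4 0x3\b'
)


def count_not_ready_devices(lines):
    """Count failed commands with sense data 0x2 0x4 0x3, grouped by device."""
    counts = Counter()
    for line in lines:
        match = SENSE_RE.search(line)
        if match:
            counts[match.group("device")] += 1
    return counts


def report(path="/var/run/log/vmkernel.log"):
    """Print the affected devices, most frequently failing first."""
    with open(path, errors="replace") as log:
        for device, failures in count_not_ready_devices(log).most_common():
            print(f"{failures:6d}  {device}")
```

Calling `report()` against a copy of vmkernel.log prints one line per affected device. Because the affected LUNs may not be the ones backing the problem VM, the full list is worth passing on rather than only the datastore the VM lives on.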
Additional Information
The "D:0x2" in the vmkernel.log entry above is the device status. It indicates that the device (target) on the SAN has reported a "check condition", and that action should be taken on the SAN side to troubleshoot the problem further.
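The sense data 0x2 0x4 0x3 indicates a "LUN not ready" error: sense key 0x2 is NOT READY, and the additional sense code and qualifier 0x4/0x3 correspond to "logical unit not ready, manual intervention required".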
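The decoding described above can be sketched in Python as follows. This is an illustration only: the lookup tables hold just a handful of standard SCSI values (they are not exhaustive), the regular expression assumes the vmkernel.log format shown in this article, and `decode_failure` is a hypothetical helper name.

```python
import re

# A few standard SCSI values for illustration; these tables are not exhaustive.
DEVICE_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",
    0x28: "TASK SET FULL",
}
SENSE_KEY = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}
ASC_ASCQ = {
    (0x04, 0x03): "LOGICAL UNIT NOT READY, MANUAL INTERVENTION REQUIRED",
}

HEX = r"0x[0-9a-fA-F]+"
# Assumption: the line layout matches the vmkernel.log sample in this article.
FAIL_RE = re.compile(
    rf'to dev "(?P<device>[^"]+)" failed '
    rf"H:(?P<host>{HEX}) D:(?P<dev>{HEX}) P:(?P<plugin>{HEX})"
    rf".*?sense data: (?P<key>{HEX}) (?P<asc>{HEX}) (?P<ascq>{HEX})"
)


def decode_failure(line):
    """Decode device status and sense data from a failed-command line."""
    match = FAIL_RE.search(line)
    if match is None:
        return None
    dev = int(match.group("dev"), 16)
    key = int(match.group("key"), 16)
    asc = int(match.group("asc"), 16)
    ascq = int(match.group("ascq"), 16)
    return {
        "device": match.group("device"),
        "device_status": DEVICE_STATUS.get(dev, f"unknown (0x{dev:x})"),
        "sense_key": SENSE_KEY.get(key, f"unknown (0x{key:x})"),
        "additional_sense": ASC_ASCQ.get(
            (asc, ascq), f"not in table (0x{asc:x}/0x{ascq:x})"
        ),
    }
```

Passing the vmkernel.log sample line to `decode_failure` yields a device status of CHECK CONDITION and a sense key of NOT READY. Values that are not in the small tables are reported as unknown rather than guessed.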