When an application generates heavy I/O, Windows might log numerous Event ID 153 entries if the underlying storage subsystem is malfunctioning.
Here is an example of an Event ID 153 entry:
Log Name: System
Source: disk
Event ID: 153
Level: Warning
Description: The IO operation at logical block address 123456 for Disk 2 was retried.
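When many of these entries are logged, it can help to see which disks are being retried most often. The following is a minimal sketch, assuming the event descriptions have already been exported as plain strings and follow the wording shown above. The function names `parse_event_153` and `retries_per_disk` are hypothetical, and accepting a hexadecimal block address is an assumption about how some systems may print it.

```python
import re
from collections import Counter

# Matches the description text shown in the example event above.
# The block address is accepted in decimal or 0x-prefixed hex (an assumption).
EVENT_153_RE = re.compile(
    r"logical block address (0x[0-9a-fA-F]+|\d+) for Disk (\d+)"
)


def parse_event_153(description):
    """Return (lba, disk_number) from an Event ID 153 description, or None."""
    match = EVENT_153_RE.search(description)
    if match is None:
        return None
    raw_lba, disk = match.groups()
    # Convert hex or decimal text to an integer
    if raw_lba.lower().startswith("0x"):
        lba = int(raw_lba, 16)
    else:
        lba = int(raw_lba)
    return lba, int(disk)


def retries_per_disk(descriptions):
    """Count retried I/O operations per disk number."""
    counts = Counter()
    for description in descriptions:
        parsed = parse_event_153(description)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts
```

Calling `retries_per_disk` on a list of exported descriptions returns a counter keyed by disk number; descriptions that do not match the pattern are ignored.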
VMware vSphere ESXi
The issue can occur when I/O to the datastore times out on the ESXi host, as in the following log entry:
YYYY-MM-DDT00:00:00.000Z cpu33:13940214)Fil6: 4094: 'DATASTORE': Fil6 file IO () : Timeout
You can retrieve information about the HBA interfaces by running the following command:
localcli storage san fc stats get
Here is an example of the output, which includes the "Invalid CRC Count" counter:
FcStat:
   Adapter: vmhba2
   Tx Frames: 1549904085
   Rx Frames: 2846844196
   Lip Count: 3
   Error Frames: 0
   Dumped Frames: 0
   Link Failure Count: 0
   Loss of Signal Count: 2
   PrimSeq Protocol Err Count: 0
   Invalid Tx Word Count: 608297
   Invalid CRC Count: 5201
   Input Requests: 57621286
   Output Requests: 244055658
   Control Requests: 0
If the Invalid CRC Count value keeps increasing between runs of the command, there could be an issue within the SAN fabric. Work with the hardware vendor to isolate the issue, for example by replacing the GBIC/SFP or cables.
Also review the performance of the storage subsystem to further investigate the cause of the I/O timeouts.
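To tell whether the Invalid CRC Count is still increasing, two captures of the command output taken some time apart can be compared. The following is a minimal sketch, assuming each capture has been saved as text in the `FcStat:` format shown above. The function names `parse_fc_stats` and `increased_counters` are hypothetical, and the choice of which counters to watch is an illustration rather than a vendor recommendation.

```python
import re

# "Name: 123" pairs; non-numeric values such as the adapter name are skipped.
COUNTER_RE = re.compile(r"([A-Za-z][A-Za-z ]*?):\s*(\d+)")
ADAPTER_RE = re.compile(r"Adapter:\s*(\S+)")


def parse_fc_stats(text):
    """Return {adapter: {counter_name: value}} from saved command output."""
    stats = {}
    # Each adapter section starts with an "FcStat:" header
    for chunk in text.split("FcStat:")[1:]:
        adapter = ADAPTER_RE.search(chunk)
        if adapter is None:
            continue
        stats[adapter.group(1)] = {
            name.strip(): int(value)
            for name, value in COUNTER_RE.findall(chunk)
        }
    return stats


def increased_counters(before, after,
                       names=("Invalid CRC Count", "Invalid Tx Word Count")):
    """Return {adapter: {counter_name: delta}} for watched counters that grew."""
    result = {}
    for adapter, new in after.items():
        old = before.get(adapter, {})
        deltas = {}
        for name in names:
            if name in old and name in new and new[name] > old[name]:
                deltas[name] = new[name] - old[name]
        if deltas:
            result[adapter] = deltas
    return result
```

An empty result means none of the watched counters grew between the two captures. A non-empty result names the adapters whose links are worth checking first.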