When an application generates heavy I/O, Windows might log numerous Event ID 153 entries if the storage subsystem is malfunctioning.
Here is an example of an Event ID 153 entry:
    Log Name: System
    Source: disk
    Event ID: 153
    Level: Warning
    Description: The IO operation at logical block address 123456 for Disk 2 was retried.
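When many of these entries are logged, it can help to see which disk they point to. The following is a minimal Python sketch, assuming the event descriptions have already been exported to a list of strings (the export step is not shown). The function name `count_retries_per_disk` and the regular expression are illustrative assumptions based on the example description above, not part of any Windows tooling.

```python
import re
from collections import Counter

# Pattern derived from the example description in this article. The logical
# block address may be decimal or hexadecimal, and any extra text between the
# disk number and "was retried" is tolerated.
RETRY_PATTERN = re.compile(
    r"logical block address\s+(?P<lba>0x[0-9a-fA-F]+|\d+)\s+"
    r"for Disk\s+(?P<disk>\d+)\b.*?was retried",
    re.IGNORECASE,
)


def count_retries_per_disk(descriptions):
    """Return a Counter mapping disk number -> number of retried I/O events."""
    counts = Counter()
    for text in descriptions:
        match = RETRY_PATTERN.search(text)
        if match:
            counts[int(match.group("disk"))] += 1
    return counts


if __name__ == "__main__":
    # Sample input: the description from the example entry above.
    sample = [
        "The IO operation at logical block address 123456 for Disk 2 was retried.",
    ]
    for disk, count in count_retries_per_disk(sample).items():
        print(f"Disk {disk}: {count} retried I/O operation(s)")
```

A disk that accounts for most of the retries is a reasonable starting point for the storage checks described in the rest of this article.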
VMware vSphere ESXi
The issue can occur due to the following reasons:
YYYY-MM-DDT00:00:00.000Z cpu33:13940214)Fil6: 4094: 'DATASTORE': Fil6 file IO () : Timeout
You can retrieve information about the HBA interfaces by running the following command:
localcli storage san fc stats get
Here is an example of the output showing "Invalid CRC Count."
    FcStat:
    Adapter: vmhba2
    Tx Frames: 1549904085
    Rx Frames: 2846844196
    Lip Count: 3
    Error Frames: 0
    Dumped Frames: 0
    Link Failure Count: 0
    Loss of Signal Count: 2
    PrimSeq Protocol Err Count: 0
    Invalid Tx Word Count: 608297
    Invalid CRC Count: 5201
    Input Requests: 57621286
    Output Requests: 244055658
    Control Requests: 0
If the Invalid CRC Count value keeps increasing, there could be an issue within the SAN fabric. Work with the hardware vendor to isolate the issue, for example by replacing the GBIC/SFP modules or cables.
Also review the performance of the storage subsystem to further investigate the cause of the I/O timeout.
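To check whether the Invalid CRC Count is actually increasing, you can capture the command output twice, some time apart, and compare the counters. The following is a minimal Python sketch under these assumptions: both outputs have been saved as text, and the fields follow the `Name: value` layout shown in the example above. The function names `parse_fc_stats` and `crc_increase` are hypothetical helpers, not part of any VMware tooling, and the second sample in the demo is a made-up value used only for illustration.

```python
import re

ADAPTER_RE = re.compile(r"Adapter:\s*(\S+)")
# Numeric counters of the form "<Name>: <integer>".
COUNTER_RE = re.compile(r"([A-Za-z][A-Za-z ]*?):\s*(\d+)\b")


def parse_fc_stats(text):
    """Parse saved FC stats text into {adapter: {counter_name: int}}."""
    stats = {}
    # Splitting on "Adapter: <name>" yields
    # [preamble, name1, body1, name2, body2, ...].
    parts = ADAPTER_RE.split(text)
    for name, body in zip(parts[1::2], parts[2::2]):
        stats[name] = {
            key.strip(): int(value) for key, value in COUNTER_RE.findall(body)
        }
    return stats


def crc_increase(before, after, counter="Invalid CRC Count"):
    """Return {adapter: delta} for adapters whose counter grew between samples."""
    deltas = {}
    for adapter, counters in after.items():
        old = before.get(adapter, {}).get(counter)
        new = counters.get(counter)
        if old is not None and new is not None and new > old:
            deltas[adapter] = new - old
    return deltas


if __name__ == "__main__":
    # First sample: values taken from the example output in this article.
    first = "FcStat:\n   Adapter: vmhba2\n   Invalid CRC Count: 5201\n"
    # Second sample: HYPOTHETICAL value, for illustration only.
    second = "FcStat:\n   Adapter: vmhba2\n   Invalid CRC Count: 5260\n"
    for adapter, delta in crc_increase(
        parse_fc_stats(first), parse_fc_stats(second)
    ).items():
        print(f"{adapter}: Invalid CRC Count increased by {delta}")
```

The sketch compares counters per adapter, so a single noisy HBA port stands out even when other adapters are healthy. A counter that stays flat between samples is not reported.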