Disk consolidation fails with "IO was aborted by VMFS via a virt-reset" due to remote ISL link errors

Article ID: 429270



Products

VMware vCenter Server

Issue/Introduction

 

  • Disk Consolidation Failure: Attempts to consolidate disks on one or more virtual machines fail in the vSphere Client.

  • vCenter Error: The task fails with a generic error message similar to:

    Consolidation failed for disk node 'scsi0:x': The operation failed.

  • ESXi Host Logs (vmkernel.log): The logs show repeated virt-reset errors, indicating that VMFS is aborting I/O requests because the storage device did not respond in time.


     
    2026-01-15T13:34:02.493Z In(182) vmkernel: cpu81:2098397)Fil6: 4251: ioCtx: 0x45c9bdb4eb00, world: 2098397, overallStatus: IO was aborted by VMFS via a virt-reset on the device
    2026-01-15T15:06:50.409Z In(182) vmkernel: cpu187:28916551)Fil6: 4289: 'DataDriv
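To gauge how often the aborts occur, the virt-reset entries can be counted per hour. The following is a minimal sketch, assuming log lines in the format shown above; the function name `count_virt_resets` and the hourly grouping are illustrative choices, not part of any VMware tool.

```python
import re
from collections import Counter

# Captures the date and hour of a line that reports a virt-reset abort.
VIRT_RESET = re.compile(
    r"^(?P<hour>\d{4}-\d{2}-\d{2}T\d{2}):\d{2}:\d{2}\.\d+Z .*"
    r"IO was aborted by VMFS via a virt-reset"
)

def count_virt_resets(lines):
    """Return a Counter of virt-reset aborts keyed by hour (YYYY-MM-DDTHH)."""
    per_hour = Counter()
    for line in lines:
        match = VIRT_RESET.search(line.strip())
        if match:
            per_hour[match.group("hour")] += 1
    return per_hour

# The first log line from the excerpt above, used as sample input.
sample = [
    "2026-01-15T13:34:02.493Z In(182) vmkernel: cpu81:2098397)Fil6: 4251: "
    "ioCtx: 0x45c9bdb4eb00, world: 2098397, overallStatus: IO was aborted "
    "by VMFS via a virt-reset on the device",
]
print(count_virt_resets(sample))
```

In practice the input would be the lines of a vmkernel.log file read from the host or from a support bundle. A cluster of aborts that lines up with the consolidation attempts supports the diagnosis.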

 

Environment

VMware vSphere ESXi 8.x

Cause

This issue is caused by Fibre Channel Congestion Spreading (often referred to as "Slow Drain" or "Credit Stall").

Even if the affected VM is not replicating across the ISL, it is impacted by switch-wide resource contention:

  1. Physical Failure: An ISL port (e.g., Port 15) experiences physical errors (bad SFP/cable), causing it to drop frames or fail to return buffer credits.

  2. Buffer Starvation: The switch's internal memory buffers for that link fill up because frames cannot be transmitted.

  3. Congestion Spreading: Because switch buffers are often shared resources, the lack of free memory "spreads" to other ports on the same switch, including those connected to local ESXi hosts.

  4. I/O Timeout: The ESXi host attempts to write data (disk consolidation), but the switch delays the request due to full buffers. The delay exceeds the SCSI timeout threshold (usually 30 seconds), causing the host to abort the command (virt-reset).
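The chain of events above can be illustrated with a toy model. This is a deliberately simplified sketch, not a model of any specific switch: it assumes one buffer pool shared by an ISL port and a local port, one frame arriving for each port per tick, and a local port that forwards instantly whenever a buffer is free. The function name and all parameters are hypothetical.

```python
def simulate_shared_buffer(pool_size, ticks, isl_drain_per_tick):
    """Count local frames that stall because ISL traffic holds every buffer."""
    isl_buffered = 0   # frames parked in the pool, waiting for the ISL
    local_stalled = 0  # local frames that found no free buffer
    for _ in range(ticks):
        # A frame bound for the ISL takes a buffer if one is free.
        if isl_buffered < pool_size:
            isl_buffered += 1
        # A frame bound for the local port needs a free buffer as well.
        if isl_buffered >= pool_size:
            local_stalled += 1
        # The ISL drains only as fast as buffer credits are returned.
        isl_buffered = max(0, isl_buffered - isl_drain_per_tick)
    return local_stalled

healthy = simulate_shared_buffer(pool_size=8, ticks=100, isl_drain_per_tick=1)
faulty = simulate_shared_buffer(pool_size=8, ticks=100, isl_drain_per_tick=0)
print(healthy, faulty)
```

With a healthy ISL no local frame stalls. With an ISL that never returns credits, the pool fills after a few ticks and every later local frame stalls, even though the local port itself is healthy. On the host, a stall that outlasts the SCSI timeout appears as the virt-reset abort.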

Resolution

To resolve the disk consolidation failure, the storage fabric congestion must be cleared.

Step 1: Identify the Faulty Link

Review the SAN switch logs and port error counters for ports showing high error counts (CRC, Frame Loss, or Credit Loss). In this scenario, the root cause is the ISL port connecting the sites.
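Once the counters have been collected, ranking the ports by total errors makes the faulty link stand out. The sketch below assumes the counters were already exported into a dictionary; the port names, counter keys, and values are hypothetical and do not reflect any vendor's output format.

```python
def rank_ports_by_errors(counters, threshold=0):
    """Return (port, total_errors) pairs above threshold, worst first.

    counters maps a port name to a dict of error counters, for example
    {"crc": 0, "frame_loss": 0, "credit_loss": 0}.
    """
    totals = {port: sum(values.values()) for port, values in counters.items()}
    suspects = [(port, total) for port, total in totals.items() if total > threshold]
    return sorted(suspects, key=lambda item: item[1], reverse=True)

# Hypothetical counter values for illustration only.
example = {
    "port3": {"crc": 0, "frame_loss": 0, "credit_loss": 0},
    "port15": {"crc": 120, "frame_loss": 40, "credit_loss": 7},
}
print(rank_ports_by_errors(example))
```

Counters are cumulative on most switches, so comparing two samples taken a few minutes apart shows whether the errors are still increasing.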

Step 2: Isolate the Faulty Link (Workaround)

⚠️ WARNING: Verify redundancy before proceeding. If the faulty ISL is the only link between sites, disabling it will sever connectivity and stop all replication.

  1. Pause Replication: Suspend any active replication sessions on the storage array to prevent queueing.

  2. Disable Port: Administratively disable (shut down) the faulty ISL port on the switch.

    • Effect: This stops the accumulation of errors and clears the "Credit Starvation," removing backpressure from the switch immediately.
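The redundancy check from the warning above can be written as a simple guard before the port is disabled. This is a minimal sketch that assumes the administrator supplies the list of inter-site ISL ports and their health; the function and port names are hypothetical.

```python
def safe_to_disable(faulty_port, isl_ports):
    """Return True if at least one other healthy ISL remains between sites.

    isl_ports maps each inter-site ISL port name to True (healthy) or False.
    """
    return any(
        healthy for port, healthy in isl_ports.items() if port != faulty_port
    )

# Hypothetical fabric with two inter-site links, one of them faulty.
print(safe_to_disable("port15", {"port15": False, "port16": True}))
```

If the check returns False, disabling the port severs connectivity between the sites, so the decision has to weigh the consolidation failure against the loss of replication.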

Step 3: Retry Consolidation

Once the port is disabled and the backpressure has cleared, retry the disk consolidation operation in vSphere. It should now complete successfully.
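When several virtual machines are affected, the retry can be scripted. The sketch below assumes an object that behaves like a pyVmomi `vim.VirtualMachine`, exposing `runtime.consolidationNeeded` and `ConsolidateVMDisks_Task()`, and it assumes the task state converts to the strings "queued", "running", "success", and "error". The function name, retry counts, and wait times are illustrative, and connecting to vCenter is left out.

```python
import time

def retry_consolidation(vm, attempts=3, wait_seconds=60, poll_seconds=5,
                        sleep=time.sleep):
    """Retry disk consolidation until it succeeds or attempts run out."""
    for attempt in range(1, attempts + 1):
        if not vm.runtime.consolidationNeeded:
            return True  # nothing left to consolidate
        task = vm.ConsolidateVMDisks_Task()
        # Poll until the task leaves the queued/running states.
        while str(task.info.state) in ("queued", "running"):
            sleep(poll_seconds)
        if str(task.info.state) == "success":
            return True
        if attempt < attempts:
            # Give the fabric time to settle before the next attempt.
            sleep(wait_seconds)
    return False
```

The `sleep` parameter exists only so the wait can be replaced in a dry run. If every attempt fails, check vmkernel.log for new virt-reset entries before retrying again, since continued aborts suggest the congestion has not cleared.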

Step 4: Restore Redundancy (If applicable) or Schedule Repair

  • If redundancy exists, leave the bad link disabled until repair.

  • If no redundancy exists, re-enable the link to restore DR protection, but schedule a maintenance window to replace the physical components (SFP Transceivers and Fiber Optic Cable).

Additional Information