VM Performance Degradation Due to Unremoved Datastore Backed by Slow FC Storage Device



Article ID: 411299


Products

VMware vSphere ESXi

Issue/Introduction

  1. Multiple virtual machines (VMs) residing on multiple VMFS datastores exhibit poor performance.

  2. Ping tests to the affected VMs fail intermittently.

  3. Accessing the VM console through vSphere Client is very slow.

  4. ESXi hosts randomly report an out-of-sync status in the vSphere Client.

  5. Running esxtop in an SSH session on the ESXi host shows:

    High latency at the FC vmhba layer (KAVG/cmd of 10 ms to 50 ms).
    Low DAVG/cmd, indicating that the array itself is not directly delaying requests.

  6. The vmkernel log file contains thousands of error messages that include RESERVATION CONFLICT or the status H:0x0 D:0x18 P:0x0 (SCSI device status 0x18 is RESERVATION CONFLICT).
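
To gauge how widespread symptom 6 is, the matching log lines can be counted. The following is a minimal Python sketch, not an official tool: it only searches for the two strings quoted above, and the function names and the idea of running it against a copy of the vmkernel log are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable

# Marker strings taken from the symptom description; nothing else about
# the log line layout is assumed.
MARKERS = {
    "reservation_conflict": "RESERVATION CONFLICT",
    "status_0x18": "H:0x0 D:0x18 P:0x0",
}


def count_conflicts(lines: Iterable[str]) -> Counter:
    """Count how many lines contain each marker string."""
    counts: Counter = Counter()
    for line in lines:
        for name, marker in MARKERS.items():
            if marker in line:
                counts[name] += 1
    return counts


def count_conflicts_in_file(path: str) -> Counter:
    """Convenience wrapper (hypothetical helper): scan a copied log file."""
    with open(path, errors="replace") as handle:
        return count_conflicts(handle)
```

For example, `count_conflicts_in_file("vmkernel.log")` run against a copy of the log returns a counter keyed by marker name. A count in the thousands matches the behavior described in this article.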

Environment

VMware vSphere ESXi 7.0, 8.0, and later

Cause

One datastore was backed by a faulty/slow backend storage device connected via FC HBA.

Although all VM data had been migrated away from the datastore, the datastore remained mounted on the ESXi cluster.

Since the datastore was still visible and accessible:

ESXi continued sending periodic VMFS heartbeat I/O operations.

These heartbeat I/O requests could not be completed in a timely manner due to the backend device’s poor performance.

As a result, I/O requests became stuck at the kernel (KAVG) level in the vmhba queue.

This led to overall vmhba performance degradation and impacted other VM I/O operations.
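
The distinction between kernel-side and device-side latency can be made concrete with a small Python sketch. It relies on the esxtop relationship GAVG/cmd = DAVG/cmd + KAVG/cmd. The class name, function name, and both threshold defaults are illustrative assumptions, not values from this article, and should be tuned to the environment.

```python
from dataclasses import dataclass


@dataclass
class AdapterLatency:
    """One esxtop sample for an adapter, in milliseconds per command."""
    davg_ms: float  # DAVG/cmd: time spent at the device/array
    kavg_ms: float  # KAVG/cmd: time spent in the VMkernel

    @property
    def gavg_ms(self) -> float:
        # Guest-observed latency is the sum of device and kernel latency.
        return self.davg_ms + self.kavg_ms


def latency_source(sample: AdapterLatency,
                   kavg_limit_ms: float = 2.0,    # assumed threshold
                   davg_limit_ms: float = 20.0) -> str:  # assumed threshold
    """Report which layer exceeds its (assumed) latency threshold."""
    kernel_high = sample.kavg_ms > kavg_limit_ms
    device_high = sample.davg_ms > davg_limit_ms
    if kernel_high and device_high:
        return "kernel+device"
    if kernel_high:
        return "kernel"
    if device_high:
        return "device"
    return "normal"
```

A sample such as `AdapterLatency(davg_ms=1.0, kavg_ms=30.0)` is classified as `"kernel"`, which is the pattern described in this article: requests wait in the host's adapter queue even though the array answers quickly.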

Resolution

  1. Migrate all data from the affected datastore to a healthy datastore.
  2. Unmount the datastore from all ESXi hosts in the cluster.
  3. Detach the LUN corresponding to the faulty storage device from all ESXi hosts.
  4. Verify that the datastore no longer appears under Mounted Datastores and that no further I/O is issued to the faulty device.
  5. Confirm that the ESXi hosts return to normal performance once the datastore is unmounted and the LUN is detached.
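
Steps 2 to 4 can also be performed from the ESXi command line. The Python sketch below only builds the esxcli command strings to run on each host; it does not execute anything. The function name, the datastore label, and the device identifier are hypothetical placeholders, so substitute the real values and review each command before running it.

```python
import shlex
from typing import List


def removal_commands(datastore_label: str, device_id: str) -> List[str]:
    """Build the per-host esxcli commands for unmount, detach, and checks.

    datastore_label and device_id are placeholders supplied by the caller.
    """
    label = shlex.quote(datastore_label)  # labels may contain spaces
    device = shlex.quote(device_id)
    return [
        # Step 2: unmount the VMFS datastore by its volume label.
        f"esxcli storage filesystem unmount -l {label}",
        # Step 3: detach the backing device (LUN) from the host.
        f"esxcli storage core device set --state=off -d {device}",
        # Step 4: confirm that the volume is no longer mounted.
        "esxcli storage filesystem list",
        # Step 4: confirm that the device is listed as detached.
        "esxcli storage core device detached list",
    ]


# Hypothetical example values, for illustration only.
for command in removal_commands("slow_ds01", "naa.0000000000000000"):
    print(command)
```

Generating the commands rather than running them keeps the sketch safe to try, and the same list can be reviewed once and then applied to every host in the cluster.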

Additional Information

Leaving an unused but slow/unresponsive datastore mounted can significantly affect ESXi host performance.

Even if no VM is actively using the datastore, VMFS heartbeat traffic will continue, causing high kernel latency and degraded I/O performance across multiple VMs.