Poor disk performance for guest workloads on vSAN

Article ID: 417367

Products

VMware vSAN

Issue/Introduction

Select VMs show higher than expected latency at the VM level and in the guest OS.
Most other VMs do not show the same latency, and there is no evidence of vSAN storage latency.

Environment

vSAN (All) 

Cause

The VM workload is issuing very large I/Os, with sizes of 512K or larger.
You can confirm this with vSAN I/O Insight, which will show a large amount of I/O that is not 4K aligned and/or large I/O sizes that account for the higher latency.

To use vSAN I/O Insight

  1. Navigate to the vSAN cluster or host.
    You can also access I/O Insight from the VM. Select the VM and navigate to Monitor > vSAN > Performance > Virtual Disks.
  2. Click the Monitor tab.
  3. Under vSAN, select Performance.
  4. Select the I/O Insight tab and click New Instance.
  5. Select the required hosts or VMs that you want to monitor. You can also search for VMs.
  6. Click Next.
  7. Enter a name and select a duration.
  8. Click Next and review the instance information.
  9. Click Finish.
    The I/O Insight instance monitors the selected VMs for the specified duration. However, you can stop an instance before the specified duration has elapsed.

    VMs monitored by I/O Insight must not be vMotioned. vMotion stops the VMs from being monitored and will result in an unsuccessful trace.

    vSAN displays performance charts for the VMs in the cluster, including IOPS, throughput, I/O size distribution, I/O latency distribution, and so on.

    Through I/O Insight you can see how much I/O is not 4K aligned, as well as the latency for each I/O size.
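
To make the alignment criterion concrete, the following is a minimal Python sketch of what "4K aligned" can mean for a single I/O. It assumes an I/O counts as aligned when both its starting byte offset and its length are multiples of 4096; the function name `is_4k_aligned` and the sample values are hypothetical, and this is not a description of how I/O Insight computes its metrics internally.

```python
BLOCK = 4096  # 4K boundary, in bytes


def is_4k_aligned(offset_bytes: int, length_bytes: int) -> bool:
    """Return True when the I/O starts on a 4K boundary and its length is a 4K multiple.

    Assumption: alignment is judged on both offset and length.
    """
    return offset_bytes % BLOCK == 0 and length_bytes % BLOCK == 0


# Hypothetical (offset, length) pairs in bytes, for illustration only.
samples = [(0, 4096), (4096, 65536), (512, 4096), (8192, 1024)]

unaligned = [s for s in samples if not is_4k_aligned(*s)]
unaligned_pct = 100 * len(unaligned) / len(samples)
```

A guest that writes at a 512-byte offset, or issues 1K writes, falls into the unaligned share that I/O Insight reports.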

Resolution

For minimum latency and maximum IOPS, the IO size should be small (ideally 4K). For maximum throughput, the IO size should be large (ideally 256K).

  • If your goal is to get the maximum IOPS, you must choose a small IO size (ideally 4K). With a larger IO size, the system may be bottlenecked on the IO throughput that the vSAN disk group can handle; therefore, a smaller IO size ensures the system delivers the highest IOs per second.
  • If your goal is to get the minimum latency, you must choose a small IO size (ideally smaller than 64K). An IO request with IO size larger than 64K is split into multiple IO requests at the vSAN DOM Client layer; hence, there is additional latency in processing multiple requests. In the specific case where dedupe is enabled in an all-flash vSAN cluster, you must choose an IO size of 4K to achieve minimum latency because dedupe operates at the granularity of a 4K IO size.
  • If your goal is to get the maximum throughput, you must pick a large IO size (ideally 256K). With a smaller IO size, the system may be bottlenecked on the IOs per second that a vSAN disk group can handle; therefore, a larger IO size ensures the system delivers the highest throughput at the expense of IOPS.
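
The trade-off in the list above follows from the relationship throughput = IOPS × IO size. The sketch below only illustrates that arithmetic; the function name and the IOPS figures are hypothetical and are not measured vSAN numbers.

```python
def throughput_mb_s(iops: float, io_size_kb: float) -> float:
    """Throughput in MB/s for a given IOPS rate and IO size in KB."""
    return iops * io_size_kb / 1024


# Hypothetical workloads, for illustration only.
small_io = throughput_mb_s(iops=50_000, io_size_kb=4)    # many small IOs
large_io = throughput_mb_s(iops=2_000, io_size_kb=256)   # few large IOs

print(f"4K   @ 50,000 IOPS: {small_io:.1f} MB/s")
print(f"256K @  2,000 IOPS: {large_io:.1f} MB/s")
```

In this example the 256K workload moves more data per second with far fewer operations, while the 4K workload completes many more operations per second but moves less data.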

Adjust the IO size to match your performance goal. Keep in mind that vSAN is optimized for I/O sizes of 64K and smaller; configuring an I/O size larger than 64K will result in increased guest I/O latency.
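
As a rough illustration of why I/O larger than 64K costs extra latency, the sketch below counts how many requests a single guest I/O might become after splitting. The article states only that I/O larger than 64K is split into multiple requests at the vSAN DOM Client layer; the 64K chunk size and the function name `request_count` are assumptions made for this example.

```python
import math

SPLIT_THRESHOLD_KB = 64  # from the article: I/O larger than 64K is split


def request_count(io_size_kb: int, chunk_kb: int = 64) -> int:
    """Estimate the number of requests one guest I/O becomes.

    Assumption: an I/O above the threshold is split into chunk_kb-sized
    pieces. The actual split size is not stated in this article.
    """
    if io_size_kb <= SPLIT_THRESHOLD_KB:
        return 1
    return math.ceil(io_size_kb / chunk_kb)


for size_kb in (4, 64, 256, 512):
    print(f"{size_kb}K guest I/O -> {request_count(size_kb)} request(s)")
```

Under this assumption, a 512K guest I/O turns into eight requests that must all complete before the guest sees the I/O as done, which is consistent with the higher latency observed for large I/O sizes.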

Additional Information

For additional guidance on I/O Insight, see the following:
Use vSAN I/O Insight
View vSAN I/O Insight Metrics

This is also related to the following KB and diagnostics report: 
vSAN performance diagnostics reports: "The size of IOs may not be optimal to achieve the desired goal"