This article describes the change in how the default ESX IO scheduler handles the IOPS limit configured through a storage policy.
Changes in ESXi 7.0.3 P09 (ESXi 7.0 Update 3q) and ESXi 8.0.3 (ESXi 8.0 Update 3)
Before ESXi 7.0.3 P09 and ESXi 8.0.3, the IOPS limit was enforced at the IOfilter level, while reservations and shares were handled by mClock, which is the default IO scheduler for ESX. Starting with ESXi 7.0.3 P09 (ESXi 7.0 Update 3q) and ESXi 8.0.3 (ESXi 8.0 Update 3), enforcement of the IOPS limit set in a storage policy moved from IOfilter to mClock.
Moving IOPS handling from IOfilter to the mClock IO scheduler has the following benefits:
IOfilter counts only the number of IOs and does not apportion IOPS based on IO size, which could lead to issues. The mClock disk IO scheduler takes IO size into account, because targets take varying amounts of time to handle small and large IOs.
If the IO size is greater than 32 KB, the IO is counted as (IO size / 32 KB) IOs, which means the user might see fewer IOPS than the configured limit. When targets advertise performance numbers in terms of IOPS or bandwidth, they also state the IO size to which those numbers apply, because targets take different amounts of time to handle different IO sizes. For example, a target takes longer to complete a 1 MB IO than a 4 KB IO. Therefore, if one VM issues 1 MB IOs and another issues 4 KB IOs, mClock must take IO size into account to honor all three tunables (reservation, shares, and IOPS limit).
In summary, mClock is modeled to handle differently sized IOs so that it can schedule IO fairly across all VMs, which may drive different IO sizes.
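The effect of the 32 KB normalization described above can be illustrated with a small calculation. This is a minimal sketch, not the scheduler's actual implementation: the function name `estimate_observed_iops` is hypothetical, and it assumes that the cost of an IO larger than 32 KB is rounded up to a whole number of 32 KB units and that the configured limit is applied to that normalized count.

```shell
#!/bin/sh
# Hypothetical helper: estimate the IOPS a VM might observe when the
# configured limit is enforced on size-normalized IOs.
# Assumptions (not confirmed by the article): the per-IO cost is
# (IO size / 32 KB) rounded up, and IOs of 32 KB or less cost one IO.
estimate_observed_iops() {
    limit_iops=$1   # IOPS limit configured in the storage policy
    io_size_kb=$2   # size of each IO issued by the VM, in KB
    unit_kb=32      # normalization unit described in the article

    # Integer ceiling of io_size_kb / unit_kb.
    cost=$(( (io_size_kb + unit_kb - 1) / unit_kb ))
    if [ "$cost" -lt 1 ]; then
        cost=1
    fi

    echo $(( limit_iops / cost ))
}

estimate_observed_iops 1000 4      # prints 1000 (4 KB IOs cost one IO each)
estimate_observed_iops 1000 64     # prints 500  (64 KB IOs cost two IOs each)
estimate_observed_iops 1000 1024   # prints 31   (1 MB IOs cost 32 IOs each)
```

Under these assumptions, a VM with a limit of 1000 IOPS that issues 64 KB IOs would observe roughly half the configured value, which matches the lower-than-configured numbers described above.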
VMware vSphere 7.0
VMware vSphere 8.0
If the user is interested only in the IOPS limit, not in reservations or shares, and the IO size is greater than 32 KB, the user can switch to the SFQ IO scheduler to get the same behavior as with IOfilter without mClock.
To enable the SFQ IO scheduler:
Check the current value of the configuration setting:
esxcli system settings advanced list -o /Disk/SchedulerWithReservation
Note: The default value is 1 (mClock).
Set the value to 0 (SFQ):
esxcli system settings advanced set -o /Disk/SchedulerWithReservation -i 0
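After changing the value, the check can be repeated to confirm which scheduler is active. The following is a minimal sketch of that check: the `scheduler_name` helper is hypothetical, and the parsing assumes that the `list` output reports the current value on a line beginning with `Int Value:`. The query runs only where `esxcli` is available.

```shell
#!/bin/sh
# Hypothetical helper: map the /Disk/SchedulerWithReservation value to the
# scheduler names used in this article (1 = mClock, 0 = SFQ).
scheduler_name() {
    case "$1" in
        1) echo "mClock" ;;
        0) echo "SFQ" ;;
        *) echo "unknown" ;;
    esac
}

# Query the host only when esxcli exists, that is, when run on ESXi.
if command -v esxcli >/dev/null 2>&1; then
    # Assumption: the current value is on a line starting with "Int Value:".
    # Anchoring at the start of the line skips "Default Int Value:".
    value=$(esxcli system settings advanced list -o /Disk/SchedulerWithReservation |
        awk -F': ' '/^ *Int Value:/ {print $2}')
    echo "Current IO scheduler: $(scheduler_name "$value")"
fi
```

To return to the default mClock scheduler, run the same `set` command with `-i 1`.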