Does fragmentation affect VMFS datastores?

Article ID: 340594

Products

VMware vSphere ESXi

Issue/Introduction

Is fragmentation on a VMFS-3 or VMFS-5 volume a concern? Can a VMFS volume be defragmented?


Environment

VMware ESXi 3.5.x Embedded
VMware ESX 4.1.x
VMware ESX Server 3.0.x
VMware vSphere ESXi 5.1
VMware ESX 4.0.x
VMware ESXi 4.0.x Embedded
VMware vSphere ESXi 5.5
VMware ESX Server 2.5.x
VMware ESXi 4.0.x Installable
VMware ESXi 4.1.x Installable
VMware vSphere ESXi 5.0
VMware ESXi 4.1.x Embedded
VMware ESX Server 3.5.x

Resolution

Fragmentation occurs when the blocks belonging to a file are scattered across the volume non-contiguously, which increases disk seek time and rotational latency.

A VMFS volume cannot be defragmented, but fragmentation is not relevant to VMFS performance for the following reasons:

  • Fragmentation degrades performance when an I/O request from an application spans multiple blocks and those blocks are non-contiguous. The VMFS block size is so large that most I/O requests do not straddle block boundaries. Even if the blocks are non-contiguous, each I/O request is served from a locally contiguous region.
     
  • Virtual disks are very large files. When a gap occurs, the gap is also large. The latency penalty is most acute when the drive heads must perform many seeks to assemble a file. With a single gap, or very few gaps, between large sections, the increase in seek time is negligible. This is especially true of pre-allocated (thick-provisioned) disks.
     
  • Disk arrays have very large caches, and most writes are absorbed there. It is very difficult for fragmentation to have a noticeable impact on SAN devices. Local storage may see more impact from fragmentation, because local disk caches are much smaller.
     
  • Sequential virtual machine I/O streams become random on the disk array, because the array services I/O requests from multiple virtual machines on different hosts. That is, the workload on a datastore is highly concurrent, since many virtual machines run on the same datastore from the same host or from multiple hosts. Higher performance can be achieved by localizing the global storage working set (the hot blocks) on a given datastore, instead of co-locating the hot and cold blocks of a particular file.
Definitions
 
Hot block – A block of data that undergoes many changes. Areas that a database writes to are hot.
Cold block – A block that is mostly static. For example, after the operating system is installed, its files change rarely, if ever.
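 
The block-size argument in the first reason above can be checked with simple arithmetic. The following is a minimal sketch, not part of any VMware tooling: the function name, the 64 KiB request size, and the 4 KiB alignment are illustrative assumptions. It assumes start offsets are uniformly distributed over the aligned positions within a block and computes the fraction of requests that cross a block boundary.

```python
def straddle_fraction(io_size: int, block_size: int, alignment: int = 4096) -> float:
    """Fraction of aligned start offsets at which an I/O request of
    io_size bytes crosses at least one block boundary.

    Hypothetical helper for illustration; assumes start offsets are
    uniformly distributed over the aligned positions inside a block.
    """
    if io_size <= 0 or block_size <= 0 or alignment <= 0:
        raise ValueError("sizes and alignment must be positive")
    if block_size % alignment != 0:
        raise ValueError("block_size must be a multiple of alignment")

    slots = block_size // alignment  # possible start positions in one block
    # A request starting at `offset` covers [offset, offset + io_size).
    # It crosses a boundary when its end lies beyond the end of the block.
    crossing = sum(
        1 for k in range(slots) if k * alignment + io_size > block_size
    )
    return crossing / slots


KIB = 1024
MIB = 1024 * KIB

# Assumed workload: 64 KiB guest I/O requests aligned to 4 KiB.
large_block = straddle_fraction(64 * KIB, 1 * MIB)   # 1 MiB file block
small_block = straddle_fraction(64 * KIB, 4 * KIB)   # 4 KiB block, for contrast

print(f"1 MiB blocks: {large_block:.2%} of requests cross a boundary")
print(f"4 KiB blocks: {small_block:.2%} of requests cross a boundary")
```

Under these assumptions, roughly 6% of 64 KiB requests cross a boundary with 1 MiB blocks, while every such request crosses one with 4 KiB blocks. Only the requests that cross a boundary can be split across non-contiguous blocks.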
 
You can also use V-locity: Virtual Platform Optimizer for defragmentation of Windows guest virtual machines.
 

Additional Information

For translated versions of this article, see: