[VMC on AWS] Storage Sensitive Application Best Practices

Article ID: 329824

Products

VMware Cloud on AWS

Issue/Introduction

This article details known best practices that reduce the impact on storage-sensitive applications running in VMware Cloud on AWS.

Symptoms:
The customer experiences intermittent or otherwise unexpected storage performance degradation when running an IOPS-intensive or storage-sensitive application.

Resolution

VM Configuration

  • Thin provisioning is the advised option for all workload VM VMDKs: thin-provisioned disks carry no performance penalty and allow more efficient storage utilization.
  • A RAID-1 storage policy provides the best storage throughput at the lowest latency. RAID-5/6 policies are known to degrade storage performance for write-intensive workloads such as a busy Microsoft SQL Server.
  • Replicating VMDKs between AZs in a stretched cluster SDDC has a performance impact. However, this design may be required by the customer to ensure application resiliency.
  • For CPU configuration on i3.metal hosts, select 1 socket for VMs with up to 18 CPU cores and 2 sockets for VMs with 19 or more CPU cores.
    • On i3en.metal hosts, select 1 socket for VMs with up to 24 CPU cores and 2 sockets for VMs with 25 or more CPU cores.
  • Disable CPU Hot Plug.
  • Do not run any production VMs with a snapshot chain in place. Consolidate or delete all snapshots whenever possible.
  • Split monolithic VMDKs into multiple smaller VMDKs.
  • Spread SQL Server disks across different vSCSI controllers.
  • For Windows VMs, enable Receive Side Scaling (RSS) per KB 2008925.
  • Upgrade to the latest VMware Tools version.
  • Upgrade to the highest VMware hardware compatibility level that the customer's use case supports.
  • Microsoft SQL Server performance is heavily impacted if I/O is not aligned on 4K boundaries. Refer to KB 83134 and KB 83163 for how to configure this properly.
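
The socket guidance in the list above can be expressed as a small lookup. The following is a minimal sketch, not an official sizing tool: the function name and the table are hypothetical, and it assumes that a VM with exactly 18 cores on i3.metal (or exactly 24 on i3en.metal) still fits on one socket.

```python
# Largest core count that stays on a single socket, per host type.
# Assumption: the boundary values (18 and 24) are treated as single-socket.
SINGLE_SOCKET_LIMIT = {
    "i3.metal": 18,
    "i3en.metal": 24,
}


def recommended_sockets(host_type, cpu_cores):
    """Return the socket count suggested for a VM on the given host type."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    if host_type not in SINGLE_SOCKET_LIMIT:
        raise ValueError("no guidance for host type: " + host_type)
    return 1 if cpu_cores <= SINGLE_SOCKET_LIMIT[host_type] else 2


if __name__ == "__main__":
    for host, cores in [("i3.metal", 8), ("i3.metal", 19), ("i3en.metal", 25)]:
        print(host, cores, "cores ->", recommended_sockets(host, cores), "socket(s)")
```

Host types other than the two named in this article raise an error rather than guessing a value.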

Troubleshooting Perspective

Because VMware Cloud runs on AWS hardware, storage performance values from an on-premises infrastructure cannot be compared directly with those from a VMC SDDC. Likewise, different RAID policies or SDDC versions between two test VMs make it nearly impossible to compare performance values meaningfully. For an apples-to-apples comparison, remove as many variables as possible: use the same guest OS version, VM configuration, VMware Tools version, overall SDDC version, and so on.
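
One way to check that two test VMs differ in as few variables as possible is to record their settings side by side and list the mismatches before running any benchmark. This is an illustrative sketch only: the function name, the dictionary keys, and the sample values are hypothetical and are not read from vCenter or any VMware API.

```python
def config_differences(vm_a, vm_b):
    """Return {setting: (value_a, value_b)} for every setting that differs."""
    differences = {}
    for key in sorted(set(vm_a) | set(vm_b)):
        if vm_a.get(key) != vm_b.get(key):
            differences[key] = (vm_a.get(key), vm_b.get(key))
    return differences


if __name__ == "__main__":
    # Hypothetical, manually recorded settings for two test VMs.
    test_vm_1 = {"guest_os": "Windows Server 2019", "storage_policy": "RAID-1", "cpu_cores": 8}
    test_vm_2 = {"guest_os": "Windows Server 2019", "storage_policy": "RAID-5", "cpu_cores": 8}
    for setting, (a, b) in config_differences(test_vm_1, test_vm_2).items():
        print(setting, "differs:", a, "vs", b)
```

An empty result means the recorded settings match; any listed setting is a variable to eliminate before comparing results.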

Copying and pasting files between two workload VMs is never an accurate vSAN throughput or latency test. To properly test VM disk speed in VMC, use diskspd with the following parameters. Each command creates a 100 GB test file and runs a 900-second random I/O test after a 60-second warm-up. -w0 is 100% read; -w100 is 100% write:
  • For latency: diskspd.exe -d900 -b4K -o32 -w0 -W60 -r -Sh -L -D -c100G c:\path\to\file.dat
  • For throughput: diskspd.exe -d900 -b256K -o2 -w0 -W60 -r -Sh -L -D -c100G c:\path\to\file.dat
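
When the same tests need to be repeated with different read/write mixes, the two command lines above can be generated rather than typed by hand. The sketch below only builds the argument list and does not execute diskspd; the function name and profile table are hypothetical, and the flags are exactly those shown in the two commands above.

```python
# Block size and outstanding I/O count for each test, taken from the commands above.
PROFILES = {
    "latency": ("4K", 32),
    "throughput": ("256K", 2),
}


def diskspd_args(test, target, write_pct=0):
    """Build the diskspd argument list for a latency or throughput test."""
    if test not in PROFILES:
        raise ValueError("test must be 'latency' or 'throughput'")
    if not 0 <= write_pct <= 100:
        raise ValueError("write_pct must be between 0 and 100")
    block_size, outstanding = PROFILES[test]
    return [
        "diskspd.exe",
        "-d900",                  # run for 900 seconds
        "-b" + block_size,        # block size
        "-o" + str(outstanding),  # outstanding I/Os
        "-w" + str(write_pct),    # 0 = 100% read, 100 = 100% write
        "-W60",                   # 60-second warm-up
        "-r", "-Sh", "-L", "-D",  # remaining flags as given in the commands above
        "-c100G",                 # create a 100 GB test file
        target,
    ]


if __name__ == "__main__":
    print(" ".join(diskspd_args("latency", r"c:\path\to\file.dat")))
    print(" ".join(diskspd_args("throughput", r"c:\path\to\file.dat", write_pct=100)))
```

Passing the resulting list to a process launcher on the Windows guest is left out on purpose, since how the test is started depends on the environment.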


Additional Information

Optimize VM Configuration for SQL Server Workload VM in VMC
Performance Characterization of Microsoft SQL Server Using VMware Cloud on AWS
ARCHITECTING MICROSOFT SQL SERVER ON VMWARE VSPHERE

Impact/Risks:
Making changes to a production VM, especially one running any sort of database, can have unexpected consequences. Make these changes during a maintenance window or outside of production hours.