Troubleshooting vSAN Latency and Performance Issues

Article ID: 389082

Updated On:

Products

VMware vSAN

Issue/Introduction

VMware vSAN is a software-defined storage solution that pools storage from multiple hosts to create a shared, high-performance storage system. While vSAN improves scalability and flexibility, sometimes performance issues such as slow application response times or high latency can occur.

This article provides step-by-step troubleshooting for common vSAN latency and performance issues, helping users quickly identify and fix problems within the environment.

This article is for vSAN administrators and users who want to:

  • Identify what is causing vSAN slowness.
  • Apply common fixes to improve vSAN performance.
  • Understand best practices to prevent future issues.
  • Know when to engage with Broadcom Support.

Environment

VMware vSAN (All Versions)

Resolution

To troubleshoot a vSAN environment that is experiencing performance or latency-related issues:

Step 1: Identify the Problem. Check for Performance Symptoms:

    1. Are your applications loading slower than usual?
    2. Are users complaining about delays when accessing files or databases?
    3. Are VMs taking longer to boot or process data?

Example Use Case: A database server running on vSAN shows long query execution times, and reports take twice as long to generate.

Verify vSAN Latency in vCenter. To confirm the issue:

  1. Log in to vCenter Server.
  2. Navigate to Monitor > vSAN > Performance for the impacted cluster.
  3. Check these key performance indicators:
    • Read and Write Latency: Should be low (under 5ms for flash storage, under 20ms for hybrid storage).
    • IOPS (Input/Output Operations Per Second): Should be stable and not unexpectedly low.
    • Throughput (Data transfer rate): Should align with expected workload demands.

Example Use Case: In vCenter, write latency is spiking to 50ms, which is much higher than expected.
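
The latency guidelines above can be expressed as a simple check. The following is a minimal Python sketch, not a VMware tool: the function name classify_latency and the storage type labels are assumptions made for illustration, and the limits are the 5ms and 20ms guidelines quoted in this article.

```python
# Latency guidelines quoted in this article, in milliseconds.
LATENCY_LIMITS_MS = {"all-flash": 5.0, "hybrid": 20.0}


def classify_latency(latency_ms: float, storage_type: str) -> str:
    """Return "ok" if the sample is under the guideline, otherwise "high".

    Hypothetical helper for illustration; it is not part of any VMware tool.
    """
    limit_ms = LATENCY_LIMITS_MS[storage_type]
    return "ok" if latency_ms < limit_ms else "high"


if __name__ == "__main__":
    # The 50 ms write latency from the example use case, checked against both limits.
    for storage_type in LATENCY_LIMITS_MS:
        print(storage_type, classify_latency(50.0, storage_type))
```

A single sample over the limit is less meaningful than a sustained trend, so a check like this is best applied to a series of readings from the performance charts.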

Step 2: Check for Common Causes and Fixes

1. Network Issues (One of the most common causes of vSAN slowness)

Problem: vSAN depends on fast network communication. If the network is slow or experiencing packet loss, vSAN performance will suffer.

How to Check:

  1. Navigate to Monitor > vSAN > Performance > Physical Adapters and look for high packet drop rates.
  2. If using 10GbE or higher, ensure the vSAN traffic is using the right VLAN and Quality of Service (QoS) settings.
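
To compare adapters, it can help to turn raw packet counters into a drop percentage. The Python sketch below assumes you have copied the received and dropped counts from the Physical Adapters view by hand. The adapter names, the sample counters, and the 0.1 percent limit are hypothetical values chosen for illustration, not thresholds published by Broadcom.

```python
def drop_rate_percent(received: int, dropped: int) -> float:
    """Percentage of packets dropped out of all packets seen by the adapter."""
    total = received + dropped
    return 100.0 * dropped / total if total else 0.0


def adapters_over_limit(counters: dict, limit_percent: float) -> list:
    """Return (adapter, rate) pairs whose drop rate exceeds limit_percent, worst first."""
    rates = {name: drop_rate_percent(rx, drops) for name, (rx, drops) in counters.items()}
    flagged = [(name, rate) for name, rate in rates.items() if rate > limit_percent]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample counters: adapter -> (packets received, packets dropped).
    sample = {
        "host1/vmnic2": (5_000_000, 12),
        "host2/vmnic2": (4_800_000, 96_000),
    }
    print(adapters_over_limit(sample, limit_percent=0.1))
```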

Quick Fixes:

  1. Restart network switches or check for configuration issues.
  2. Ensure all hosts have at least 10GbE connectivity.
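  3. If using jumbo frames (MTU 9000), verify that the same MTU is configured end to end: on the VMkernel adapters, the virtual switches, and every physical switch port in the vSAN path.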

Example Use Case: One of the vSAN hosts is connected to a 1GbE network switch instead of 10GbE, which is slowing down the entire cluster. Upgrading the network connection immediately improves performance.
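
One common way to test jumbo frames end to end is a vmkping from the ESXi shell with the don't-fragment flag set and a payload sized to fill the MTU. The Python sketch below only works out the payload size and prints the command to run. The VMkernel adapter name vmk1 and the target address 192.0.2.10 are placeholders, so substitute the vSAN VMkernel adapter and a peer host's vSAN IP address from your own environment.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_icmp_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


def vmkping_command(vmk: str, target_ip: str, mtu: int = 9000) -> str:
    """Build a vmkping command: -I selects the VMkernel adapter,
    -s sets the payload size, and -d sets the don't-fragment bit."""
    return f"vmkping -I {vmk} -s {max_icmp_payload(mtu)} -d {target_ip}"


if __name__ == "__main__":
    # Placeholder adapter and address; replace with real values before running on a host.
    print(vmkping_command("vmk1", "192.0.2.10"))
```

If the ping fails at this payload size but succeeds at a standard MTU 1500 payload (1472 bytes), a device in the path is likely not passing jumbo frames.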

2. Storage Policies Causing High Workload

Problem: If VMs use RAID-5/6 (erasure coding) policies, write performance may degrade because each write requires parity calculation and additional read and write I/O.

How to Check:

  1. In vCenter Server, navigate to Policies & Profiles > VM Storage Policies to see which policies are applied.

Quick Fixes:

  1. Change high-performance VMs to RAID-1 (Mirroring) for better speed.
  2. Avoid excessive resyncing by scheduling maintenance tasks during off-hours.

Example Use Case: A critical financial application sees high disk latency. After changing the storage policy from RAID-5 to RAID-1, the application performance doubles.
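
The trade-off between RAID-1 and RAID-5/6 can be estimated with simple arithmetic. The Python sketch below uses commonly cited approximations for the original storage architecture (OSA): about 2 backend I/Os per small write for RAID-1, 4 for RAID-5, and 6 for RAID-6, with raw capacity multipliers of 2x, 1.33x, and 1.5x. These figures and the function names are assumptions for illustration. Actual behavior depends on the vSAN version, architecture, and workload.

```python
# Commonly cited approximations for vSAN OSA (assumptions, not measurements):
#   backend_ios: backend I/Os generated by one small guest write
#   capacity_multiplier: raw capacity consumed per unit of VM data
RAID_PROFILES = {
    "RAID-1 (FTT=1)": {"backend_ios": 2, "capacity_multiplier": 2.0},
    "RAID-5 (FTT=1)": {"backend_ios": 4, "capacity_multiplier": 4 / 3},
    "RAID-6 (FTT=2)": {"backend_ios": 6, "capacity_multiplier": 1.5},
}


def backend_write_iops(policy: str, guest_write_iops: int) -> int:
    """Estimate the backend I/O rate caused by a guest write rate."""
    return RAID_PROFILES[policy]["backend_ios"] * guest_write_iops


def raw_capacity_gb(policy: str, vm_data_gb: float) -> float:
    """Estimate the raw capacity consumed by vm_data_gb of VM data."""
    return RAID_PROFILES[policy]["capacity_multiplier"] * vm_data_gb


if __name__ == "__main__":
    for policy in RAID_PROFILES:
        print(policy, backend_write_iops(policy, 1000), round(raw_capacity_gb(policy, 300), 1))
```

The estimate shows why moving a write-heavy VM to RAID-1 reduces backend load at the cost of more raw capacity.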

3. Overloaded Hosts or Disks

Problem: If a vSAN host is running out of free space, performance will degrade.

How to Check:

  1. Navigate to Monitor > vSAN > Capacity and ensure free space is above 30%.

Quick Fixes:

  1. Add more disks or expand cluster storage.
  2. Migrate VMs to less-utilized hosts to balance the load.

Example Use Case: Users of a virtual desktop infrastructure (VDI) environment report slow logins in the morning. vCenter Server shows high disk utilization, and adding storage resolves the issue.
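
The 30% free space guideline can be turned into a rough sizing estimate. The Python sketch below is illustrative only: the function names are hypothetical, the capacity figures are sample inputs, and the calculation assumes the amount of used data stays constant while capacity is added.

```python
def free_percent(total_tb: float, used_tb: float) -> float:
    """Free space as a percentage of total raw capacity."""
    return 100.0 * (total_tb - used_tb) / total_tb


def capacity_to_add_tb(total_tb: float, used_tb: float, min_free_percent: float = 30.0) -> float:
    """Raw capacity to add so that free space reaches min_free_percent."""
    required_total_tb = used_tb / (1.0 - min_free_percent / 100.0)
    return max(0.0, required_total_tb - total_tb)


if __name__ == "__main__":
    # Sample cluster: 100 TB raw capacity with 80 TB used.
    print(free_percent(100.0, 80.0))
    print(round(capacity_to_add_tb(100.0, 80.0), 2))
```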

4. Hardware and Firmware Issues

Problem: If storage controller, SSD, or NIC firmware is outdated, vSAN can slow down.

How to Check:

  1. Compare the installed hardware models, driver versions, and firmware versions against the Broadcom Compatibility Guide.

Quick Fixes:

  1. Upgrade network card and storage firmware to the latest versions.
  2. Ensure all vSAN hardware components are certified for use.

Example Use Case: A cluster experiences random vSAN latency spikes. After the storage controller firmware is updated, the spikes stop.
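
When checking many components, a small script can flag firmware that is older than the version listed in the compatibility guide. The Python sketch below assumes purely numeric, dot-separated version strings and uses hypothetical component names and version numbers. Real firmware versions often include letters or vendor-specific formats and would need their own parsing.

```python
def parse_version(version: str) -> tuple:
    """Convert "23.10.1" to (23, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def outdated_components(installed: dict, minimum: dict) -> list:
    """Return the names of components whose installed version is below the minimum."""
    return sorted(
        name
        for name, version in installed.items()
        if name in minimum and parse_version(version) < parse_version(minimum[name])
    )


if __name__ == "__main__":
    # Hypothetical inventory and minimum versions, for illustration only.
    installed = {"storage-controller": "23.9.0", "nic": "4.2.1"}
    minimum = {"storage-controller": "23.10.0", "nic": "4.2.1"}
    print(outdated_components(installed, minimum))
```

Comparing integer tuples rather than raw strings avoids ordering "23.9.0" after "23.10.0".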

Step 3: Quick Fixes to Improve vSAN Performance

  1. Restart the vSAN Health Service. If vSAN performance metrics are not updating correctly, restart the vSAN Health service in vCenter.
  2. Optimize Network Traffic. If vSAN shares bandwidth with other workloads, use Network I/O Control (NIOC) to prioritize storage traffic.
  3. Schedule Heavy Workloads Smartly. Run backups, indexing, and large data transfers during off-peak hours.
  4. Upgrade to vSAN ESA (Express Storage Architecture). If possible, migrate to vSAN ESA, which offers 2x-5x better performance than the older OSA model.
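
To illustrate how NIOC shares prioritize storage traffic, the Python sketch below estimates how a saturated uplink is divided among active traffic types. It assumes the default share values for the Low, Normal, and High levels (25, 50, and 100) and a hypothetical mix of traffic types on a 10GbE uplink. Shares only take effect when the uplink is under contention, and custom share values, reservations, and limits are not modeled here.

```python
# Default NIOC share values for the Low, Normal, and High levels.
SHARE_LEVELS = {"low": 25, "normal": 50, "high": 100}


def bandwidth_under_contention(levels: dict, link_gbps: float) -> dict:
    """Split link_gbps among active traffic types in proportion to their shares."""
    total_shares = sum(SHARE_LEVELS[level] for level in levels.values())
    return {
        traffic: link_gbps * SHARE_LEVELS[level] / total_shares
        for traffic, level in levels.items()
    }


if __name__ == "__main__":
    # Hypothetical mix: vSAN set to High, vMotion and VM traffic left at Normal.
    mix = {"vsan": "high", "vmotion": "normal", "vm": "normal"}
    print(bandwidth_under_contention(mix, link_gbps=10.0))
```

In this example, vSAN traffic would receive roughly half of the uplink when all three traffic types compete for bandwidth.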

When to Contact Broadcom Support

If the issue persists after following these steps, collect the following information before creating a Broadcom case. For more information, see Creating and managing Broadcom support cases.

 

Additional Information