Workaround to reduce impact of resync traffic in vSAN ESA clusters utilizing a 10G network

Article ID: 372309

Updated On:

Products

VMware vSAN

Issue/Introduction

When vSAN ESA runs on 10G networking infrastructure, resynchronization traffic (caused by maintenance mode, capacity rebalance, policy change, or fault recovery) can, under certain conditions, affect VM I/O traffic beyond the 20% target that Adaptive Resync attempts to enforce. Guest latency may increase as a result.

Environment

vSAN ESA 8.0U2 and newer

Cause

  • When 10G networking is used in conjunction with certain high throughput workload configurations, it is possible that specific parameters of the bandwidth sharing algorithm may not provide the same traffic sharing guarantees seen on higher throughput networks.
  • This affects all versions of ESA when configured on 10G networks. Support for 10G networking was only added in 8.0 U2. No currently released version contains an automated fix, so the workaround below is recommended.

Resolution

Setting a specific advanced configuration option can restore the scheduler's ability to balance resynchronization traffic and VM traffic fairly on 10G networks. The option /VSAN/DOMNetworkSchedulerThrottleComponent must be set to 1 on every ESX host in the cluster. To change it, run the following command on each host:

esxcfg-advcfg -s 1 /VSAN/DOMNetworkSchedulerThrottleComponent

The configuration option is available in all versions of ESX that support ESA (ESX 8.0 onwards). It can be set on a running host; the change takes effect immediately and is persistent. Setting the option on hosts with networking faster than 10G will not cause functional issues but may result in suboptimal resynchronization throughput.
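Because the option has to be set on every host in the cluster, it may be convenient to script the change from an administration workstation. The following is a minimal sketch, not an official procedure. It assumes SSH is enabled on each host and root login is permitted; the host names, the HOSTS and DRY_RUN variables, and the helper functions are hypothetical placeholders. The read-back step uses the -g (get) flag of esxcfg-advcfg to confirm the value after setting it. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: apply the workaround to every host in a cluster over SSH.
# Assumptions (not from the article): SSH is enabled on each host, root login
# is permitted, and the host names below are placeholders to be replaced.

OPTION="/VSAN/DOMNetworkSchedulerThrottleComponent"
HOSTS="${HOSTS:-esx01.example.com esx02.example.com esx03.example.com}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

run_on_host() {
    # $1 = host name, $2 = command to run in that host's shell
    if [ "$DRY_RUN" = "1" ]; then
        printf 'ssh root@%s "%s"\n' "$1" "$2"
    else
        ssh "root@$1" "$2"
    fi
}

apply_workaround() {
    for host in $HOSTS; do
        # Set the option to 1, exactly as in the command shown in the article.
        run_on_host "$host" "esxcfg-advcfg -s 1 $OPTION"
        # Read the value back so a missed host is easy to spot.
        run_on_host "$host" "esxcfg-advcfg -g $OPTION"
    done
}

apply_workaround
```

Run it once with the default DRY_RUN=1 to review the generated commands, then rerun with DRY_RUN=0 and HOSTS set to the real host names to apply the change.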