vSAN Health Service - Online health (Cluster) - vSAN Cluster Compliance



Article ID: 388455



Products

VMware vSAN

Issue/Introduction

This article describes the vSAN Health Service check 'vSAN Cluster Compliance', first introduced in vSAN 8.0U3, and explains why it may report an error and how to resolve the error state.

In 8.x, this Health Service check is found under the Online Health section; as of 9.x, it has been moved to the Cluster section.

Environment

VMware vSAN 8.0U3 and higher

 

Resolution

Q: What does the 'vSAN Cluster Compliance' check do?

This check validates that the storage configuration is compliant for the configured vSAN cluster, to ensure optimal storage availability and performance. Specifically, it verifies the following:

  • For a vSAN Max cluster with fewer than 6 nodes:
    • The number of vSAN hosts in the cluster should be more than 3.
    • The total number of disks on each host should be more than 1, and fewer than 7 on a 10G vSAN network or fewer than 9 on a 25G vSAN network. Deploying two or more disks in each host can improve performance. However, in a cluster deployed with more than 6 disks per host where the storage policy is FTT=1 (R5), the risk of a secondary failure increases in the event that objects need to be rebuilt on another host.
    • The total claimed disk capacity on each host should be less than 30.72TB on a 10G vSAN network or less than 46.08TB on a 25G vSAN network. Otherwise, for objects using a storage policy of FTT=1 (R5), the time taken to regain the prescribed level of resilience during a failure may exceed the desired requirements.
  • For a vSAN ESA cluster with a 10G network, the vSAN advanced option "/VSAN/DOMNetworkSchedulerThrottleComponent" should be set. When vSAN ESA runs on 10G networking infrastructure, resynchronization traffic (due to maintenance mode, capacity rebalance, policy change, or fault recovery) may, under certain conditions, impact VM IO traffic beyond the 20% target that Adaptive Resync attempts to enforce, and guest latency may increase as a result. Setting this advanced configuration option can restore the scheduler's ability to balance resynchronization traffic and VM traffic fairly on 10G networks. Refer to the KB article "Workaround to reduce impact of resync traffic in vSAN ESA clusters utilizing a 10G network" for more detail. In addition, the capacity utilization of a vSAN ESA cluster with a 10G network should always be kept under 75%.
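
The rules listed above amount to a handful of threshold comparisons. The following Python sketch is illustrative only and does not represent how the Health Service implements the check. The function names, the `hosts` input shape, and the convention of passing the network speed as 10 or 25 (Gbps) are hypothetical; the limits themselves are copied from the list above. Returning no findings for clusters with 6 or more nodes is an assumption, since the list only states rules for smaller vSAN Max clusters.

```python
# Limits copied from the list above, keyed by vSAN network speed in Gbps.
# Each value: (exclusive upper bound on disks per host,
#              exclusive upper bound on claimed capacity per host, in TB)
VSAN_MAX_LIMITS = {10: (7, 30.72), 25: (9, 46.08)}

THROTTLE_OPTION = "/VSAN/DOMNetworkSchedulerThrottleComponent"


def check_vsan_max_compliance(hosts, network_gbps):
    """Return violation messages for a vSAN Max cluster.

    hosts maps a host name to (disk_count, claimed_capacity_tb).
    """
    if network_gbps not in VSAN_MAX_LIMITS:
        raise ValueError("limits are only listed for 10G and 25G networks")
    if len(hosts) >= 6:
        # Assumption: the listed rules apply only to clusters with fewer than 6 nodes.
        return []

    max_disks, max_tb = VSAN_MAX_LIMITS[network_gbps]
    issues = []
    if len(hosts) <= 3:
        issues.append("cluster should have more than 3 hosts")
    for name, (disk_count, capacity_tb) in sorted(hosts.items()):
        if not 1 < disk_count < max_disks:
            issues.append(f"{name}: {disk_count} disks, expected 2 to {max_disks - 1}")
        if capacity_tb >= max_tb:
            issues.append(f"{name}: {capacity_tb} TB claimed, expected less than {max_tb} TB")
    return issues


def check_esa_10g_compliance(throttle_option_set, capacity_used_percent):
    """Return violation messages for a vSAN ESA cluster on a 10G network."""
    issues = []
    if not throttle_option_set:
        issues.append(f"{THROTTLE_OPTION} is not set")
    if capacity_used_percent >= 75:
        issues.append("capacity utilization should stay under 75%")
    return issues


if __name__ == "__main__":
    # Hypothetical three-host cluster on a 10G network.
    demo = {"host-a": (2, 15.36), "host-b": (7, 15.36), "host-c": (2, 30.72)}
    for issue in check_vsan_max_compliance(demo, 10):
        print(issue)
```

The host names and capacities in the demo are made up; in practice the inputs would come from the cluster's actual inventory.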

Q: What does it mean when it is in an error state?

If this check fails, the storage configuration is not compliant for the vSAN cluster, and the storage availability and performance of the cluster cannot be guaranteed.

Q: How does one troubleshoot and fix the error state?

Follow the guidelines below to fix each compliance issue.

  1. If the cluster has fewer than 4 hosts, add more hosts to the cluster.
  2. If the number of disks on some hosts is above or below the recommendation, add or remove disks to satisfy the disk count requirement, and make sure the storage configuration is homogeneous across hosts.
  3. If the total claimed disk capacity exceeds the recommendation, remove or replace disks to satisfy the disk capacity requirement, and make sure the storage configuration is homogeneous across hosts.
  4. If the vSAN advanced option "/VSAN/DOMNetworkSchedulerThrottleComponent" is not set on some hosts in a vSAN ESA cluster with a 10G network, set it on all hosts with the ESXi command line "esxcli system settings advanced set -i 1 -o /VSAN/DOMNetworkSchedulerThrottleComponent".
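
After applying step 4, it can be useful to read the option back on each host. The Python sketch below wraps the `set` command from step 4 and pairs it with `esxcli system settings advanced list -o`, which displays the current value of an advanced option. The helper names are hypothetical, and the sketch assumes a Python 3 interpreter is available in the shell where `esxcli` runs; when `esxcli` is not on the PATH, nothing is executed.

```python
import shutil
import subprocess

OPTION = "/VSAN/DOMNetworkSchedulerThrottleComponent"


def set_command(value=1):
    """Argument list for the 'set' command given in step 4."""
    return ["esxcli", "system", "settings", "advanced", "set",
            "-i", str(value), "-o", OPTION]


def list_command():
    """Argument list for displaying the option's current value."""
    return ["esxcli", "system", "settings", "advanced", "list", "-o", OPTION]


def apply_and_show():
    """Set the option, then return the raw 'list' output for review.

    Returns None without running anything when esxcli is not on the PATH.
    """
    if shutil.which("esxcli") is None:
        return None
    subprocess.run(set_command(), check=True)
    result = subprocess.run(list_command(), check=True,
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    output = apply_and_show()
    print(output if output is not None else "esxcli not found; nothing changed")
```

The sketch acts on the local host only, so it would need to be run on every host in the cluster, as step 4 requires.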