VMware vSAN 8.0.x
VMware vSAN 7.0.x
The vSAN disk capacity has reached the threshold in the scenario of one host failure.
In addition to the basic limit health check, there is also a simulation of what resources would look like after an ESXi host failure. If a single ESXi host fails, two things happen. First, the resources on that ESXi host (such as cache and capacity) are no longer available. Second, vSAN attempts to re-protect (rebuild) all components belonging to objects that are now running with reduced redundancy due to the failure.
This health check simulates both actions described above. It assumes that the ESXi host with the most resources consumed fails, then calculates how much of the resources on the remaining hosts in the cluster would be used and how much would still be available.
Note: If there is already a failure in the cluster, this test will report on one additional failure. Therefore, this test reports on the results of the current failure and the additional failure that it introduces.
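The simulation described above can be illustrated with a minimal sketch. This is not the vSAN implementation; it is a simplified aggregate model that assumes all data from the failed host is rebuilt on the remaining hosts. The Host class, the simulate_host_failure function, and the host names and sizes are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str           # hypothetical host name
    capacity_gb: float  # raw capacity contributed by the host
    used_gb: float      # capacity currently consumed on the host


def simulate_host_failure(hosts):
    """Return (failed_host_name, used_pct_after_failure).

    Assumption: the host with the most consumed capacity fails, and all
    of its data is rebuilt on the remaining hosts (aggregate view only,
    no placement rules or fault domains).
    """
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to simulate a failure")
    failed = max(hosts, key=lambda h: h.used_gb)
    remaining_capacity = sum(h.capacity_gb for h in hosts if h is not failed)
    total_used = sum(h.used_gb for h in hosts)  # includes data to be rebuilt
    return failed.name, 100.0 * total_used / remaining_capacity


hosts = [
    Host("esx-01", 10000, 6000),
    Host("esx-02", 10000, 5000),
    Host("esx-03", 10000, 4000),
    Host("esx-04", 10000, 5000),
]
failed, used_pct = simulate_host_failure(hosts)
print(f"{failed} fails: {used_pct:.1f}% of remaining capacity used")
```

In this sketch, a result above 100% would correspond to the case where there is not enough capacity left to re-protect every object.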
In vSphere 6.7 Update 3 and later releases, the health check name is updated to "Capacity Utilization".
Q: What does it mean when it is in an error state?
If this check reports that more than 100% of resources would be used after a host failure, it means that re-protection would fail for some objects because there are not enough resources available.
Note: This health check simulation is very simple. It only looks at cluster aggregate resources, so just like the basic limits check, it does not consider the distribution and placement rules.
However, this simple simulation does verify that a vSAN cluster has been configured with enough resources to operate safely after a failure and the subsequent re-protection. This test does not check for balance and fault domains, so these need to be considered independently of this test.
For example, a user may enforce an operational business policy to have no less than 25% free disk space under normal conditions and no less than 15% free disk space after one failure. This check can be used to implement such a policy and to verify that this is indeed the case.
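The policy in this example can be expressed as a small check. This is a sketch under stated assumptions, not part of the product: the function names, parameters, and capacity figures are hypothetical, and the post-failure figure uses the same aggregate model as the health check (the failed host's capacity is removed while all consumed data must still fit).

```python
def free_pct(capacity_gb, used_gb):
    """Percentage of capacity that is still free."""
    return 100.0 * (capacity_gb - used_gb) / capacity_gb


def meets_policy(total_capacity_gb, total_used_gb, failed_host_capacity_gb,
                 min_free_normal=25.0, min_free_after_failure=15.0):
    """Check the example policy: 25% free normally, 15% free after one failure.

    Assumption: after the failure, the cluster loses the failed host's
    capacity but still has to store all of the consumed data.
    """
    normal = free_pct(total_capacity_gb, total_used_gb)
    after = free_pct(total_capacity_gb - failed_host_capacity_gb, total_used_gb)
    return normal >= min_free_normal and after >= min_free_after_failure


# Hypothetical four-host cluster, 10,000 GB per host.
print(meets_policy(40000, 24000, 10000))  # 40% free now, 20% after failure
print(meets_policy(40000, 28000, 10000))  # 30% free now, about 6.7% after failure
```

The second call shows a cluster that satisfies the normal-conditions threshold but would violate the post-failure threshold.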