The next iteration of the vSAN Skyline Health check for the performance service on the same cluster shows the following on the vCenter UI Events page.
VMware vSAN 7.0.x
VMware vSAN 8.0.x
The health alert is generated because the vSAN health check was unable to locate the vSAN performance stats object. The object is nevertheless present: the subsequent health check iteration succeeded, and a manual check under vSAN Cluster > Configure > Services shows that the performance service is enabled, healthy, and compliant, as mentioned in the previous section.
The health check fails to retrieve the performance stats object on some iterations because the API call responsible for this check exceeds the timeout defined for it in the code.
2024-05-10T22:33:06.524Z ERROR vsan-mgmt[3244250] [VsanHealthThreadMgmt::join opID=noOpId] Not all tasks are finished with timeout 10
Traceback (most recent call last):
File "xx/xx/xx/xx/xx/xx.py", line 408, in join
File "/xx/xx/xx/xx/xx.py", line 241, in as_completed
raise TimeoutError(
concurrent.futures._base.TimeoutError: 4 (of 4) futures unfinished
.
2024-05-10T22:33:37.382Z INFO vsan-mgmt[766390] [VsanVcPerformanceManagerImpl::QueryClusterHealth opID=noOpId] QueryClusterHealth objInfo: (vim.cluster.VsanObjectInformation) {
directoryName = 'unknown'
}
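The traceback above comes from Python's standard concurrent.futures module. As a rough illustration only (this is not the vSAN health service code), the following sketch shows how as_completed() raises a TimeoutError of the same form when the submitted tasks do not finish within the given timeout. The names slow_query and run_checks, and the delay values, are hypothetical and chosen only for the demonstration.

```python
import concurrent.futures
import time


def slow_query(delay_s):
    # Hypothetical stand-in for a health query that takes delay_s seconds.
    time.sleep(delay_s)
    return delay_s


def run_checks(delays, timeout_s):
    """Run one task per delay and collect the results that finish in time.

    Returns (results, error); error holds the TimeoutError message, or None
    if every task completed within timeout_s.
    """
    results, error = [], None
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(delays))
    futures = [pool.submit(slow_query, d) for d in delays]
    try:
        # as_completed() raises TimeoutError once timeout_s has elapsed
        # and some futures are still pending.
        for fut in concurrent.futures.as_completed(futures, timeout=timeout_s):
            results.append(fut.result())
    except concurrent.futures.TimeoutError as exc:
        error = str(exc)
    finally:
        # Let the worker threads finish before returning.
        pool.shutdown(wait=True)
    return results, error


if __name__ == "__main__":
    # Four tasks of 1 second each against a 0.2 second timeout.
    print(run_checks([1.0, 1.0, 1.0, 1.0], timeout_s=0.2))
```

When none of the four tasks finishes in time, the error string has the same "N (of M) futures unfinished" form as the message in the log excerpt.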
++ Some calls on the same thread (766390) take more than 10 seconds, which results in a health alert on the vCenter Events tab and in the vSAN Skyline Health check. (The thread number may differ depending on the environment.)
2024-05-10T22:33:37.368Z INFO vsan-mgmt[766390] [VsanPyVmomiProfiler::logProfile opID=noOpId] VsanVcObjectHelper.isMismatch: 11.39s, 11.41s, 4.48s, 4.47s, 4.50s, 4.51s
2024-05-10T22:33:40.222Z INFO vsan-mgmt[766390] [VsanHealthSummaryLogUtil::PrintHealthResult opID=noOpId] Cluster xxx Overall Health : yellow
Group cluster health : yellow
Test consistentconfig health : yellow
Issues: Host Disk Issue Recommendation
(Host-xxx, '', PerformanceServiceIsTurnedOnInClusterConfiguration,ButItIsNotEnabledYet., Auto-RemediationIsEnabled.See'AskVmware'ForMoreInformation.),
Group perfsvc health : yellow
Test perfsvcstatus health : yellow
Details: Result Status
(Yellow, PerformanceServiceIsDisabled)
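To check whether an environment shows the same pattern, the profiler lines can be scanned for call durations above the 10-second timeout. The sketch below is a minimal example that assumes the line format of the VsanPyVmomiProfiler::logProfile entry shown above. The function name slow_calls and the regular expressions are illustrative, not part of any VMware tooling, and the log line is passed in as a string because the log file location is not stated here.

```python
import re

# Timeout value taken from the "Not all tasks are finished with timeout 10" log line.
THRESHOLD_S = 10.0

# Assumed format: "... [VsanPyVmomiProfiler::logProfile opID=...] <call>: 1.23s, 4.56s"
PROFILE_RE = re.compile(r"logProfile[^\]]*\]\s+(?P<call>[\w.]+):\s+(?P<times>.+)$")
DURATION_RE = re.compile(r"(\d+(?:\.\d+)?)s")


def slow_calls(line, threshold_s=THRESHOLD_S):
    """Return (call_name, durations above threshold_s), or None for other lines."""
    match = PROFILE_RE.search(line)
    if not match:
        return None
    durations = [float(v) for v in DURATION_RE.findall(match.group("times"))]
    return match.group("call"), [d for d in durations if d > threshold_s]


if __name__ == "__main__":
    sample = (
        "2024-05-10T22:33:37.368Z INFO vsan-mgmt[766390] "
        "[VsanPyVmomiProfiler::logProfile opID=noOpId] "
        "VsanVcObjectHelper.isMismatch: 11.39s, 11.41s, 4.48s, 4.47s, 4.50s, 4.51s"
    )
    print(slow_calls(sample))
```

For the sample line, the two calls of 11.39 and 11.41 seconds are reported as exceeding the threshold.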
This is a rare situation in which the API call times out because of the large number of clusters managed by a particular vCenter.
https://knowledge.broadcom.com/external/article/326925/silencing-a-vsan-health-check.html