Unable to query vSAN health information. Check vSphere Client logs for details.
vsansystem.1: 2023-03-16T22:12:18.554Z info vsansystem[2101059] [vSAN@6876 sub=AdapterServer opId=9d387c7a-c110] Invoking 'queryVsanPerf' on 'vsan-performance-manager' session '52be4ad3-dcba-96ff-e5e7-089696dbd232' active 16
vsansystem.1: 2023-03-16T22:12:18.554Z verbose vsansystem[2101059] [vSAN@6876 sub=PyBackedMO opId=9d387c7a-c110] Enter vim.cluster.VsanPerformanceManager.queryVsanPerf, Pending: 17
vsansystem.1: 2023-03-16T22:23:29.904Z info vsansystem[2101059] [vSAN@6876 sub=PyBackedMO opId=9d387c7a-c110] Exit vim.cluster.VsanPerformanceManager.queryVsanPerf (671349 ms)
vsansystem.1: 2023-03-16T22:23:29.905Z warning vsansystem[2101059] [vSAN@6876 sub=IO.Connection opId=9d387c7a-c110] Failed to write buffer to stream; <io_obj p:0x0000001749f46fa8, h:124, <TCP '127.0.0.1 : 9096'>, <TCP '0.0.0.0 : 0'>> e: 32(Broken pipe), async: false, duration: 0msec
vsansystem.1: 2023-03-16T22:23:29.906Z error vsansystem[2101059] [vSAN@6876 sub=VsanSoapSvc.HTTPService opId=9d387c7a-c110] Failed to write to response stream; <<io_obj p:0x0000001749f46fa8, h:124, <TCP '127.0.0.1 : 9096'>, <TCP '0.0.0.0 : 0'>>, 52be4ad3-dcba-96ff-e5e7-089696dbd232>, N7Vmacore15SystemExceptionE(Broken pipe: The communication pipe/socket is explicitly closed by the remote service.)
vsansystem.1: 2023-03-16T22:23:29.907Z error vsansystem[2101059] [vSAN@6876 sub=AdapterServer opId=9d387c7a-c110] Failed to send response to the client: N7Vmacore11IOExceptionE(System exception while transmitting HTTP Response:
vsansystem.1: 2023-03-16T22:23:29.908Z info vsansystem[2101059] [vSAN@6876 sub=IO.Connection opId=9d387c7a-c110] Failed to shutdown socket; <io_obj p:0x0000001749f46fa8, h:124, <TCP '127.0.0.1 : 9096'>, <TCP '0.0.0.0 : 0'>>, e: 104(shutdown: Connection reset by peer)
2023-03-16T22:19:40.999Z warning vsanvcmgmtd[06161] [vSAN@6876 sub=Py2CppStub opId=9d38863a] Exit host-72859::vim.cluster.VsanPerformanceManager.queryNodeInformation (1115332 ms)
2023-03-16T22:19:41.101Z warning vsanvcmgmtd[02322] [vSAN@6876 sub=Py2CppStub opId=9d388672] Exit host-72859::vim.host.VsanSystemEx.queryHostStatusEx (1089715 ms)
2023-03-16T22:19:42.217Z warning vsanvcmgmtd[04609] [vSAN@6876 sub=Py2CppStub opId=9d388676] Exit host-72859::vim.host.VsanHealthSystem.getHclInfo (1085903 ms)
2023-03-16T22:19:42.221Z warning vsanvcmgmtd[04655] [vSAN@6876 sub=Py2CppStub opId=9d38870b] Exit host-72859::vim.host.VsanHostEventsProcessor.isEventQueueFull (1028006 ms)
2023-03-16T22:19:42.221Z warning vsanvcmgmtd[04661] [vSAN@6876 sub=Py2CppStub opId=9d38869b] Exit host-72859::vim.host.VsanHealthSystem.waitForVsanHealthGenerationIdChange (1083920 ms)
2023-03-16T22:19:43.429Z warning vsanvcmgmtd[04617] [vSAN@6876 sub=Py2CppStub opId=9d388710] Exit host-72859::vim.host.VsanHealthSystem.getHclInfo (1027108 ms)
2023-03-16T22:19:46.593Z warning vsanvcmgmtd[04604] [vSAN@6876 sub=Py2CppStub opId=SWI-48612b4b-871d] Exit host-72859::vim.cluster.VsanObjectSystem.queryObjectIdentities (1023049 ms)
2023-03-16T21:38:04.223Z ERROR vsan-mgmt[56740] [VsanClusterHealthSystemImpl::PerHostQueryObjectHealthSummary opID=noOpId] Error to query object health for host XXXX
Traceback (most recent call last):
  File "bora/vsan/health/esx/pyMo/VsanClusterHealthSystemImpl.py", line 973, in PerHostQueryObjectHealthSummary
  File "/usr/lib/vmware/site-packages/pyVmomi/VmomiSupport.py", line 595, in <lambda>
    self.f(*(self.args + (obj,) + args), **kwargs)
  File "/usr/lib/vmware/site-packages/pyVmomi/VmomiSupport.py", line 385, in _InvokeMethod
    return self._stub.InvokeMethod(self, info, args)
VsanHealthThreadMgmt.TimeoutException
vsanmgmt.2: 2023-03-17T12:42:35.055Z info vsand[2693843] [opID=05fdde2e-a46f statsdb::QueryStats] table: VirtualMachine, startTime: 2023-03-17 12:00:34.732000+00:00, endTime: 2023-03-17 12:05:34.732000+00:00
vsanmgmt.2: 2023-03-17T12:42:35.144Z info vsand[2100901] [opID=05fdda6e-a328 statsdb::QueryStats] table: VirtualMachine, startTime: 2023-03-17 11:49:40.576000+00:00, endTime: 2023-03-17 11:54:40.576000+00:00
vsanmgmt.2: 2023-03-17T12:42:35.234Z info vsand[2101070] [opID=05fdda90-a35b statsdb::QueryStats] table: VirtualMachine, startTime: 2023-03-17 11:49:41.200000+00:00, endTime: 2023-03-17 11:54:41.200000+00:00
vsanmgmt.2: 2023-03-17T12:42:35.321Z info vsand[2101058] [opID=05fdda4c-a325 statsdb::QueryStats] table: VirtualMachine, startTime: 2023-03-17 11:49:38.274000+00:00, endTime: 2023-03-17 11:54:38.274000+00:00
From the log pattern, the queries above are the VM performance queries issued by vROps.
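As an illustration, a small log-parsing sketch can make this pattern visible; the log file name and the regular expression below are assumptions based on the sample lines above.

```python
import re
from collections import Counter

# Matches the statsdb::QueryStats lines shown above, e.g.:
# ... [opID=05fdde2e-a46f statsdb::QueryStats] table: VirtualMachine, ...
PATTERN = re.compile(
    r"\[opID=(?P<opid>\S+) statsdb::QueryStats\] table: (?P<table>\w+)")

def count_query_stats(log_path="vsanmgmt.2"):
    """Count statsdb::QueryStats entries per table to spot bulk VM queries."""
    tables = Counter()
    with open(log_path, errors="replace") as log:
        for line in log:
            match = PATTERN.search(line)
            if match:
                tables[match.group("table")] += 1
    return tables

if __name__ == "__main__":
    for table, count in count_query_stats().most_common():
        print(f"{table}: {count}")
```

A sustained high count for the VirtualMachine table during the collection window is consistent with vROps-driven VM queries.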
The vROps version the customer is running against this vCenter is 8.3.0 (19375713), Edition: Advanced.
From the vCenter logs, under /commands/lstool.txt, we find that vROps is integrated:
Attributes:
Capabilities: VC-trusts
-------------------------------------------------------
Name: com.vmware.vrops.label
Description: com.vmware.vrops.summary
Service Product: com.vmware.cis
Service Type: com.vmware.vrops
Service ID: xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx_com.vmware.vrops
Site ID: default-first-site
Owner ID: [email protected]
Version: 6.7.0.000000
Endpoints:
Type: com.vmware.cis.common.resourcebundle
Protocol: https
URL: https://Hostname:443/catalog/com.vmware.vrops_catalog.zip
Endpoint Attributes:
com.vmware.cis.common.resourcebundle.basename: cis.vcextension.com_vmware_vrops.ResourceBundle
Products:
VMware vRealize Operations 8.3.x
VMware vSAN 7.x
VMware vSAN 8.x
Cause:
It is not recommended to query metrics for multiple VMs in a single API call: in a large-scale setup there will be thousands of VMs in a cluster, and the vROps vSAN adapter additionally issues 400 query specs for disk-group/cache-disk/capacity-disk objects. The slowness is caused by these bulk vROps vSAN adapter API calls.
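To make the scale of the call concrete, below is a minimal sketch of this bulk-query pattern using the vSAN Management SDK for Python (vsanapiutils/vsanmgmtObjects, assumed to be installed). The vCenter host name, credentials, cluster MoID, VM UUIDs, and the 5-minute window are placeholders; the sketch shows only the one-spec-per-entity batching that overloads the vSAN performance service, not what vROps literally executes.

```python
import ssl
from datetime import datetime, timedelta

from pyVim.connect import SmartConnect
from pyVmomi import vim
import vsanmgmtObjects  # vSAN Mgmt SDK bindings; registers vim.cluster.Vsan* types
import vsanapiutils

# Placeholder connection details for illustration only.
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="[email protected]",
                  pwd="***", sslContext=ctx)
perf_mgr = vsanapiutils.GetVsanVcMos(si._stub, context=ctx)[
    "vsan-performance-manager"]

# One VsanPerfQuerySpec per VM: batching thousands of VMs (plus ~400
# disk-group/cache-disk/capacity-disk specs) into a single queryVsanPerf
# call keeps the host-side invocation running for minutes; compare the
# 671349 ms Exit line in vsansystem.log above.
end = datetime.utcnow()
start = end - timedelta(minutes=5)
query_specs = [
    vim.cluster.VsanPerfQuerySpec(
        entityRefId="virtual-machine:%s" % vm_uuid,  # placeholder UUIDs
        startTime=start,
        endTime=end)
    for vm_uuid in ["<vm-uuid-1>", "<vm-uuid-2>"]  # thousands in practice
]
cluster = vim.ClusterComputeResource("domain-c8", si._stub)  # placeholder MoID
results = perf_mgr.VsanPerfQueryPerf(querySpecs=query_specs, cluster=cluster)
```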
Resolution:
Upgrade vROps to 8.10 or later, which includes a few performance optimizations that mitigate the load on the vSAN server when querying disk groups and disks from vSAN.
Workaround:
Option 1:
On EACH node (VM) of the vROps cluster, edit the file '/usr/lib/vmware-vcops/user/plugins/inbound/VirtualAndPhysicalSANAdapter3/conf/config.properties' and change the value of the "ENABLE_VM_DISCOVERY" property to 'false'.
Then, from the vROps UI, stop and start the vSAN adapter instance that monitors the affected vSAN environment.
Please note that the config.properties change affects all vSAN adapter instances once they are restarted; a scripted version of the property edit is sketched below.
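As referenced above, here is a minimal sketch of the property edit that could be run on each vROps node. It assumes a plain KEY=VALUE properties format and uses the path from Option 1; backing up the file first is advisable.

```python
from pathlib import Path

# Path from Option 1 above; run this on EACH vROps cluster node.
CONF = Path("/usr/lib/vmware-vcops/user/plugins/inbound/"
            "VirtualAndPhysicalSANAdapter3/conf/config.properties")

def disable_vm_discovery():
    """Set ENABLE_VM_DISCOVERY to false, preserving all other properties.

    Assumes a simple KEY=VALUE format; if the key is absent, it is appended.
    """
    lines = CONF.read_text().splitlines()
    out, found = [], False
    for line in lines:
        if line.strip().startswith("ENABLE_VM_DISCOVERY"):
            out.append("ENABLE_VM_DISCOVERY=false")
            found = True
        else:
            out.append(line)
    if not found:
        out.append("ENABLE_VM_DISCOVERY=false")
    CONF.write_text("\n".join(out) + "\n")

if __name__ == "__main__":
    disable_vm_discovery()
```

Remember to stop/start the vSAN adapter instance afterwards, as described above, for the change to take effect.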
Option 2:
Note: once upgraded to 8.10+, navigate to the particular vSAN adapter instance from Integrations; under Advanced Settings there is an option to disable VM perf data collection.
If the customer keeps running a vROps version lower than 8.10, they will need to disable vSAN VM discovery from the config property file, and they will also lose the "Storage Policy compliance status" property.
However, if they upgrade to vROps 8.10+, the upgrade alone may resolve the issue; if it does not, they can additionally disable VM perf data collection from the vSAN adapter instance UI configuration, in which case they lose only the "Percentage of Consumers facing Disk Latency (%)" metric on the vSAN Datastore object.
Impact/Risks:
Slowness while loading the vSAN views for the cluster, and the alert "Unable to query vSAN health information. Check vSphere Client logs for details." shown on the vSphere UI page.