When too many queries run concurrently, search APIs fail intermittently if the OpenSearch/Elasticsearch query cache occupies a large portion of the allocated heap.
Impact:
Failed to get the report - An unknown error has occurred
Logs:
/var/log/search
2024-03-26T06:57:35.900Z WARN http-nio-127.0.0.1-7440-exec-284 IndexingMetadataHelper 4492 - [nsx@6876 comp="nsx-manager" level="WARNING" reqId="68edf740-c06c-43a6-9894-11b2b80abd0a" subcomp="manager" username="[email protected]"] Could not fetch indexing position from ES, error: ElasticsearchStatusException[Elasticsearch exception [type=circuit_breaking_exception, reason=[parent] Data too large, data for [<http_request>] would be [1862171586/1.7gb], which is larger than the limit of [1860491673/1.7gb], real usage: [1862169944/1.7gb], new bytes reserved: [1642/1.6kb]]
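To confirm that a node is hitting this condition, the log directory can be searched for circuit breaker trips. The following is a minimal sketch, not an official tool: the function name breaker_trips is hypothetical, and it assumes the log lines follow the format of the sample above (timestamp first, then the "would be [...]" and "limit of [...]" byte counts).

```shell
#!/bin/sh
# Hypothetical helper: list parent circuit breaker trips found under a log directory.
# Assumes each matching line starts with a timestamp and contains
# "would be [<bytes>/...]" and "limit of [<bytes>/...]" as in the sample log line.
breaker_trips() {
    # -r: search the directory recursively, -h: omit file names from the output.
    grep -rh 'circuit_breaking_exception' "$1" 2>/dev/null |
        sed -n 's/^\([^ ]*\) .*would be \[\([0-9]*\)\/[^]]*\].*limit of \[\([0-9]*\)\/[^]]*\].*/\1 would_be=\2 limit=\3/p'
}

# Example: breaker_trips /var/log/search
```

Each output line shows when the request was rejected, the heap usage in bytes that the request would have caused, and the breaker limit in bytes.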
VMware NSX
Frequent searches on any entity type that has more than 10k entities cause the queries to be cached. As a result, the heap used by OpenSearch grows significantly.
This issue is resolved in VMware NSX 4.1.2.4
This issue is resolved in VMware NSX 4.2.0
Workaround:
Clear the query cache of Elasticsearch/OpenSearch using the following command:
curl -X POST "localhost:9200/_cache/clear?query=true"
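To check that clearing the cache actually freed memory, the query cache size can be read before and after the call. This is a sketch under assumptions: the function names are hypothetical, ES_URL defaults to the endpoint used in the workaround, and the size is read from the index stats endpoint (_stats/query_cache) by taking the first memory_size_in_bytes value in the response, which is expected to be the total across primary shards.

```shell
#!/bin/sh
# Endpoint taken from the workaround command; override ES_URL if it differs.
ES_URL="${ES_URL:-localhost:9200}"

# Read stats JSON on stdin and print the first memory_size_in_bytes value.
cache_bytes() {
    grep -o '"memory_size_in_bytes" *: *[0-9]*' | head -n 1 | grep -o '[0-9]*$'
}

# Print the current query cache size in bytes.
query_cache_bytes() {
    curl -s "$ES_URL/_stats/query_cache" | cache_bytes
}

# Clear the query cache and print its size before and after.
clear_query_cache() {
    echo "before: $(query_cache_bytes) bytes"
    curl -s -X POST "$ES_URL/_cache/clear?query=true" >/dev/null
    echo "after:  $(query_cache_bytes) bytes"
}
```

After loading these functions in a shell on the affected node, run clear_query_cache. The cache fills again as new searches arrive, so the reduction is temporary.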