Event Drops in VMware Aria Operations for Logs due to Pending Queue Overload

Article ID: 381167

Products

VMware Aria Suite

Issue/Introduction

VMware Aria Operations for Logs is experiencing significant event drops when forwarding logs. The issue stems from the cluster's pending queue becoming overloaded, causing it to drop incoming events.

A review of the VMware Aria Operations for Logs logs shows multiple instances where events were dropped because the cluster's pending queue was full. The following log snippets illustrate the issue:

[2024-07-31 02:18:20.016+0000] ["PersistentNotification-thread-42"/XX.XX.XX.X INFO] [com.vmware.loginsight.daemon.notifications.PersistentNotificationQueue] [Sending notification '

{"AlertId":"Event Forwarder Events Dropped","Name":"Event Forwarder Events Dropped","Description":"VMware Aria Operations for Logs just dropped 203 events for forwarder target 'example.example.com', reason: Pending queue is full..\n\nThis message was generated by your VMware Aria Operations for Logs installation, visit the <a href='https://www.vmware.com/support/pubs/log-insight-pubs.html'>Documentation Center</a> for more information.","TriggerTime":"2024-07-31T02:18:19.659Z"}
', using notification provider 'com.vmware.loginsight.notifications.JsonLogNotificationProvider' attempt #1]

[2024-07-31 02:18:20.450+0000] ["ImportingThread-4"/XX.XX.XX.XX WARN] [com.vmware.loginsight.ingestion.forwarding.BaseForwarder] [Dropped 584 events for target example.example.com, reason: Pending queue is full. [5221 suppressed]]
[2024-07-31 02:18:50.452+0000] ["ImportingThread-3"/XX.XX.XX.XX WARN] [com.vmware.loginsight.ingestion.forwarding.BaseForwarder] [Dropped 72 events for target example.example.com, reason: Pending queue is full. [9154 suppressed]]
[2024-07-31 04:18:30.265+0000] ["ImportingThread-4"/XX.XX.XX.XX WARN] [com.vmware.loginsight.ingestion.forwarding.BaseForwarder] [Dropped 510 events for target example.example.com, reason: Pending queue is full. [7380 suppressed]]
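
To gauge how many events are being lost, the WARN lines above can be totalled per forwarder target. The following is a minimal sketch, not a VMware-provided tool: the regular expression is derived only from the three BaseForwarder lines shown above, and the function name `summarize_drops` is hypothetical. The article does not explain the "suppressed" counter, so the sketch reports it separately without interpreting it.

```python
import re
from collections import defaultdict

# Assumption: drop lines follow the BaseForwarder WARN format shown in the
# snippets above. Other product versions may log a different layout.
DROP_RE = re.compile(
    r"BaseForwarder\] \[Dropped (?P<dropped>\d+) events for target "
    r"(?P<target>\S+), reason: (?P<reason>.+?)\. "
    r"\[(?P<suppressed>\d+) suppressed\]\]"
)


def summarize_drops(lines):
    """Total dropped events per (target, reason) pair.

    Returns a dict mapping (target, reason) to a dict with the keys
    "dropped", "suppressed", and "lines".
    """
    totals = defaultdict(lambda: {"dropped": 0, "suppressed": 0, "lines": 0})
    for line in lines:
        match = DROP_RE.search(line)
        if match is None:
            continue  # not a forwarder drop line
        entry = totals[(match["target"], match["reason"])]
        entry["dropped"] += int(match["dropped"])
        # Kept as a separate counter because its meaning is not documented here.
        entry["suppressed"] += int(match["suppressed"])
        entry["lines"] += 1
    return dict(totals)
```

Passing an open log file (or any iterable of lines) to `summarize_drops` yields one entry per target and reason, which makes it easier to compare drop volume before and after applying a change from the Resolution section.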

Environment

VMware Aria Operations for Logs 8.x

Cause

The "Pending queue is full" reason in the logs shows that events are dropped because the forwarder's pending queue has reached capacity. This suggests the cluster cannot process and forward incoming events as fast as they arrive, so a backlog builds up in the pending queue and new events are dropped.

Resolution

There are four potential solutions to address the pending queue overload:

  1. Increase Worker Threads: The worker count sets the number of simultaneous outgoing connections the forwarder uses. A higher worker count is normally needed when network latency to the forwarding destination is higher or when more events per second are forwarded. The value can be increased up to a maximum of 512. For the procedure, see "How to increase the worker count" (step 4).

  2. Implement Forwarder Filtering (Experimental): Filtering can be implemented on the log forwarder in the source cluster. This discards unwanted events before they are sent to the destination, reducing the overall load. Be aware that filtered events are not forwarded and will be lost.

  3. Scale Out the Cluster: Adding another node to the destination cluster increases overall processing capacity and reduces the load on the individual nodes. For the procedure, see "Adding a node to the cluster".

  4. Refine Forwarding Rules: Distributing the event load across multiple Log Forwarding rules reduces the processing burden on each individual rule, minimizing the risk of dropped events.
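
For option 1, the relationship between latency, event rate, and worker count can be approximated with Little's law (requests in flight = request rate × time per request). The sketch below is a back-of-the-envelope estimate only, not a VMware sizing formula. It assumes each worker has one request in flight at a time, and the parameters `events_per_request` and `headroom` are hypothetical inputs chosen for illustration. The only value taken from this article is the upper limit of 512.

```python
import math

MAX_WORKERS = 512  # upper limit stated in this article


def estimate_workers(events_per_second, round_trip_seconds,
                     events_per_request=1, headroom=1.5):
    """Rough worker-count estimate based on Little's law.

    Assumption: each worker sends one request, waits one round trip,
    then sends the next. The result is clamped to the range 1..MAX_WORKERS.
    """
    if events_per_second < 0 or round_trip_seconds < 0:
        raise ValueError("rate and latency must be non-negative")
    if events_per_request < 1 or headroom <= 0:
        raise ValueError("events_per_request must be >= 1 and headroom > 0")
    # Requests that must be in flight to sustain the event rate.
    in_flight = events_per_second * round_trip_seconds / events_per_request
    needed = max(1, math.ceil(in_flight * headroom))
    return min(needed, MAX_WORKERS)


# Example: 4000 events/s, 250 ms round trip, 100 events per request.
print(estimate_workers(4000, 0.25, events_per_request=100))  # prints 15
```

If the estimate reaches the 512 cap, raising the worker count alone is unlikely to be enough, and the other options (filtering, scaling out, or splitting forwarding rules) become more relevant.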

Additional Information

  • The provided log snippets confirm dropped events due to a full pending queue.
  • Consider network latency to the forwarding destination as a potential factor contributing to the overload.