Malware Analysis Appliance (MAA) Yellow health state with error "Data insert queue size exceeding threshold"
The bulk-event-insert queue is the post-processing queue. Once a task finishes, MAA runs post-processing: it matches the results against patterns and extracts the data from the intelliVM. A high value for this queue means that post-processing has either stopped or cannot keep up with the load. The error "Data insert queue size exceeding threshold" is raised when the post-processing queue backs up. Possible causes include DNS issues, over-driving the system, and an unexpected surge of file submissions.
1. Restart mq-consume-events with the following command:
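The backlog described above can be checked from the command line. The following is a minimal sketch, not a documented MAA procedure: it assumes the standard RabbitMQ command `rabbitmqctl list_queues name messages` is available to the 'g2' user on the appliance, and the helper names `queue_depth` and `over_threshold` as well as the limit of 10000 are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `rabbitmqctl list_queues name messages` output
# on stdin and print the message count of the queue named in $1.
# Exits non-zero if the queue is not listed.
queue_depth() {
    awk -v q="$1" '$1 == q { print $2; found = 1 } END { if (!found) exit 1 }'
}

# Hypothetical helper: succeed when depth ($1) is greater than limit ($2).
over_threshold() {
    [ "$1" -gt "$2" ]
}

# Example usage on the appliance (assumes rabbitmqctl is on the PATH and
# that 10000 is a sensible limit; neither is confirmed by this article):
#   depth=$(sudo rabbitmqctl list_queues name messages | queue_depth bulk-event-insert)
#   over_threshold "$depth" 10000 && echo "bulk-event-insert backlog: $depth"
```

A count that keeps growing between checks suggests that post-processing has stopped or is falling behind, which is the condition the health warning reports.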
[email protected]:/opt/mag2# sudo supervisorctl restart mq-consume-events:*
2. If restarting mq-consume-events does not resolve the issue, clear the queue that has filled up. Use the following procedure:
i. Log in to the device via SSH as the 'g2' user
ii. Check the currently open firewall ports. Defaults are SSH, HTTP, and HTTPS
[email protected]:~$ df-config-mgr --dump | grep external
network.external_ports=22/tcp, 80/tcp, 443/tcp
iii. Add the rabbitmq service to the open ports:
[email protected]:~$ df-config-mgr -w network.external_ports "22/tcp, 80/tcp,443/tcp, 55672/tcp"
iv. Open a browser and go to http://<maa>:55672/
Log in with user name 'guest' and password 'guest'
The RabbitMQ web GUI should be displayed
v. Browse to the Queues tab
vi. Click on the bulk-event-insert queue
vii. Click on Delete/purge at or near the bottom of the page. Click on the Purge button on the right side of the page. A green 'Queue purged' message should display; click Close
viii. Close the browser window
ix. Remove the rabbitmq service from the open network ports:
[email protected]:~$ df-config-mgr -w network.external_ports "22/tcp, 80/tcp,443/tcp"
x. Reboot the device
3. To avoid similar events, leave 'detailed event capture' disabled by default, and monitor file submissions from upstream devices (CAS/ASG and SA), because the resulting task processing can cause the queue to grow too large.
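Steps iii and ix of the purge procedure edit the port list by hand, which makes it easy to drop an existing entry or leave port 55672 open. The sketch below builds the new list from the current one instead. The helper names `add_port` and `remove_port` are hypothetical; only the `df-config-mgr` invocations shown in the steps above come from this article.

```shell
#!/bin/sh
# Hypothetical helper: print LIST ($1) with PORT ($2) appended, unless it is
# already present. Entries are normalized to "a, b, c" spacing.
add_port() {
    _list=$(printf '%s' "$1" | tr -d '[:space:]' | sed 's/,/, /g')
    case ", $_list," in
        *", $2,"*) printf '%s\n' "$_list" ;;
        *)         printf '%s, %s\n' "$_list" "$2" ;;
    esac
}

# Hypothetical helper: print LIST ($1) with every exact match of PORT ($2)
# removed, again normalized to "a, b, c" spacing.
remove_port() {
    printf '%s' "$1" | tr -d '[:space:]' | tr ',' '\n' |
        grep -v -x -F -- "$2" | paste -s -d, - | sed 's/,/, /g'
}

# Example usage (assumes the dump line has the form shown in step ii,
# "network.external_ports=22/tcp, 80/tcp, 443/tcp"):
#   current=$(df-config-mgr --dump | grep external | cut -d= -f2-)
#   df-config-mgr -w network.external_ports "$(add_port "$current" 55672/tcp)"
#   ... purge the queue in the RabbitMQ web GUI ...
#   df-config-mgr -w network.external_ports "$(remove_port "$current" 55672/tcp)"
```

Deriving the list from the current setting keeps any non-default ports that were already open, which the literal values in steps iii and ix would overwrite.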