NSX Network Detection and Response - Number of messages in db.inserted_pcaps.xxxx: xxxxxxx exceeds threshold
Article ID: 333887

Products

VMware vDefend Network Detection and Response

Issue/Introduction

Symptoms:

  • Running lastline_test_appliance reported the following ERROR:
> HARDWARE: OK
> NETWORK: OK
> SOFTWARE:
>  FAILURE: Number of messages in db.inserted_pcaps.xxxx: 51665 exceeds threshold: 10000
> Max total number of messages: 51665 exceeds threshold: 100000
> Exiting with error-code 3

  • The Manager node's RabbitMQ backlog was piling up when checked with the command "rabbitmqctl -p llq_v1 list_queues | grep -v -w 0$"
Listing queues
db.inserted_pcaps.xxxx 51665
  • Restarting the docker service did not resolve the issue
  • The db.inserted_pcaps queue kept increasing
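The backlog check above can be reproduced with a short awk filter; the sketch below runs it over stand-in sample output (the queue names and counts besides db.inserted_pcaps.xxxx are hypothetical), whereas on an appliance you would pipe the real rabbitmqctl command into the same filter.

```shell
# Stand-in sample of "rabbitmqctl -p llq_v1 list_queues" output;
# db.other_queue and its count are hypothetical illustration values.
sample='db.inserted_pcaps.xxxx 51665
db.other_queue 12'

# Print any queue whose message count exceeds the 10000-message threshold.
echo "$sample" | awk -v limit=10000 '$2 > limit {print $1, "exceeds threshold:", $2}'
```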



Environment

NSX-NDR (Lastline) 

Resolution

To work around the issue, temporarily increase the number of upload workers.

Create (or append to) the override.yaml file with an increased worker count by running the following command:

echo -e 'llupload::workers::inserted_pcaps::instances: 3' >> /etc/appliance-config/override.yaml
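After running the command, /etc/appliance-config/override.yaml should contain the new worker count; a minimal example of the resulting line (the key comes from the command above, and 3 is the example value):

```yaml
# Raises the inserted_pcaps upload worker count to 3.
llupload::workers::inserted_pcaps::instances: 3
```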

 

While this helps reduce the db.inserted_pcaps queue, it may increase the retry.X queues.

Check this with the command "rabbitmqctl -p llq_v1 list_queues | grep -v -w 0$"

Example output:

Listing queues
retry.1 41
retry.3 351
retry.4 740
retry.0 17
retry.2 110
retry.5 2267
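As a quick sanity check on how much work is pending, the retry backlog can be totaled with awk; this sketch uses the sample numbers above as stand-in input, while in production you would pipe the real rabbitmqctl output into the same awk command.

```shell
# Stand-in data copied from the example output above; replace the
# variable with real "rabbitmqctl -p llq_v1 list_queues" output in production.
sample='retry.1 41
retry.3 351
retry.4 740
retry.0 17
retry.2 110
retry.5 2267'

# Sum the second column (message counts) across all retry queues.
echo "$sample" | awk '{sum += $2} END {print sum}'   # 3526
```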

If so, the problem may lie in the PCAP analysis pipeline.

  • Check that the service is started: service uwsgi::app::pcapapi2-uwsgi status
  • Check the pcapapi2 logs: tail -n50 /var/log/pcapapi2/pcapapi_server.log

Log output such as the following indicates a problem with Suricata:

2023-08-22 08:15:29,065 - suricata_uds_client.manager - WARNING - SURICATA socket broken pipe, retry 19
2023-08-22 08:15:30,066 - pcapapi.plugin.suricata - ERROR - SURICATA: communication error (failure to connect to Suricata after 20 retries)
2023-08-22 08:15:30,066 - suricata_uds_client.manager - WARNING - SURICATA socket broken pipe, retry 0
2023-08-22 08:15:31,068 - suricata_uds_client.manager - WARNING - SURICATA socket broken pipe, retry 1
2023-08-22 08:15:32,069 - suricata_uds_client.manager - WARNING - SURICATA socket broken pipe, retry 2
2023-08-22 08:15:33,071 - suricata_uds_client.manager - WARNING - SURICATA socket broken pipe, retry 3
2023-08-22 08:15:34,072 - suricata_uds_client.manager - WARNING - SURICATA socket broken pipe, retry 4
2023-08-22 08:15:35,075 - suricata_uds_client.manager - WARNING - SURICATA socket broken pipe, retry 5
2023-08-22 08:15:36,076 - suricata_uds_client.manager - WARNING - SURICATA socket broken pipe, retry 6
2023-08-22 08:15:37,078 - suricata_uds_client.manager - WARNING - SURICATA socket broken pipe, retry 7

  • Check the status of the service: systemctl status suricata-lastline-unix-socket@suricata_0.service
  • Check the syslog file for entries matching suricata: grep -i 'suricata' /var/log/syslog | tail -n20

Log output such as the following confirms the problem with Suricata:

Aug 22 09:13:10 lastlinemanager systemd[1]: Started suricata-lastline-unix-socket@suricata_0.service.
Aug 22 09:13:11 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Main process exited, code=exited, status=1/FAILURE
Aug 22 09:13:11 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Failed with result 'exit-code'.
Aug 22 09:13:12 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Service hold-off time over, scheduling restart.
Aug 22 09:13:12 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Scheduled restart job, restart counter is at 242549.
Aug 22 09:13:12 lastlinemanager systemd[1]: Stopped suricata-lastline-unix-socket@suricata_0.service.
Aug 22 09:13:12 lastlinemanager systemd[1]: Started suricata-lastline-unix-socket@suricata_0.service.
Aug 22 09:13:12 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Main process exited, code=exited, status=1/FAILURE
Aug 22 09:13:12 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Failed with result 'exit-code'.
Aug 22 09:13:13 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Service hold-off time over, scheduling restart.
Aug 22 09:13:13 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Scheduled restart job, restart counter is at 242550.
Aug 22 09:13:13 lastlinemanager systemd[1]: Stopped suricata-lastline-unix-socket@suricata_0.service.
Aug 22 09:13:13 lastlinemanager systemd[1]: Started suricata-lastline-unix-socket@suricata_0.service.
Aug 22 09:13:13 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Main process exited, code=exited, status=1/FAILURE
Aug 22 09:13:13 lastlinemanager systemd[1]: suricata-lastline-unix-socket@suricata_0.service: Failed with result 'exit-code'.

 

Manually running Suricata from the command line with added verbosity should reveal the underlying problem:

# /usr/bin/suricata -vvvv -c /etc/suricata/suricata-lastline-unix-socket.yaml --unix-socket=/run/suricata_0/socket.sock --pidfile /run/suricata_0/suricata.pid --user=suricata &

22/8/2023 -- 14:09:05 - <Notice> - This is Suricata version 5.0.4 RELEASE running in SYSTEM mode
22/8/2023 -- 14:09:05 - <Info> - CPUs/cores online: 12
22/8/2023 -- 14:09:05 - <Config> - luajit states preallocated: 512
22/8/2023 -- 14:09:05 - <Info> - SSSE3 support not detected, disabling Hyperscan for MPM
22/8/2023 -- 14:09:05 - <Error> - [ERRCODE: SC_ERR_INVALID_YAML_CONF_ENTRY(139)] - Invalid spm algo supplied in the yaml conf file: "hs"

The error indicates that the CPU lacks the SSSE3 instruction set, which the Hyperscan pattern matcher requires; because the configuration specifies "hs" (Hyperscan) as the spm algorithm, Suricata cannot start on this hardware.
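Whether the host CPU exposes SSSE3 can be checked directly from the kernel's flag listing; this is a Linux-only sketch and reports the flag's presence rather than changing any configuration.

```shell
# Read the CPU feature flags exported by the kernel (Linux only).
flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null || true)

# Report whether Hyperscan-based matchers can run on this CPU.
case "$flags" in
  *ssse3*) echo "SSSE3 present: Hyperscan ('hs') matchers can run" ;;
  *)       echo "SSSE3 absent: Hyperscan-based spm/mpm algorithms will fail" ;;
esac
```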