Post upgrade to Onprem 1371 or Hosted 1390, sensor throws Error "Malscape Analysis timeout" / high queue utilization for queue 'llshed_queue_papi_sensorupload_analyst_sdk

search cancel

Post upgrade to Onprem 1371 or Hosted 1390, sensor throws Error "Malscape Analysis timeout" / high queue utilization for queue 'llshed_queue_papi_sensorupload_analyst_sdk_processor'

book

Article ID: 375325

calendar_today

Updated On:

Products

VMware vDefend Network Detection and Response

Issue/Introduction

Post the upgrade Onprem 1371 or Hosted 1390, in the sensor either of this error/warning seen: Malscape Analysis timeout / high queue utilization for queue 'llshed_queue_papi_sensorupload_analyst_sdk_processor': with huge backlog of pending messages.

Environment

Sensor appliance Onprem version 1371 and Hosted version 1390 version has this issues

Cause

This is due to an identified bug:

The root cause is that Memcached is in a restart loop due to the known issue. The extra latency connecting to memcached causes some lock contention which leads to the errors.

By running the commands:

grep memcached /var/log/kern.log we can see the errors related to it in the output.
docker ps | grep memcahced would show memcached container is continuously restarting.
docker logs -f rapid_memcahced_# would indicate the issue

[P.S] The # in the above command 'rapid_memcahced_#' , indicate the particular apid_memcahced container number. From docker ps | grep memcahced command output, you will know this. For instance it'd be like rapid_memcahced_1

Resolution

Changing the entry to the version memcached:1.6.18 in the /usr/share/appliance-config/hierdata/location/onpremise.yaml by running the command: sed -i "s/memcached::image_version: '1.6.21'/memcached::image_version: '1.6.18'/g" /usr/share/appliance-config/hieradata/location/onpremise.yaml"
service-lastline rapid restart
Then re-trigger.

Additional Information

Fix is expected to be available in 10.0 release

Feedback

thumb_up Yes

thumb_down No