Discover Performance Troubleshooting

Article ID: 160137

Products

Data Loss Prevention Network Discover

Issue/Introduction

My Discover scan is slower than I would expect. How can I determine where the bottleneck is?

Resolution

In general, the performance bottleneck for Discover scans is the transfer of data to the Discover machine. Discover processes on Linux run about one-third slower than on Windows.

  1. Set up your target.
  2. Check your policy groups. Are there any empty policy groups? Empty policy groups affect Discover performance.
  3. Check the network latency between the Discover machine and the target by running the following for at least 30 minutes:
    ping -t -l 8192 <file server name> > latency.txt
    • If the average latency is lower than 18 ms, go to step 4.
    • If it is higher, you are probably limited by network issues. Check the NIC settings and make sure they are compatible with the relevant switch settings. Check the number of hops and move Discover closer to the target if possible.
  4. Run the scan for 10 minutes, keeping an eye on CPU usage on the Discover machine. Then pause the scan. Calculate the scan rate in GB/Day using the information on the Scan Detail page.
    • If it is at least 250 GB/Day, performance is acceptable. Resume the scan.
    • If it is much less than 250 GB/Day, go to step 5.
  5. Was the CPU usage on the Discover machine close to 100% during the 10 minutes of scanning? 
    • If so, you are probably limited by cracking (extracting content from files). Check whether the target has a lot of XLS and/or PDF files. If it does, there is not much that can be done except add more hardware.
    • If CPU was not maxed out, go to step 6.
  6. Manually copy 2 GB worth of representative files from the target to the Discover machine and calculate the transfer rate. (It is important to copy the actual files that you will be scanning, because the average file size has a big impact on transfer rates.)
    • If this rate is slower than 300 GB/Day, you are probably limited by network issues. Check the NIC settings and make sure they are compatible with the relevant switch settings. Check the number of hops and move Discover closer to the target if possible.
    • If the rate is greater than 300 GB/Day, go to step 7.
  7. Turn off collection of ACL and owner information. In Crawler.properties, set filesystemcrawler.collectaclandowner = false. Restart the Discover monitor, then run the 10-minute test from step 4 again.
    • If this solves the problem, you need to weigh the value of scan rate against that of the security metadata. Collecting the extra metadata takes time, and there probably isn't anything that can be done to speed it up.
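
The arithmetic in steps 3, 4 and 6 can be sketched in a few lines of Python. This is a minimal illustration, not part of the product: the function names are hypothetical, the latency parser assumes English-language Windows ping output (lines containing "time=12ms" or "time<1ms"), and a gigabyte is assumed to be 1024^3 bytes because the article does not say which definition the Scan Detail page uses.

```python
import re

SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 1024 ** 3        # assumption: binary gigabytes

# Thresholds quoted in the steps above.
LATENCY_LIMIT_MS = 18           # step 3
SCAN_RATE_TARGET_GB_DAY = 250   # step 4
COPY_RATE_TARGET_GB_DAY = 300   # step 6


def average_latency_ms(ping_output: str) -> float:
    """Average round-trip time over all reply lines in ping output.

    Timed-out requests carry no "time=" field and are skipped;
    "time<1ms" is counted as 1 ms.
    """
    times = [int(ms) for ms in re.findall(r"time[=<](\d+)\s*ms", ping_output)]
    if not times:
        raise ValueError("no reply lines with a time value were found")
    return sum(times) / len(times)


def gb_per_day(bytes_moved: int, elapsed_seconds: float) -> float:
    """Extrapolate a measured transfer or scan to a GB/Day rate."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_moved / BYTES_PER_GB * SECONDS_PER_DAY / elapsed_seconds


if __name__ == "__main__":
    sample = (
        "Reply from 10.0.0.5: bytes=8192 time=12ms TTL=128\n"
        "Request timed out.\n"
        "Reply from 10.0.0.5: bytes=8192 time=20ms TTL=128\n"
    )
    latency = average_latency_ms(sample)
    print(f"average latency: {latency:.1f} ms "
          f"(limit {LATENCY_LIMIT_MS} ms)")

    # Step 6 style measurement: 2 GB copied in 10 minutes.
    rate = gb_per_day(2 * BYTES_PER_GB, 10 * 60)
    print(f"copy rate: {rate:.0f} GB/Day "
          f"(target {COPY_RATE_TARGET_GB_DAY} GB/Day)")
```

For example, copying 2 GB in 10 minutes works out to 288 GB/Day, which is just under the 300 GB/Day threshold in step 6. The same gb_per_day function applies to step 4 if you pass the bytes scanned and the elapsed time from the Scan Detail page.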