Guidelines for Provisioning Symantec Data Loss Prevention scans for File System Targets

Article ID: 170111

Products

Data Loss Prevention

Issue/Introduction

This article contains recommendations and guidelines for configuring Discover Servers to scan File System targets efficiently. You can use the attached spreadsheet to calculate the values of the recommended settings.

Resolution

Bear in mind the following factors while planning to configure Discover Servers for File System scan targets:

  • Scan throughput is affected by the available network bandwidth, number of CPU cores, and the total system memory of the participating Discover Servers.
  • Scan throughput is affected by the complexity of the configured policies.
  • A higher active user count on a particular File System server could reduce scan performance.
  • Scan performance is affected by the network distance between the participating Discover Servers and the File System target being scanned.

Tunable parameters

Symantec recommends the following settings for each Discover Server. These settings can be changed in the crawler.properties file on the Discover Server, located in <SymantecDLPinstalldirectory>\protect\config.

Note: In newer versions of Enforce, these settings are modified in the Enforce console at System > Servers and Detectors > Overview > Server / Detector Detail > Advanced Settings.

  • crawler.threadpoolsize = 30 (default is 15)
    where crawler.threadpoolsize represents the maximum number of crawler threads.
    Note: Use the recommended value only if your setup conforms to the recommended hardware configuration in the table below.

  • MessageChain.NumChains = 1 * number of CPU cores (2 * number of CPU cores if you can verify that hyper-threading is enabled)
    where MessageChain.NumChains represents the number of messages that the FileReader processes in parallel.

  • MessageChain.CacheSize = 2 * MessageChain.NumChains
    where MessageChain.CacheSize represents the size of the Detection (MessageChain) queue.

  • FileReader.MaxFileSystemCrawlerMemory = (crawler.threadpoolsize + MessageChain.NumChains + MessageChain.CacheSize) * FileReader.MaxFileSize
    where FileReader.MaxFileSystemCrawlerMemory represents the total runtime memory for all running threads.

  • BoxMonitor.FileReaderMemory = 4 * FileReader.MaxFileSystemCrawlerMemory
    where BoxMonitor.FileReaderMemory represents a dynamic memory pool holding all runtime data about the FileReader. This value should be less than the assigned system memory.
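
The formulas above can be chained into a small calculator. The following Python sketch is illustrative only: the function and field names are hypothetical, and it assumes that FileReader.MaxFileSize is supplied in MB so that the memory results also come out in MB. It also checks the guideline that BoxMonitor.FileReaderMemory should stay below the system memory.

```python
from dataclasses import dataclass


@dataclass
class DiscoverTuning:
    """Recommended values derived from the formulas in this article."""
    crawler_threadpoolsize: int
    num_chains: int               # MessageChain.NumChains
    cache_size: int               # MessageChain.CacheSize
    max_crawler_memory_mb: int    # FileReader.MaxFileSystemCrawlerMemory
    file_reader_memory_mb: int    # BoxMonitor.FileReaderMemory


def recommend_settings(cpu_cores, max_file_size_mb,
                       hyper_threading=False, crawler_threads=30):
    """Apply the formulas in order; sizes are assumed to be in MB."""
    # 1 * cores, or 2 * cores when hyper-threading is verified.
    num_chains = cpu_cores * (2 if hyper_threading else 1)
    cache_size = 2 * num_chains
    max_crawler_memory_mb = (
        crawler_threads + num_chains + cache_size
    ) * max_file_size_mb
    file_reader_memory_mb = 4 * max_crawler_memory_mb
    return DiscoverTuning(crawler_threads, num_chains, cache_size,
                          max_crawler_memory_mb, file_reader_memory_mb)


def fits_in_memory(settings, system_memory_gb):
    """BoxMonitor.FileReaderMemory should be less than system memory."""
    return settings.file_reader_memory_mb < system_memory_gb * 1024


if __name__ == "__main__":
    # 30 MB is an illustrative maximum file size, not a value from this article.
    tuning = recommend_settings(cpu_cores=16, max_file_size_mb=30)
    print(tuning)
    print("Fits in 32 GB:", fits_in_memory(tuning, system_memory_gb=32))
```

Because the result scales linearly with the maximum file size, it is worth rerunning the calculation whenever FileReader.MaxFileSize or the core count changes.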

In addition, if you use the Grid Scanning feature, Symantec recommends configuring the following settings:

  • crawler.grid.follower.queuesize = 2 * crawler.threadpoolsize
    where crawler.grid.follower.queuesize represents the maximum number of files for detection that can be added to the grid queue.
     
  • crawler.grid.queuesize.multiplier = 4 * crawler.threadpoolsize
    where crawler.grid.queuesize.multiplier represents the grid scan request queue size per detection server.
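
As a quick check of the two grid formulas, the following Python sketch derives both values from crawler.threadpoolsize and prints them as key = value lines. The function name is hypothetical; only the property names and multipliers come from this article.

```python
def recommend_grid_settings(crawler_threadpoolsize=30):
    """Derive the grid scanning settings from the crawler thread pool size."""
    return {
        "crawler.grid.follower.queuesize": 2 * crawler_threadpoolsize,
        "crawler.grid.queuesize.multiplier": 4 * crawler_threadpoolsize,
    }


if __name__ == "__main__":
    for key, value in recommend_grid_settings().items():
        print(f"{key} = {value}")
```

With the recommended thread pool size of 30, this yields 60 and 120.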

You can use the attached spreadsheet to calculate the values for all of the recommended settings.

Note: The grid scanning feature is available in Symantec Data Loss Prevention version 15.0 and later.

Scan target configuration guidelines

Symantec recommends the following guidelines for configuring File System scan targets:

  • Scan mode guidelines:
      • When you select Grid as the scan mode, ensure that the tuning parameters, specifically for grid scanning, are configured on all of the Discover Servers in the grid.
      • To configure a grid scan, you must select at least two Discover Servers.
      • To initialize a grid scan, at least two of the selected Discover Servers must be available.
  • Target configuration guidelines:
      • To avoid scanning unnecessary files, configure filters based on the expected items to be scanned on the basis of the File Type, Date Modified, and file size attributes.
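
To estimate how much a candidate filter would narrow a target before you configure it, the three attributes can be modeled in a few lines of Python. This is a hypothetical sketch, not how Symantec Data Loss Prevention evaluates filters: the class and function names are invented for illustration, and the file extension is used as a simple stand-in for File Type.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import PurePath


@dataclass(frozen=True)
class ScanFilter:
    """Hypothetical model of a File Type / Date Modified / size filter."""
    extensions: frozenset       # lower-case extensions, e.g. {".docx"}
    modified_after: datetime    # skip files last modified before this
    max_size_bytes: int         # skip files larger than this


def should_scan(name, size_bytes, modified, flt):
    """Return True only if the file passes all three filter attributes."""
    extension = PurePath(name).suffix.lower()
    return (
        extension in flt.extensions
        and modified >= flt.modified_after
        and size_bytes <= flt.max_size_bytes
    )


if __name__ == "__main__":
    # Illustrative filter values, not recommendations from this article.
    flt = ScanFilter(
        extensions=frozenset({".docx", ".xlsx", ".pdf"}),
        modified_after=datetime(2023, 1, 1),
        max_size_bytes=30 * 1024 * 1024,
    )
    print(should_scan("report.DOCX", 2048, datetime(2024, 5, 1), flt))
    print(should_scan("video.mp4", 2048, datetime(2024, 5, 1), flt))
```

Running such a check over a file listing exported from the target gives a rough count of the files the scan would still need to process.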

Summary of configuration recommendations

Parameters | Recommended Configuration (Single Server) | Recommended Configuration (Grid Leader and 10 Discover Servers)
Number of CPU cores | 16 | 16
RAM (GB) | 32 | 32
BoxMonitor.FileReaderMemory (GB) | 16 | 16
FileReader.MaxFileSystemCrawlerMemory (MB) | 2048 | 2048
crawler.threadpoolsize | 30 | 30
MessageChain.NumChains | 16 | 16
MessageChain.CacheSize | 16 | 16
crawler.grid.follower.queuesize | NA | 60
crawler.grid.queuesize.multiplier | NA | 120

For more information, refer to the grid scanning performance guidelines in the Symantec Data Loss Prevention 15.x Administration Guide.