DLP Best practices for using Data and Document profiles

Article ID: 174457

Products

Data Loss Prevention, Data Loss Prevention Enforce

Issue/Introduction

Symantec Data Loss Prevention (DLP)

Exact Data Matching (EDM) is the most accurate form of detection. EDMs are also the most complex to set up and maintain. To ensure that your EDM policies are as accurate as possible, consider the recommendations in this document when you implement your EDM profiles and policies.

Indexed Document Matching (IDM) is designed to protect document content and images. IDM relies on an index of fingerprinted documents to perform partial and derivative text-based content matching. You can also use IDM to match indexed documents exactly, based on their binary stamp; this covers not only text-based documents but also graphics and media files.

Due to the broad range of matching supported by IDM, consider the best practices in this document to implement the IDM policies that accurately match the data you want to protect.

Resolution

All Data Profile Best Practices:

  • Re-index document profiles after a DLP upgrade.
  • Review the estimated profile sizes on the document profile pages and ensure that enough native memory is available on the relevant detection server(s).


EDM/EMDI Best Practices:

  • Ensure that the data source file contains at least one column of unique data.
  • Eliminate duplicate rows and blank columns before indexing.
  • To reduce false positives, avoid indexing single characters, quotes, abbreviations, numeric fields with fewer than five digits, and dates.
  • Understand multi-token indexing and clean up the data as necessary.
  • Map data source columns to system fields to leverage validation during indexing.
  • Leverage EDM policy templates whenever possible.
  • Include the column headers as the first row of the original data source file.
  • Check the system alerts to tune Exact Data Profiles.
  • Use stop words to exclude common words from matching.
  • Automate profile updates with scheduled indexing.
  • Match on two or three columns in an EDM rule.
  • Leverage exceptions to avoid false positives.
  • Use a where clause to detect the records that meet a specific criterion.
  • Use the minimum matches field to fine-tune EDM rules.
  • Consider using Data Identifiers in combination with EDM rules.
  • Include an email address field in the Exact Data Profile for profiled DGM.
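
Several of the data source recommendations above can be checked before the file is ever indexed. The following Python sketch is one possible pre-indexing cleanup, not part of the DLP product: it drops blank columns and duplicate rows, reports which columns contain only unique values, and flags single characters and short numeric fields. The function names, the pipe delimiter default, and the sample data are assumptions made for illustration.

```python
import csv
import io


def clean_data_source(text, delimiter="|"):
    """Return (header, rows, unique_columns) for a delimited data source.

    Assumes the first row holds the column headers, as recommended above.
    The delimiter default is an illustrative assumption.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    table = [[cell.strip() for cell in row] for row in reader if row]
    header, body = table[0], table[1:]

    # Drop columns whose data cells are all blank.
    keep = [i for i in range(len(header))
            if any(i < len(row) and row[i] for row in body)]
    header = [header[i] for i in keep]
    body = [[row[i] if i < len(row) else "" for i in keep] for row in body]

    # Drop duplicate rows, keeping the first occurrence and the original order.
    seen = set()
    rows = []
    for row in body:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            rows.append(row)

    # A column counts as unique if every value is non-blank and distinct.
    unique_columns = []
    for i, name in enumerate(header):
        values = [row[i] for row in rows]
        if all(values) and len(set(values)) == len(values):
            unique_columns.append(name)
    return header, rows, unique_columns


def is_weak_value(value):
    """Flag single characters and numeric fields with fewer than five digits."""
    value = value.strip()
    if len(value) == 1:
        return True
    return value.isdigit() and len(value) < 5


sample = (
    "first|last|ssn|notes\n"
    "Ada|Lovelace|123456789|\n"
    "Ada|Lovelace|123456789|\n"
    "Ada|Byron|987654321|\n"
)
header, rows, unique_columns = clean_data_source(sample)
```

If `unique_columns` comes back empty, the file does not meet the "at least one column of unique data" recommendation and should be revised before indexing. Dates, quotes, and abbreviations are not detected by this sketch and would need their own checks.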

IDM Policy Best Practices:

  • Do not compress any documents that contain content you want to fingerprint.
  • Prefer partial matching over exact matching on the DLP Agent.
  • Only index the text-based documents that have content.
  • Be aware of the limitations of exact matching.
  • Use whitelisting to exclude partial file contents from matching and reduce false positives.
  • Filter non-critical documents from indexing to reduce false positives.
  • Use remote indexing for large document sets.
  • When a document set exceeds 1,000,000 files, split it into separate collections, with all files in their unencapsulated, uncompressed state.
  • Set up separate IDM profiles if you index more than 1,000,000 documents. You can raise the maximum index size per IDM profile to index more than 1,000,000 documents; however, it is usually less resource-intensive to set up separate profiles.
  • Use scheduled indexing to automate profile updates.
  • Use multiple IDM rules in parallel to establish and tune match thresholds.
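
The guidance on document set size and file state can be applied when planning what to index. The Python sketch below is a hypothetical planning helper, not a DLP feature: given a list of `(path, size_in_bytes)` pairs, it sets aside empty files and files with a compressed or archive extension, then splits the remainder into batches of at most 1,000,000 files, one batch per IDM profile. The function name and the extension list are assumptions for illustration.

```python
from pathlib import PurePath

# Illustrative list only; extend it to match the formats in your environment.
COMPRESSED_SUFFIXES = {".zip", ".gz", ".bz2", ".7z", ".rar", ".tar"}


def plan_idm_collections(files, max_per_profile=1_000_000):
    """Split (path, size_in_bytes) pairs into per-profile batches.

    Returns (batches, skipped). Skipped paths are empty files and files
    whose extension suggests they are compressed or encapsulated.
    """
    eligible, skipped = [], []
    for path, size in files:
        suffix = PurePath(path).suffix.lower()
        if size == 0 or suffix in COMPRESSED_SUFFIXES:
            skipped.append(path)
        else:
            eligible.append(path)

    # Slice the eligible files into consecutive batches of the chosen size.
    batches = [eligible[i:i + max_per_profile]
               for i in range(0, len(eligible), max_per_profile)]
    return batches, skipped


files = [("a.docx", 10), ("b.zip", 20), ("c.txt", 0), ("d.pdf", 5), ("e.txt", 7)]
batches, skipped = plan_idm_collections(files, max_per_profile=2)
```

Files in `skipped` are candidates for decompression or removal before indexing. The small `max_per_profile` value in the usage lines is only there to make the batching visible.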

Data and Document Profiles in the Cloud Best Practices:

Best practices for these profiles in the cloud are the same as for on-premises detection servers. With Cloud Detectors, however, all two-tier indexes must be free of errors, at least for the first profile that is uploaded to a new Detector. This also applies to Active Directory indexes, which are stored as an EDM profile.