Can Vontu detect on incomplete rows in an EDM?

Article ID: 160789

Products

- Data Loss Prevention Endpoint Prevent
- Data Loss Prevention Network Monitor
- Data Loss Prevention Network Prevent for Email
- Data Loss Prevention Enforce
- Data Loss Prevention Network Discover
- Data Loss Prevention Network Prevent for Web
- Data Loss Prevention Network Protect
- Data Loss Prevention for Tablets
- Data Loss Prevention Endpoint Discover

Issue/Introduction

Can Symantec DLP detect against an Exact Data Match (EDM) profile built from incomplete data, where some of the fields are not filled in? How can one be sure that the data will be detected?

Resolution

By default, Symantec DLP still processes rows with incomplete or incorrect data.

The setting that controls this is in the ../config/Indexer.properties file:

# Whether to ignore bad rows from the data profile while creating an SDP.
# Rows are considered bad if they have more columns than expected or a column data type is incorrect.
# Set to true if bad rows should be ignored.
drop_bad_rows=false
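
To confirm which value is in effect on a given server, the file can be inspected by hand or with a short script. The following is a minimal Python sketch, not a Symantec tool: the helper name `read_property` is hypothetical, and it assumes the file uses simple `key=value` lines with `#` or `!` comments (it does not handle line continuations or `:` separators).

```python
def read_property(text, key, default=None):
    """Return the value of `key` from properties-style text, or `default`."""
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return default


# Example text copied from the excerpt above.
sample = """
# Set to true if bad rows should be ignored.
drop_bad_rows=false
"""
drop_bad_rows = read_property(sample, "drop_bad_rows", "false").lower() == "true"
```

In practice the text would come from reading the Indexer.properties file on the server rather than from an inline string.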

If rows with errors make up a certain percentage of the data in the profile (five percent, by default), Symantec DLP stops indexing and displays an error to warn you that your data may be disorganized or corrupt.
NOTE: Symantec DLP checks for errors only if the data source has at least a thousand rows.

Missing values (also reported as "row too short" errors) count neither as "bad rows" nor toward the 5% threshold. Only two kinds of errors count:

- "row too long", i.e. a row has more fields than expected
- "invalid field format", e.g. an invalid SSN
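
The counting rules above can be illustrated with a short pre-check run against a data source before it is indexed. This is a hedged Python sketch, not the product's indexer: the function names, the simplified SSN pattern, and the choice to treat empty fields as missing values are all assumptions made for illustration. It also assumes indexing stops only when the error percentage is strictly greater than the threshold, which is consistent with a 100% threshold disabling the check.

```python
import re

# Hypothetical validator: a simplified SSN pattern, not the product's own check.
SSN_PATTERN = re.compile(r"^\d{3}-?\d{2}-?\d{4}$")


def is_ssn(value):
    return bool(SSN_PATTERN.match(value))


def classify_row(fields, expected_columns, validators):
    """Classify one row, given as a list of field strings."""
    if len(fields) > expected_columns:
        return "row too long"
    for index, value in enumerate(fields):
        check = validators.get(index)
        # Assumption: an empty value is a missing value, not an invalid one.
        if check is not None and value and not check(value):
            return "invalid field format"
    if len(fields) < expected_columns or any(not v for v in fields):
        return "row too short"
    return "ok"


def error_summary(rows, expected_columns, validators,
                  threshold_percent=5.0, min_rows=1000):
    """Count row categories and estimate whether indexing would stop."""
    counts = {"ok": 0, "row too short": 0,
              "row too long": 0, "invalid field format": 0}
    for fields in rows:
        counts[classify_row(fields, expected_columns, validators)] += 1
    total = sum(counts.values())
    # Only these two kinds count; "row too short" is excluded.
    counted = counts["row too long"] + counts["invalid field format"]
    error_percent = 100.0 * counted / total if total else 0.0
    # Errors are checked only for sources with at least a thousand rows.
    would_stop = total >= min_rows and error_percent > threshold_percent
    return counts, error_percent, would_stop


rows = [
    ["123-45-6789", "Ann"],          # complete row
    ["123-45-6789"],                 # missing value
    ["not-an-ssn", "Bob"],           # invalid field format
    ["123-45-6789", "Cy", "extra"],  # too many fields
]
counts, error_percent, would_stop = error_summary(rows, 2, {0: is_ssn})
```

With these four sample rows, two of the four count as errors, but `would_stop` is false because the source is far below a thousand rows.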

The following value can be changed in the Enforce UI when creating the Exact Data Profile:

Error Threshold (max % of rows with errors before indexing stops)

The default is 5%. Changing this value to 100% removes the check for bad rows, so indexing no longer stops because of them. There are no performance issues involved with making this change.