About regular expressions in Data Loss Prevention

Article ID: 160373

Products

Data Loss Prevention Enforce

Issue/Introduction

Learn which regular expressions can be used when configuring Symantec Data Loss Prevention (DLP).

Resolution

Although implemented with different libraries, Endpoint regex has been standardized to conform with the purpose, function, and design of regex for DLP servers.

The following constructs are supported:

Note: This is not an exhaustive list of regular expressions; it only demonstrates what DLP will handle.

Regular expression    Description
(?i) (?-i)            turn case-insensitive matching on, off
\b                    word boundary
[a-z]                 lowercase alphabet
[A-Z]                 uppercase alphabet
[%*.#$%@-]            literal symbols to match (must be in square brackets)
(?<=regex)            positive lookbehind
(?=regex)             positive lookahead
(?!regex)             negative lookahead
\D                    non-digit character
[ ]                   literal space in square brackets; use this instead of \s
\d                    digit character
\w                    word character
\W                    non-word character
^                     zero-width anchor: start of line
$                     zero-width anchor: end of line
+                     one or more of the preceding element
-                     range inside square brackets, as in A-Z
\                     escape; matches the next character literally instead of as a pattern operator, as in \.\*\+\?
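
As a rough illustration of how several of these constructs combine, the sketch below uses Python's re module as a stand-in. This is an assumption made for demonstration only: the DLP engine uses its own libraries and may behave differently, and the sample text and variable names are hypothetical.

```python
import re

# Python's re module is only a stand-in here; the DLP engine may differ.
text = "Account ID: AB-1234, ref id: zz-9999, price 3.50"

# (?i) case-insensitive, \b word boundary, [ ] literal space,
# [a-z] letters, [-] bracketed symbol, \d digit, + one or more.
ids = re.findall(r"(?i)\bid:[ ]([a-z]+[-]\d+)", text)

# (?<=...) lookbehind: a number preceded by "price " (not part of the match).
# \. is an escaped dot, so it matches a literal "." only.
price = re.findall(r"(?<=price[ ])\d+\.\d+", text)

# (?=...) positive lookahead: digits that are followed by a literal dot.
whole = re.findall(r"\d+(?=\.)", text)

print(ids)    # ['AB-1234', 'zz-9999']
print(price)  # ['3.50']
print(whole)  # ['3']
```

Any pattern intended for a DLP policy should still be tested in DLP itself, since this sketch does not reproduce the limitations described in this article.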

Limitations

Symbols must be enclosed in square brackets to be matched. The symbols ., *, and | are not supported in data identifier patterns. \w does not match the underscore (_). \s cannot be used to match whitespace; use a literal whitespace character instead.
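
One way to catch these limitations before pasting a pattern into a policy is a small pre-check. The sketch below is a hypothetical helper, not a DLP feature: the function name check_pattern and its warning texts are invented for illustration, and it reflects one reading of the limitations (the symbols ., *, and | are flagged only when they appear unescaped outside square brackets). Its bracket tracking is deliberately simplified.

```python
def check_pattern(pattern: str) -> list[str]:
    """Return warnings for constructs listed under Limitations (hypothetical helper)."""
    warnings = []
    in_class = False  # True while scanning inside [...]
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt == "s":
                warnings.append(f"\\s at offset {i}: use a literal space such as [ ]")
            elif nxt == "w":
                warnings.append(f"\\w at offset {i}: does not match _ in DLP")
            i += 2  # skip the escaped character so \. \* \| are not flagged
            continue
        if ch == "[" and not in_class:
            in_class = True
        elif ch == "]" and in_class:
            in_class = False
        elif ch in ".*|" and not in_class:
            warnings.append(f"'{ch}' at offset {i}: unbracketed symbol")
        i += 1
    return warnings


for message in check_pattern(r"\w+\s\d+.*"):
    print(message)
```

A pattern such as [a-z]+[ ][*.] produces no warnings under this reading, because every symbol sits inside square brackets and the space is written literally.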

 

Note: Symantec Support does not provide support for the creation of custom content filters or regular expressions, but does recommend the following resources.

A regular expression (regex or regexp for short) is a special text string for describing a search pattern. Information about regex is available in the online help from the Edit Rules page (see text below