For DLP version 12.0 and below: EDM values that contain spaces are not detected. When a data source used for an EDM contains values with spaces, false negatives can result.
Note: This applies to version 12.0 and below. Version 12.5 introduces multi-token EDMs.
When processing a document or message, text is broken into tokens. In most cases, one word becomes a single token.
When creating an EDM, if the data source has a value such as "United States", the value is indexed as one multi-word token, and the processor looks to match the whole string "United States". On the detection side, if a message contains "United States" in non-tabular content, the text is broken into two tokens, "United" and "States". The EDM token "United States" matches neither "United" nor "States", so nothing is matched and a false negative can result.
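The mismatch described above can be illustrated with a small Python sketch. This is a simplified model only, not the DLP engine's actual logic: the function names (index_cell, tokenize_message, find_matches) are hypothetical, and it assumes the index stores each data source cell whole while message text is split on whitespace.

```python
# Simplified, hypothetical model of the token mismatch.
# It is NOT the DLP engine's real indexing or detection code.

PUNCTUATION = ".,;:!?\"'"


def index_cell(value):
    """Index a data source cell as ONE token, spaces included."""
    return value.strip().lower()


def tokenize_message(text):
    """Split free-form (non-tabular) text into one token per word."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def find_matches(text, index):
    """Return the message tokens that exactly equal an indexed token."""
    return [t for t in tokenize_message(text) if t in index]


# The data source contains one multi-word value and one single-word value.
edm_index = {index_cell("United States"), index_cell("Canada")}

# "united" and "states" are separate tokens, so neither equals "united states".
print(find_matches("Ship the order to the United States office.", edm_index))

# A single-word value is tokenized the same way on both sides, so it matches.
print(find_matches("Ship the order to Canada.", edm_index))
```

In this model the first message yields no matches even though it contains the indexed value, which is the false negative described above; the single-word value in the second message matches as expected.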
To improve EDM accuracy, when creating an EDM: