Special Characters in an EDM

Article ID: 159949

Products

Data Loss Prevention Network Monitor

Issue/Introduction

Are special characters treated differently in a DCM than in an EDM? How would an EDM match [email protected]#SSN*&^?

Resolution

All non-Asian content is tokenized by splitting on non-alphanumeric characters: each non-alphanumeric character acts as a separator between tokens. This is the same for an EDM and a DCM. For Asian content in an EDM, tokenization is controlled by the indexed content: a sequence of Asian characters is considered a token only if the same sequence of characters was encountered in a database cell during indexing.
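The non-Asian rule above can be illustrated with a minimal sketch. This is not the product's tokenizer; it assumes that "alphanumeric" means ASCII letters and digits only, and the function name `tokenize` and the sample inputs are hypothetical.

```python
import re

# Assumption: "alphanumeric" is limited to ASCII letters and digits.
# Every other character is treated as a token separator.
_TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def tokenize(text):
    """Return the runs of alphanumeric characters found in text."""
    return _TOKEN_RE.findall(text)


# Hypothetical inputs, chosen only to show the splitting behavior.
print(tokenize("%$#SSN*&^"))       # ['SSN']
print(tokenize("foo-bar_baz 42"))  # ['foo', 'bar', 'baz', '42']
```

Under this assumption, special characters that surround a word are discarded and only the alphanumeric run between them remains as a token to match on.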

If a cell contains a mix of alphanumeric and non-alphanumeric characters (e.g. “AAA BBB”), or a mix of Asian and non-Asian characters, then the contents of that cell will not be matched in a message.

This is a limitation of EDM that results from how it processes double-byte characters.

For an EDM, [email protected]#SSN*&^ will yield SSN as a token to match on.