Unable to detect keywords in XML documents

Article ID: 198295

Products

  • Data Loss Prevention Endpoint Suite
  • Data Loss Prevention
  • Data Loss Prevention Network Discover
  • Data Loss Prevention Network Email
  • Data Loss Prevention Network Monitor and Prevent for Email
  • Data Loss Prevention Network Monitor
  • Data Loss Prevention Network Monitor and Prevent for Email and Web
  • Data Loss Prevention Network Monitor and Prevent for Web
  • Data Loss Prevention Network Prevent for Email
  • Data Loss Prevention Network Prevent for Email Virtual Appliance
  • Data Loss Prevention Network Prevent for Web Virtual Appliance
  • Data Loss Prevention Enforce
  • Data Loss Prevention Enterprise Suite
  • Data Loss Prevention Endpoint Prevent
  • Data Loss Prevention Endpoint Discover
  • Data Loss Prevention Discover Suite
  • Data Loss Prevention Core Package
  • Data Loss Prevention Network Protect
  • Data Loss Prevention Network Web

Issue/Introduction

The use case is to detect content, such as keywords, in XML files that contain markup.

Environment

Release : 15.8, 16.0, 16.1

Component : Endpoint Prevent, Network Prevent/Monitor/Discover

Cause

By default, content extraction is bypassed for XML files. To have content extracted from these files, the corresponding setting must be enabled in the Agent Advanced Settings.

Resolution

Steps to enable the setting:

  • Log in to the Enforce Server console with administrator rights.
  • Open Agent Configuration > Advanced Settings.
  • Locate the setting "Detection.MARKUP_AS_TEXT.str".
  • Set this setting to "ON". The default is off.
  • Save the Agent Configuration.
  • Apply the Agent Configuration to the endpoint machines.

Note: This setting is also available in the Detection Server advanced settings. If detection takes place on a detection server (Network Prevent/Monitor/Discover), enable the setting in that server's advanced settings.
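
To check that the change took effect, it can help to have a small, known XML file to test against. The following Python sketch is one possible way to generate such a file; it is not part of the product. The keyword "ProjectFalcon", the element names, and the file name are hypothetical placeholders, so substitute a keyword that your own policy actually matches.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

# Hypothetical keyword; replace it with one that your DLP policy matches.
KEYWORD = "ProjectFalcon"


def build_test_xml(keyword: str) -> bytes:
    """Return a small UTF-8 XML document that contains the keyword twice:
    once in element text and once in an attribute value."""
    root = ET.Element("report")
    item = ET.SubElement(root, "item", {"label": keyword})
    item.text = f"This line mentions {keyword} inside element text."
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def write_test_file(path: str, keyword: str = KEYWORD) -> Path:
    """Write the test document to disk and return its path."""
    target = Path(path)
    target.write_bytes(build_test_xml(keyword))
    return target
```

Calling write_test_file("dlp_markup_test.xml") writes the file to the current directory. It can then be used to exercise the policy before and after the setting is turned on. The file is written as UTF-8 on purpose, to avoid the character set limitation described under Additional Information.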

 

Additional Information

NOTE: With "Markup as text" detection enabled, the DLP detection engine may no longer correctly recognize the character set of an XML file (or a file of another markup type). This can affect files whose character set is non-Latin or not UTF-8, and it leads to incorrect text being extracted from the file. There is currently no workaround for this.
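
The effect of a misrecognized character set can be illustrated with a short Python sketch. This is not how the DLP detection engine works internally; it only shows, under assumed conditions, why a keyword stops matching when the bytes of a file are decoded with the wrong charset. The Cyrillic keyword and the windows-1251 encoding are arbitrary examples.

```python
# Hypothetical keyword in Cyrillic (the Russian word for "secret").
KEYWORD = "секретно"


def make_xml_bytes(keyword: str, encoding: str) -> bytes:
    """Build a tiny XML document and encode it with the given charset."""
    text = (
        f'<?xml version="1.0" encoding="{encoding}"?>'
        f"<doc><note>{keyword}</note></doc>"
    )
    return text.encode(encoding)


def keyword_found(raw: bytes, assumed_encoding: str, keyword: str) -> bool:
    """Decode the raw bytes with an assumed charset, then run a plain
    substring search for the keyword in the decoded text."""
    decoded = raw.decode(assumed_encoding, errors="replace")
    return keyword in decoded


raw = make_xml_bytes(KEYWORD, "windows-1251")
right = keyword_found(raw, "windows-1251", KEYWORD)  # charset recognized
wrong = keyword_found(raw, "latin-1", KEYWORD)       # charset misrecognized
print(right, wrong)  # prints: True False
```

With the correct charset the keyword is found. With the wrong one the same bytes decode to unrelated characters, so the search fails even though the keyword is present in the file.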