UTF8 characters not displaying correctly when written back to DLP Enforce custom attributes from a script lookup plugin

Article ID: 242006

Products

Data Loss Prevention Core Package

Issue/Introduction

When a script lookup plugin is used on a Windows DLP Enforce server, UTF8 characters do not display correctly when written back to DLP custom attributes. Instead of the correctly encoded UTF8 characters, the attribute values show garbled characters. This applies to any UTF8 characters, including Chinese, Japanese, Korean, and Arabic.

The screenshot below from an incident snapshot page in DLP Enforce shows an example of a Python script lookup plugin taking the value from the testattachment custom attribute field and writing it back to the Script-attribute field with the UTF8 characters now garbled.

Cause

When Enforce is installed on Windows, DLP uses the Windows command line to call custom scripts. Because of this, the output returned from the script is limited to the default Windows OS encoding for the command prompt, which is not UTF8. Any UTF8 characters that the script sends back to DLP are treated as command prompt encoded text and therefore appear garbled.
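The effect described above can be reproduced outside DLP. The following is a minimal Python sketch, not the mechanism DLP itself uses: it encodes a string as UTF8 and then decodes the same bytes with a non-UTF8 code page. The code page `cp1252` and the helper name `garble` are assumptions chosen for illustration; the actual command prompt code page depends on the Windows locale of the Enforce server.

```python
# Illustration only: cp1252 stands in for the Windows command prompt
# code page. The real code page depends on the server's locale.
ASSUMED_CONSOLE_ENCODING = "cp1252"


def garble(text: str, console_encoding: str = ASSUMED_CONSOLE_ENCODING) -> str:
    """Encode text as UTF-8, then decode those bytes as the console code page."""
    utf8_bytes = text.encode("utf-8")
    # Bytes that have no mapping in the code page become U+FFFD.
    return utf8_bytes.decode(console_encoding, errors="replace")


original = "日本語"          # three characters, nine bytes in UTF-8
garbled = garble(original)

print("original:", original)
print("garbled: ", ascii(garbled))  # ascii() keeps the output printable anywhere
```

Each multi-byte UTF8 character is split into several single-byte characters of the code page, which is why the garbled value is longer than the original. Plain ASCII text passes through unchanged because its bytes are the same in both encodings.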

Environment

Windows Enforce Server

Resolution

Run the Enforce Server on Linux. The default character encoding on Linux is UTF8, so the characters returned by the script are handled natively as UTF8 and display correctly.
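To see which encodings a Python script is given on a particular server, a small diagnostic such as the sketch below may help. It only reports the encodings of the Python process it runs in; whether these match the encoding in effect when Enforce invokes the lookup plugin is an assumption that should be verified in your environment. The function names `describe_encodings` and `is_utf8` are hypothetical helpers, not part of DLP.

```python
import locale
import sys


def describe_encodings() -> dict:
    """Collect the encodings this Python process reports."""
    return {
        # Encoding used for text written to standard output (may be None).
        "stdout": getattr(sys.stdout, "encoding", None),
        # Locale-preferred encoding for text data.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used for file names.
        "filesystem": sys.getfilesystemencoding(),
    }


def is_utf8(name) -> bool:
    """Return True if an encoding name refers to UTF-8 (any common spelling)."""
    if not name:
        return False
    return name.lower().replace("-", "").replace("_", "") == "utf8"


for label, encoding in describe_encodings().items():
    status = "UTF8" if is_utf8(encoding) else "not UTF8"
    print(f"{label}: {encoding} ({status})")
```

Running the script from a Windows command prompt and from a Linux shell gives a quick comparison of the two environments.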
