Password protected PDF not flagged as encrypted with DLP Network Prevent

Article ID: 400249

Updated On:

Products

Data Loss Prevention

Issue/Introduction

The DLP Network Prevent for Email Detection Server does not mark some PDFs as protected. This affects PDFs that are not fully encrypted or password protected, where only portions of the content are protected. The same file is correctly marked as protected when detection is run locally by the DLP Endpoint Agent.

Running the PDF through filter.exe reports the file as PasswordProtected.


Enabling CEH logging on the Email Detection Server shows the PDF is marked as not encrypted (isEncrypted: 0) by the ImageExtractorPlugin:

05/28/25 16:54:13 | WARN  | cehost | FileTypeIdentifierRequestExecutor [1276] | [2596] | Doing file type identification with: ImageExtractorPlugin | C:\Git\dlp-detection-core-native\ContentExtractionAPI\CEHost\FileTypeIdentifierRequestExecutor.cpp (200)
05/28/25 16:54:13 | WARN  | cehost | FileTypeIdentifierRequestExecutor [1276] | [2596] | ImageExtractorPlugin: identified the stream as: pdf, isEncrypted: 0 | C:\Git\dlp-detection-core-native\ContentExtractionAPI\CEHost\FileTypeIdentifierRequestExecutor.cpp (223)
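
To check many CEH log lines for this condition, the identification result can be pulled out with a regular expression. The following is a minimal sketch that assumes the line format shown in the excerpt above. The function name `parse_identification` and the pattern are illustrative and not part of the product.

```python
import re

# Matches the "identified the stream as" line format seen in the CEH log excerpt.
# Assumption: other plugins log their result in the same shape.
IDENTIFIED = re.compile(
    r"(?P<plugin>\w+): identified the stream as: (?P<type>\w+), "
    r"isEncrypted: (?P<enc>\d+)"
)


def parse_identification(line):
    """Return (plugin, file_type, is_encrypted), or None if the line does not match."""
    match = IDENTIFIED.search(line)
    if match is None:
        return None
    return match.group("plugin"), match.group("type"), match.group("enc") != "0"


# The second log line from the excerpt, copied verbatim.
sample = (
    r"05/28/25 16:54:13 | WARN  | cehost | FileTypeIdentifierRequestExecutor [1276] "
    r"| [2596] | ImageExtractorPlugin: identified the stream as: pdf, isEncrypted: 0 "
    r"| C:\Git\dlp-detection-core-native\ContentExtractionAPI\CEHost"
    r"\FileTypeIdentifierRequestExecutor.cpp (223)"
)

print(parse_identification(sample))
```

A result whose last element is `False` for a PDF that filter.exe reports as PasswordProtected matches the behavior described in this article.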


FileReader logs show an image was extracted from the PDF during detection:

May 28, 2025 4:54:14 PM com.symantec.dlp.imagepreclassifier.NativeImagePreclassificationProvider shouldPerformOcr
INFO: [8824] Image id image_extractor_plugin_embedded_image with size #### from message id ##### classified as colorType: RGB_IMAGE_DARK C:\VontuDev\workDir\ImagePreclassifier\ImagePreclassifier\ImagePropertyProviderImpl.cpp 230

Environment

Symantec Data Loss Prevention 16.x
OCR enabled on the Detection Server

Cause

When OCR is enabled on the Detection Server, ImageExtractorPlugin runs first when evaluating file types. If it detects and extracts an image from the non-encrypted portions of the PDF, it flags the entire file as not encrypted. If no images are detected, the file runs through normal detection and is flagged as protected.

The PDF is correctly flagged as a protected/encrypted file with Endpoint Agent detection because OCR does not run on endpoints.
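
The decision flow described above can be modeled as a short sketch. This is only an illustration of the behavior stated in this article, not the product's implementation. The function `is_flagged_encrypted` and its parameters are hypothetical names.

```python
def is_flagged_encrypted(ocr_enabled, image_extracted, has_protected_content):
    """Model of the encryption flag outcome described in the Cause section.

    ocr_enabled: OCR is active where detection runs (never on endpoints).
    image_extracted: an image was extracted from an unprotected portion of the PDF.
    has_protected_content: the PDF contains password-protected or encrypted portions.
    """
    # With OCR enabled, the image extraction step runs first. A successful
    # extraction marks the whole file as not encrypted.
    if ocr_enabled and image_extracted:
        return False
    # Otherwise normal detection decides, which flags protected content.
    return has_protected_content


# Partially protected PDF with an extractable image on the Detection Server (OCR on).
print(is_flagged_encrypted(True, True, True))
# Same file on an endpoint, where OCR does not run.
print(is_flagged_encrypted(False, True, True))
```

The two calls differ only in whether OCR runs, which is the difference between the Detection Server and Endpoint Agent results reported in this article.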

Resolution

This behavior is working as designed.