Specific PDF files take longer than usual to scan

Article ID: 277171

Products

Protection Engine for Cloud Services

Issue/Introduction

An e-mail system is printing/creating PDF documents that are then scanned by our Protection Engine. Some of these PDF files take over 8 seconds to scan, even though the file size is only about 130 KB. A file of this size usually scans in less than a second.

The Java SDK is used to stream the files via ICAP so that they are scanned during the process, but scan performance for these files is slow.

Sending the file with the command-line scanner "ssecls.jar" gave the same results.

An example file that was shared and tested on our lab servers showed the same slow scan performance.
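To find which files in a batch are affected, the scan time of each file can be measured and compared against a threshold. The following is a minimal sketch only, not part of the product: the function names are hypothetical, and the scan command is passed in as a placeholder because the exact "ssecls.jar" invocation depends on your installation (see the product documentation for its options).

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


def find_slow_files(cmd_prefix, paths, threshold_seconds=1.0):
    """Return [(path, seconds)] for files whose scan exceeded the threshold.

    cmd_prefix is the scan command without the file name (an assumption:
    the scanner is expected to accept the file path as its last argument).
    """
    slow = []
    for path in paths:
        elapsed, _exit_code = time_command(cmd_prefix + [path])
        if elapsed > threshold_seconds:
            slow.append((path, elapsed))
    return slow
```

With a threshold of one second, a file that shows the 8-9 second delay described in this article would be reported, while normally scanning files of similar size would not. Note that the measured time includes process start-up, so the threshold may need to be raised when the scanner is launched as a new Java process for every file.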

 

Environment

SPE 8.2.2

SPE 9.1.0

Cause

The affected PDF file contains corrupted metadata.

When the PDF file is opened with Adobe Acrobat Pro, it gives the following error: “There was a problem reading this document (110)”.

The Adobe PDF error shown (110) is specific to corrupted metadata. Creating a new PDF, copying the contents of the example PDF into it, and saving it produces a file that scans in less than a second.


 

 

Resolution

Repair the corrupted metadata in the PDF file.
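This article does not state which part of the metadata was corrupted. As one possible pre-check, and assuming the problem lies in an uncompressed XMP metadata packet embedded in the file, the sketch below looks for the XMP packet and tests whether it parses as well-formed XML. The function name is hypothetical, and the check is a heuristic: a result of "ok" does not prove that the file is free of other metadata damage (for example in the document information dictionary or in a compressed stream).

```python
import re
import xml.etree.ElementTree as ET

# Matches an XMP packet body: <x:xmpmeta ...> ... </x:xmpmeta>
XMP_PATTERN = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def check_xmp_metadata(pdf_bytes):
    """Classify a PDF's XMP metadata as 'ok', 'missing', or 'malformed'."""
    if b"<x:xmpmeta" not in pdf_bytes:
        return "missing"  # no uncompressed XMP packet found
    packets = XMP_PATTERN.findall(pdf_bytes)
    if not packets:
        return "malformed"  # opening tag present, closing tag absent
    for packet in packets:
        try:
            ET.fromstring(packet)  # raises ParseError if not well-formed XML
        except ET.ParseError:
            return "malformed"
    return "ok"
```

A file can be checked with `check_xmp_metadata(open(path, "rb").read())`. Files reported as "malformed" are candidates for being recreated as described in the Cause section.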

Additional Information

We initially suspected that something in the PDF was causing the slow scan performance, but we were only able to determine what it was by opening the file with Adobe Acrobat Pro. Once the file is opened with Acrobat Pro, attempting to edit the PDF causes Acrobat Pro to give the error (110). If the file is opened in read-only mode, or if you are unable to modify the file, no error is shown.

Eventually SPE times out and scans the PDF file, but an 8-9 second delay is seen first.