You have been having trouble scanning a few files, all of them PDFs. You scanned them on premise with the SharePoint connector with OCR enabled. When you run filter.exe against one of the files, it completes, but the output file is blank. You scanned another document to confirm there was nothing wrong with the program itself, and it returned the expected result.
Release : 15.8, 16.0
The OCR extraction method used by DLP can extract image content from PDFs created with AcroForms.
PDF content created by other methods (e.g., "XFA") does not allow the DLP OCR engine to extract a readable image.
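To help triage a file before running it through OCR, the following is a minimal sketch in Python that guesses whether a PDF carries an XFA form. It is not part of DLP or filter.exe, and the function name classify_pdf_form is hypothetical. It assumes the /XFA and /AcroForm keys are visible in the raw bytes of the file; PDFs that store these dictionaries in compressed object streams will hide them, so a "no-form" result is not conclusive.

```python
import re

# Assumption: XFA forms reference their XML payload through an /XFA key
# inside the /AcroForm dictionary, and that key is stored uncompressed.
XFA_KEY = re.compile(rb"/XFA\b")
ACROFORM_KEY = re.compile(rb"/AcroForm\b")


def classify_pdf_form(data: bytes) -> str:
    """Heuristically classify the form technology used by a PDF.

    Returns one of: "not-a-pdf", "xfa", "acroform", "no-form".
    """
    # The PDF header is expected near the start of the file.
    if b"%PDF-" not in data[:1024]:
        return "not-a-pdf"
    # Check /XFA first: XFA forms also contain an /AcroForm dictionary.
    if XFA_KEY.search(data):
        return "xfa"
    if ACROFORM_KEY.search(data):
        return "acroform"
    return "no-form"
```

To use it, read the file in binary mode and pass the bytes in, for example classify_pdf_form(open("sample.pdf", "rb").read()), where sample.pdf is a placeholder file name. An "xfa" result suggests the blank output is related to the form type rather than to image quality.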
If the OCR engine finds no images at all, one possible cause is that the file does not meet the image quality and size requirements (see Image Quality and Resolution for OCR results (broadcom.com)).
However, in some cases the type of PDF involved will also prevent image extraction, e.g., "XFA" (XML Forms Architecture).
Thus, a form created with XFA might include the following document properties (viewed via the Acrobat Reader "File > Properties" menu):
There is a Feature Request for this issue, PM-2963: "Support content extraction for XFA-based PDF forms".
This should be rectified in DLP 16.1, where content extraction for XFA is expected to be supported.