
OCR Unknown Document Format


Article ID: 242863


Updated On:


Data Loss Prevention Network Monitor and Prevent for Email and Web



Many incidents show a problem with file type detection.

Many image files (.png, .jpg, .gif) are identified as Unknown Document Format.


On investigation, these generally turn out to be very small images of the kind often found in email signatures.


These files are too small for the OCR server to perform reasonable OCR on.


All current versions of DLP.


- This issue is known and will likely be fixed in a future release. In the interim, it may be possible to avoid the issue by increasing the minimum file size setting found in the file on the server.

ImagePreclassifier.OCR_MINIMUM_IMAGE_DIM, which is normally set to 200, can be increased to 400 or higher in order to filter out small files.
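
Before changing the value, it can help to estimate which images from existing incidents would fall under a candidate threshold. The following Python sketch is not part of DLP. It assumes the setting is a pixel dimension and that an image is skipped when its smaller side is below the threshold; the exact comparison the product performs is not documented here, and the function names `png_dimensions` and `below_threshold` are hypothetical. It handles PNG files only, reading the width and height from the file header.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def below_threshold(data, min_dim=400):
    """True if the smaller side is under min_dim.

    ASSUMPTION: this mirrors how OCR_MINIMUM_IMAGE_DIM might be applied;
    the product's actual rule may differ.
    """
    width, height = png_dimensions(data)
    return min(width, height) < min_dim
```

For example, `below_threshold(open("signature-logo.png", "rb").read(), 400)` reports whether a (hypothetical) signature image would fall under a threshold of 400 given the assumption above.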