Some emails with CAD (.dwg) attachments failed to complete processing. The corresponding email message IDs did not appear in the logs, even though the upstream MTA showed the messages as sent.
The logs showed no evidence of File Reader crashing on the .dwg file itself. For subsequent files, however, they showed:
Class: com.vontu.messaging.chain.ContentExtractionChainObserver
Method: handleFailure
Level: WARNING
Message: Recording Message processing failure due to ContentExtractionTimeoutException
Class: com.vontu.messaging.chain.TimeoutHandler
Method: handleTimeout
Level: WARNING
Message: (DETECTION.12) Message chain #7 has exceeded the component timeout in Message Text Extractor. If it hasn't stopped processing in 30 more seconds this process will restart. Working on item RequestProcessor.2, total data length: 0
Class: com.vontu.messaging.chain.MessageChainPool
Method: stopChains
Level: WARNING
Message: Message chain #8 didn't stop processing the message.
To troubleshoot this, we enabled the following setting in FileReaderLogging.properties:
ApplicationMonitoringLogger.level = FINEST
We also increased log retention.
With this logging in place, we could see the CAD files come in, and the message ID was visible in the logs.
We also saw:
Method: processBinaryComponent
Level: WARNING
Message: Text extraction for <filename>.dwg failed, detecting on metadata only
Method: processMessageComponents
Level: WARNING
Message: Content extraction for file [<FileName>.dwg] from email [Message-ID: <MessageID@Company Domain> Date: Applicable date and time +0000] failed.
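Once these warnings appear, the affected file names can be pulled out of a log export with a short script. The following is a minimal sketch, assuming each log message sits on one line in the form shown above. The function name failed_extractions and the sample file name drawing1.dwg are hypothetical, and the real File Reader log layout may differ.

```python
import re
from typing import Iterable, List

# Pattern modeled on the sample warning above; the real log layout may differ.
EXTRACTION_FAILED = re.compile(
    r"Text extraction for (?P<name>.+?) failed, detecting on metadata only"
)


def failed_extractions(lines: Iterable[str]) -> List[str]:
    """Return the file names named in 'Text extraction ... failed' messages."""
    names = []
    for line in lines:
        match = EXTRACTION_FAILED.search(line)
        if match:
            names.append(match.group("name"))
    return names


# Hypothetical sample lines in the same shape as the log excerpt above.
sample = [
    "Method: processBinaryComponent",
    "Level: WARNING",
    "Message: Text extraction for drawing1.dwg failed, detecting on metadata only",
]
print(failed_extractions(sample))
```

To scan a real log, pass an open file object instead of the sample list.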
Release : 15.8 MP3
In this case File Reader is not crashing per se, but it is being locked. Once X number of threads become locked, it cannot process subsequent files, and File Reader restarts.
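The effect of locked threads on later files can be illustrated with a toy model. This is only a sketch of the general mechanism, not File Reader's actual implementation. The pool size, the function name demo_pool_starvation, and the use of an event to simulate a locked thread are all assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def demo_pool_starvation(workers: int = 2, wait_seconds: float = 0.2) -> bool:
    """Return True if a healthy task is starved while every worker is stuck."""
    release = threading.Event()  # stands in for whatever would free a locked thread
    pool = ThreadPoolExecutor(max_workers=workers)
    try:
        # Occupy every worker with a task that blocks, like a hung extraction.
        for _ in range(workers):
            pool.submit(release.wait)
        # A perfectly good file now queues behind the stuck tasks.
        healthy = pool.submit(lambda: "extracted")
        try:
            healthy.result(timeout=wait_seconds)
            return False
        except FutureTimeout:
            return True
    finally:
        release.set()  # unblock the workers so the pool can shut down cleanly
        pool.shutdown(wait=True)


print(demo_pool_starvation())
```

With every worker blocked, the healthy task never starts before its timeout. This mirrors how later files fail even though they are not themselves the problem.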
The thread was locking before the event could be logged. Application Monitor logging allowed us to log the message before content extraction started.
Once we had identified which CAD file was causing the thread to lock, we tested it with filter.exe and saw that it would hang and never clear.
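A hang like this can be confirmed without watching the process by hand, by running the command under a timeout. The sketch below is an assumption-laden illustration: the helper name hangs is hypothetical, and the stand-in command is a Python sleep rather than filter.exe, because the exact filter.exe arguments are not given here.

```python
import subprocess
import sys
from typing import List


def hangs(command: List[str], timeout_seconds: float) -> bool:
    """Return True if the command is still running when the timeout expires."""
    try:
        subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
            check=False,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return True
    return False


# Stand-in for a hanging extraction: a child process that sleeps past the timeout.
print(hangs([sys.executable, "-c", "import time; time.sleep(5)"], timeout_seconds=0.5))
```

To test a suspect file, replace the stand-in command with the filter.exe invocation used in your environment and choose a timeout well above the normal extraction time.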
We used Process Explorer to see what was happening and found many threads held open and frozen by filter.exe. We also identified a subprocess in the filter.exe stack called OneAgent.
This turned out to be OneAgent, a monitoring tool from Dynatrace. Once we cleared all the threads and removed that software, filter.exe was able to process the file correctly.
Remove Dynatrace OneAgent from the detection server.