After Oracle database deadlocks that required several DB sessions to be killed, the OWP process appears permanently busy, showing 100% utilization at all times (B01, B10 and B60).
A ColdStart was performed when starting the Automation Engine some time ago, while the system was on a version lower than 12.3.4.
Nothing appears in the OWP log that could explain why it's continuously busy.
There are only a few records in the MQOWP table (fewer than 10) and they seem to be processed quickly enough.
The load seems to come from a specific client, in this case Client 150, according to the Chart/Table view in Processes and Utilization.
How can we investigate what is causing this high utilization of the OWP process?
This issue is caused by the ColdStart of the system, which removed the JPEND message from the MQWP table. As a result, the Workflow could not be deactivated, and the OWP looped, continuously trying to process it.
Release : 12.x
Component : AUTOMATION ENGINE
To troubleshoot this kind of issue, do the following:
1. Enable the traces tcpip=2 and database=4 on the impacted WP process via AWI for two to three minutes, then set them back to 0.
2. Analyze the trace log file associated with the OWP, WPsrv_trc_XXX_00.txt, with a text editor such as RS File Viewer (rsview), which is available in the tools/no_supp folder of the AE image.
a. Search for the string RCV, count the matching lines, and determine which message appears most often.
In this case, out of 7567 RCV lines, 7215 looked like the following:
20200901/125229.256 - JPEXEC_R RCV DEACT frm UC4D#WP003 MQWP MsgID: 1092364862 c-acv: 00000000
b. Run a search that also prints the fifth line after each match, as that line contains the associated RunID that the OWP is trying to deactivate.
Click Edit - Select Lines, paste the exact string from above, "RCV DEACT", and set Successor=5 (to also select the fifth line that appears after each DEACT message).
c. Then click Display selection. You should find the duplicated RunIDs (EH_AH_Idnr) that keep reappearing, in this case 816263749 and 816310224.
3. To remove these RunIDs and break the loop, there are 2 options:
This is explained in the following article, for which you should contact Broadcom Technical Support to provide the necessary statements.
Once one of these options is applied, the OWP process will start to consume what is left in the OWP queue and its utilization will eventually drop to 0.
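If RS File Viewer is not at hand, the counting in steps 2a to 2c can be approximated with a short script. The following is only a minimal sketch, not a supported tool: the function name summarize_trace, the regular expressions, and the assumption that every RCV line follows the layout of the sample line above are illustrative. The trace file path is passed on the command line.

```python
import re
import sys
from collections import Counter

# Assumption: RCV lines look like the sample in this article, e.g.
# "20200901/125229.256 - JPEXEC_R RCV DEACT frm UC4D#WP003 MQWP MsgID: ..."
RCV_PATTERN = re.compile(r"-\s+(\S+\s+RCV\s+\S+)")
# Assumption: trace lines start with a "YYYYMMDD/HHMMSS.mmm - " timestamp.
TIMESTAMP_PATTERN = re.compile(r"^\d{8}/\d{6}\.\d{3}\s+-\s+")


def summarize_trace(lines, marker="RCV DEACT", successor=5):
    """Count RCV message types and the lines found `successor` lines after each marker."""
    lines = [line.rstrip("\n") for line in lines]
    rcv_counts = Counter()
    successor_counts = Counter()
    for index, line in enumerate(lines):
        match = RCV_PATTERN.search(line)
        if match:
            # Normalize whitespace so identical messages group together.
            rcv_counts[" ".join(match.group(1).split())] += 1
        if marker in line and index + successor < len(lines):
            # Drop the leading timestamp so repeated lines group together.
            text = TIMESTAMP_PATTERN.sub("", lines[index + successor])
            successor_counts[text.strip()] += 1
    return rcv_counts, successor_counts


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as trace_file:
        rcv, followers = summarize_trace(trace_file)
    print("Most frequent RCV messages:")
    for message, count in rcv.most_common(5):
        print(f"{count:8d}  {message}")
    print("Most frequent lines 5 lines after 'RCV DEACT':")
    for text, count in followers.most_common(5):
        print(f"{count:8d}  {text}")
```

Lines that repeat many times in the second list should contain the RunIDs that the OWP keeps trying to deactivate. Check them against the trace file itself before acting on them.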
Update to a fix version listed below or a newer version if available.
Automation.Engine 12.1.9 - Available
Automation.Engine 12.2.7 - Available
Automation.Engine 12.3.4 - Available
Not all cases of a looping OWP are solved by the upgrade.