Applications Manager processes may fail to start due with the error "AWAPI - Error: java awapi timed out". Review of RMI debug logs will show that RMI processing stops after the below aw_inc_update event. No other error is seen.
main: .DBAccess: getFunctionData() sID-507150 aw_web_api.aw_inc_update
main: .A: initStatement()
main: aw_web_api.aw_inc_update 0 maxSequence: IN:VARCHAR2:java.lang.String:0
Other behavior may include:
RMI may start processing after 5 or so minutes in which a 5 or so minute gap will be seen in the log
RMI may throw a Java Heap related error after 5 or more minutes
The awapi timeout error can be seen for caused for various reasons. It is recommended contact Broadcom Support to confirm issue prior to running delete/update statements.
Release: 9.3.x, 9.4.x, 9.5x
Component: Applications Manager
This issue has been reported when corrupt data is present in one or more of the following tables tables; so_job_queue, aw_job_queue_activity table, so_job_history.
CAUTION: All Application Manager Processes should be stopped prior to deleting any data from any of the AM tables. Prior to executing the steps below please log a support case to confirm issue.
Log into the machine where AM is installed as the AM OS user and execute the sosite file to set up the environment and then issue a "stopso all" command. Verify that all the AM processes have stopped prior to preceding. Once all the AM processes are no longer running you can delete the data as outlined below.
Although not required, it is advised to take a backup of the so_job_queue prior to deletion. To copy all the data currently in the so_job_queue table and store it in a temporary table use this SQL statement:
SQL> create table bu_so_job_queue as select * from so_job_queue;
Then you can delete all the rows in the so_job_queue table:
SQL> delete from so_job_queue;
SQL> commit;
Once the jobs have been cleared from the so_job_queue table, you should also remove all the data from aw_job_queue_activity table so there is no longer a reference to any of the jobs that you just deleted from the so_job_queue table.
SQL> truncate table aw_job_queue_activity;
SQL> commit;
As a manual delete of the so_job_queue was performed that status must be updated in the so_job_history table using sql below.
SQL> update so_job_history set so_status=32, so_status_name='FINISHED' where so_status_name in ('INITIATED','RUNNING','STAGED','STAGED_PW','QUEUE WAIT','PRED WT HOLD','PRED WAIT','KILLING','DATE PENDING','CONDITN WAIT','AGENT WAIT');
SQL> commit;
At this point you can restart all the Applications Manager processes if you want and then continue to evaluate the data in the bu_so_job_queue table to decide which jobs you want to restart.
Following are some useful SQL statements that can be used to continue to look at the information from the deleted so_job_queue table.
In order to see how many jobs were in each status you could run:
SQL> select so_status_name, count(so_status_name) from bu_so_job_queue
group by so_status_name;
It will display results similar to this:
SO_STATUS_NA COUNT(SO_STATUS_NAME)
------------ ---------------------
PRED WAIT 21
Skip!Active 1
STAGED 19
STG SKIP 31
ABORTED 2
STAGED_PW 120
CONDITN WAIT 1
INACTIVE 1
PW-SKIP 88
INITIATED 2
Skip!RunCal 1
11 rows selected.
If you want to list all the jobs that were in the so_job_queue table and sort them by the Process Flow they were in and then by jobid, you can run this SQL:
SQL> select so_module, so_job_seq, so_jobid, so_chain_seq, so_chain_id from bu_so_job_queue
order by so_chain_id, so_chain_seq, so_jobid;
Your results will be displayed similar to this:
SO_MODULE SO_JOB_SEQ SO_JOBID SO_CHAIN_SEQ SO_CHAIN_ID
------------------------------ ---------- ---------- ------------ -----------
XXX0324875_TEST7 797 5982 5982
SFTP-SIMPLE 33 5984.01 797 5982
SFTP_#2 33 5985 797 5982
SFTP_#1 33 5986 797 5982
SFTP_AGENT 33 5987 797 5982
SFTP_TOP_FLOW_ID 33 5988 797 5982
XXX0324875_TEST7_1 804 5989 5989
SFTP-SIMPLE 33 5991.01 804 5989
SFTP_#2 33 5992 804 5989
SFTP_#1 33 5993 804 5989
SFTP_AGENT 33 5994 804 5989
SFTP_TOP_FLOW_ID 33 5995 804 5989
TEST_JOB 20 5996
In the above sample it shows that there were 2 Process Flows (INC0324875_TEST7 and INC0324875_TEST7_1) and one job (TEST_JOB) in the backlog when the records were deleted. You can then decide which jobs you want to request again to run. If any of the jobs are required to run they must be request from the client gui.
Once you have Applications Manager back up and running jobs successfully and you have requested all the jobs you want to restart, you should drop the temporary so_job_queue table that you created by executing the following SQL statement:
SQL> drop table bu_so_job_queue;
SQL> commit;
It is highly recommended to back up the table prior to performing steps in this Knowledge Article.
Please consult your DBA before performing any delete / update statements and take needed backups.
You many also need to check and delete the records in the so_job_history table.