Our production master server experiences a Java crash after the following error:
Job 170194389 has a parent jobid of 170194386 however
could not be found in memory
What could be the reason for the error?
Release : 9.3x
Component : CA Automic Applications Manager
This error can occur when a job in the backlog or history cannot find its parent jobid. The RmiServer then goes into a loop, which can cause Java memory to fill up.
Applications Manager may stop running jobs, and the above errors may appear in the logs, after a system crash, a network outage, or running out of Java memory. To correct the situation, the affected jobs may need to be removed from the so_job_queue_activity, so_job_queue and/or so_job_history tables.
Please log in to SQL*Plus and run the following select statements, replacing <jobid> with the actual jobid referenced in the error message, and send the results to Automic support:
select count(distinct so_status_name), so_status_name from so_job_history
where so_status_name in ('INITIATED','RUNNING','STAGED','STAGED_PW','STARTED','STARTING','QUEUE WAIT','PRED WT HOLD','PRED WAIT','LAUNCH ERROR','KILLING','DATE PENDING','CONDITN WAIT','AGENT WAIT') group by so_status_name, so_status order by 2;
select count(*) from so_job_queue where so_jobid=<jobid>;
select count(*) from so_job_history where so_jobid=<jobid>;
select count(*) from aw_job_queue_activity where so_jobid=<jobid>;
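To avoid editing <jobid> by hand in each statement, the three count queries could be run with a SQL*Plus substitution variable and spooled to a file to send to support. This is only a sketch: the jobid value is taken from the sample error message above as an illustration, and the spool file name jobid_check.txt is an arbitrary assumption.

```sql
-- Sketch only: set the jobid once and reuse it in each query.
-- 170194389 is the sample value from the error message; use your own jobid.
set verify off
define jobid = 170194389

-- jobid_check.txt is a hypothetical output file name.
spool jobid_check.txt

select count(*) from so_job_queue where so_jobid = &jobid;
select count(*) from so_job_history where so_jobid = &jobid;
select count(*) from aw_job_queue_activity where so_jobid = &jobid;

spool off
undefine jobid
```

These are read-only selects; they only report how many rows reference the jobid in each table.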