CAUAJM_E_10656 The database <xxxxxx> has encountered a critical error.

Article ID: 7148

Products

CA Workload Automation AE - Scheduler (AutoSys), Autosys Workload Automation

Issue/Introduction

Several errors like the ones below appear in the application server log when a long query is processed for WCC:

CAUAJM_E_18416 Event Server: <xxxxxx> Failed Query: <select jr.startime, jr.endtime, j.job_name, j.joid, jb.job_name, j.box_joid, j.as_group, j.as_applic, j.job_type, jr.run_machine, j.mach_name, j.owner, jr.status, i.timezone, jr.run_num, jr.ntry, jr.over_num, jr.exit_code, e.event, e.status, e.text, e.event_time_gmt, round((e.que_status_stamp-to_date('19700101','YYYYMMDD'))*24*60*60+(select int_val from ujo_alamode where type='gmt_offset')), j.description, j.job_ver, e.evt_num, js.status from ujo_job_runs jr inner join ujo_job ja on ja.joid=jr.joid and ja.job_ver=ja.job_ver and ja.is_active=1 and ja.is_currver=1 inner join ujo_job j on j.joid=ja.joid and j.job_ver=jr.job_ver and j.over_num=jr.over_num join ujo_job jb on jb.joid=j.box_joid and jb.job_ver=jb.job_ver and jb.over_num=-1 join ujo_sched_info i on i.joid=jr.joid and i.job_ver=jr.job_ver and i.over_num=jr.over_num join ujo_job_status js on js.joid=jr.joid left outer join ujo_proc_event e on e.event=e.event and e.status=e.status and e.joid=jr.joid and e.run_num=jr.run_num and e.ntry IN (0,jr.ntry) where (e.que_status = 2 and jr.endtime > 0 and (jr.endtime >= 1496860819 or e.event_time_gmt >= 1502038832) and jr.endtime >= 1496689060) union select jr.startime, jr.endtime, j.job_name, j.joid, jb.job_name, j.box_joid, j.as_group, j.as_applic, j.job_type, jr.run_machine, j.mach_name, j.owner, jr.status, i.timezone, jr.run_num, jr.ntry, jr.over_num, jr.exit_code, e.event, -1, e.text, e.event_time_gmt, 0, j.description, j.job_ver, e.evt_num, js.status from ujo_event e inner join ujo_job ja on ja.joid=e.joid and ja.is_active=1 and ja.is_currver=1 inner join ujo_job j on j.joid=ja.joid and j.job_ver=e.job_ver and j.over_num=-1 join ujo_job jb on jb.joid=j.box_joid and jb.job_ver=jb.job_ver and jb.over_num=-1 join ujo_sched_info i on i.joid = e.joid and i.job_ver=e.job_ver and i.over_num=j.over_num join ujo_job_status js on js.joid=e.joid left outer join ujo_job_runs jr on e.run_num = jr.run_num and e.joid=jr.joid 
and e.ntry IN (0,jr.ntry) where (e.que_status = 2 and jr.endtime > 0 and (jr.endtime >= 1496860819 or e.event_time_gmt >= 1502038832) and jr.endtime >= 1496689060) order by 4 ASC NULLS FIRST , 25 ASC NULLS FIRST , 15 ASC NULLS FIRST , 16 ASC NULLS FIRST , 22 ASC NULLS FIRST , 26 ASC NULLS FIRST > 

CAUAJM_E_18402 ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], [] 

CAUAJM_I_18403 Processing OCI function not used(4) 

CAUAJM_E_18400 An error has occurred while interfacing with ORACLE. 

CAUAJM_E_18401 Function <doExecute> invoked from <execute> failed <862> 

CAUAJM_W_10900 The database monitoring system has detected a potential problem with the database. 

CAUAJM_I_10901 The database monitoring system is beginning validation of database connections.

 

The same errors appear in the event_demon log:

CAUAJM_E_18416 Event Server: <AUTSYSP> Failed Query: <BEGIN :RetVal := ujo_batch_pkg.ujo_batch_get_event (:I_time, :RefCursor); END;> 

CAUAJM_E_18402 ORA-20870: First update of ujo_batch_get_event failed - -600 -ERROR- ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], [] 

ORA-06512: at "AEDBADMIN.UJO_BATCH_PKG", line 34 

ORA-06512: at line 1 

CAUAJM_I_18403 Processing OCI function ODEFIN(34) 

CAUAJM_E_18400 An error has occurred while interfacing with ORACLE. 

CAUAJM_E_18401 Function <doExecute> invoked from <bind> failed <555> 

CAUAJM_W_40207 An unexpected problem occurred while fetching events. Continuing..
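To check how often this signature shows up in a given log, a simple fixed-string search is usually enough. The sketch below is only an illustration: the function name `count_kdsgrp1` is hypothetical, and the log file to scan is passed as an argument because its location depends on the installation.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a log file that carry the
# ORA-00600 [kdsgrp1] signature shown in the excerpts above.
# Usage: count_kdsgrp1 LOGFILE
count_kdsgrp1() {
    # -c prints the number of matching lines (0 when there are none);
    # -F treats the pattern as a fixed string, so the brackets are literal.
    grep -cF 'ORA-00600: internal error code, arguments: [kdsgrp1]' "$1"
}
```

Running it against both the application server log and the event_demon log shows whether the two components are hitting the same Oracle error.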

Environment

CA WAAE 11.3.5 on Linux with Oracle Database 11g Release 2. The problem might also occur with any other AutoSys release.

Cause

The Oracle DBA ran the following command and got the same Oracle error:

SQL> Analyze table AEDBADMIN.UJO_EVENT validate structure cascade online;

Analyze table AEDBADMIN.UJO_EVENT validate structure cascade online

*

ERROR at line 1:

ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [],

[], [], [], [], []

 

The error was caused by a table/index mismatch: an index entry pointed to a row that did not exist in the table.

Recreating the table through export/import also recreated the index, which cleared the mismatch.

The root cause could be an Oracle bug or a storage issue that allowed the inconsistency to occur.

CA suggests reviewing Oracle Note 285586.1 - ORA-600 [kdsgrp1], which lists the known Oracle bugs.

Also upgrade the database to the most recent Oracle 11g release, 11.2.0.4, and install the latest Oracle patches.

Resolution

The DBA exported the table, deleted it, and imported it again:

- expdp system tables=AEDBADMIN.UJO_EVENT directory=DATA_PUMP_DIR dumpfile=autosys_20170612.dmp logfile=autosys_20170612.dmp.log

- impdp system tables=AEDBADMIN.UJO_EVENT directory=DATA_PUMP_DIR dumpfile=autosys_20170612.dmp logfile=impdp_autosys_20170612.dmp.log

 

After that, the error disappeared:

SQL> Analyze table AEDBADMIN.UJO_EVENT validate structure cascade online;

Table analyzed.

SQL> exit

Additional Information

The Oracle DBA was working with Oracle Support on this problem.

 

Before running expdp and impdp, CA suggests rebuilding the indexes with:

perl reindexDB.pl

If that does not fix the ORA error, run the Oracle expdp and impdp steps.
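The order suggested here (validate the table, rebuild the indexes first, fall back to export/import only if the error persists) can be sketched as a small decision helper. This is a minimal sketch, not part of the product: the function name `next_step`, its two arguments, and the messages it prints are all hypothetical, and it assumes the output of the ANALYZE ... VALIDATE STRUCTURE command has been saved to a file.

```shell
#!/bin/sh
# Hypothetical helper: next_step FILE REINDEX_DONE
#   FILE          saved output of the ANALYZE ... VALIDATE STRUCTURE command
#   REINDEX_DONE  "yes" if reindexDB.pl has already been run, otherwise "no"
# Prints the next action in the order described in this article.
next_step() {
    output_file=$1
    reindex_done=$2
    if grep -qF 'ORA-00600' "$output_file"; then
        # Validation still fails: try the index rebuild before export/import.
        if [ "$reindex_done" = "yes" ]; then
            echo "export and import the table with expdp and impdp"
        else
            echo "rebuild the indexes with reindexDB.pl"
        fi
    elif grep -qF 'Table analyzed.' "$output_file"; then
        # Validation succeeded, as in the Resolution section.
        echo "no action needed"
    else
        echo "unrecognized output, inspect it manually"
    fi
}
```

For example, with the failing output saved to a file (the name `analyze_output.txt` is assumed), `next_step analyze_output.txt no` points to the index rebuild, and the same call with `yes` points to the expdp and impdp steps.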