All jobs are in Time overrun, Event or Launch wait status

Article ID: 85942

Products

CA Automic Dollar Universe

Issue/Introduction

Affects Release version(s): 5

Error Message :
The universe.log file contains the following types of errors:

####################################
<< 2015-03-03 23:34:17 0023982/uxioserv /io_error /000000000 - /orsyp/universe/exp/data/u_fmcx50.dta
<< 2015-03-03 23:34:17 0023982/uxioserv /u_maj_fichier_th /000000010 - ERROR : cannot write in file /orsyp/universe/exp/data/u_fmcx50.dta
<< 2015-03-03 23:34:17 0023982/uxioserv /io_error /000000000 - /orsyp/universe/exp/data/u_fmhs50.dta
<< 2015-03-03 23:34:17 0023982/uxioserv /u_maj_fichier_th /000000010 - ERROR : cannot write in file /orsyp/universe/exp/data/u_fmhs50.dta
<< 2015-03-03 23:34:18 0023982/uxioserv /io_error /000000000 - /orsyp/universe/exp/data/u_fmph50.dta
<< 2015-03-03 23:34:18 0023982/uxioserv /u_maj_fichier_th /000000010 - ERROR : cannot write in file /orsyp/universe/exp/data/u_fmph50.dta
<< 2015-03-03 23:34:23 0023982/uxioserv /io_error /000000000 - /orsyp/universe/exp/data/u_fmfu50.dta
<< 2015-03-03 23:34:23 0023982/uxioserv /u_maj_fichier_th /000000010 - ERROR : cannot write in file /orsyp/universe/exp/data/u_fmfu50.dta

##############
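
To see at a glance which data files are affected, the "cannot write in file" lines can be pulled out of universe.log. The following is a minimal sketch, assuming the log lines follow exactly the layout of the excerpt above; the function name and the sample list are illustrative only and are not part of the product.

```python
import re

# Pattern derived from the log excerpt above; other log layouts are not handled.
WRITE_ERROR = re.compile(
    r"^<< (?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<pid>\d+)/(?P<process>\S+)\s+/u_maj_fichier_th\s+/\d+ - "
    r"ERROR : cannot write in file (?P<path>\S+)"
)


def files_with_write_errors(lines):
    """Return the distinct unwritable file paths, in first-seen order."""
    seen = []
    for line in lines:
        match = WRITE_ERROR.match(line.strip())
        if match and match.group("path") not in seen:
            seen.append(match.group("path"))
    return seen


# Two pairs of lines copied from the excerpt above.
SAMPLE = [
    "<< 2015-03-03 23:34:17 0023982/uxioserv /io_error /000000000 - /orsyp/universe/exp/data/u_fmcx50.dta",
    "<< 2015-03-03 23:34:17 0023982/uxioserv /u_maj_fichier_th /000000010 - ERROR : cannot write in file /orsyp/universe/exp/data/u_fmcx50.dta",
    "<< 2015-03-03 23:34:17 0023982/uxioserv /io_error /000000000 - /orsyp/universe/exp/data/u_fmhs50.dta",
    "<< 2015-03-03 23:34:17 0023982/uxioserv /u_maj_fichier_th /000000010 - ERROR : cannot write in file /orsyp/universe/exp/data/u_fmhs50.dta",
]

if __name__ == "__main__":
    for path in files_with_write_errors(SAMPLE):
        print(path)
```

To run it against a real log, pass the opened file instead of SAMPLE, for example files_with_write_errors(open(log_path)), where log_path is the location of universe.log on the node.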

The universe.log file also shows a segmentation fault:

<< 2015-03-03 23:34:58 0023982/uxioserv /UXOS_HdlTermProcess /000000000 - execution handler : SIGNAL = (11) PID = (23982) PPID = (1) GPID = (23982) 

Patch level detected: Dollar Universe 5.6
Product Version: Dollar.Universe 5.6.0 FX25010

Description: No jobs are running. All jobs are in Event wait, Launch wait, or Time overrun status.
A newly submitted job goes directly into Launch wait status, and the log shows it is waiting in the queue.

The IO engine is not running.

Environment

OS: All Unix

Cause

Cause type:
Configuration
Root Cause: The "ERROR : cannot write in file" message is usually caused by one of the following: incorrect permissions on the data files, no access to the file system, or a full disk/file system.

Resolution

Check that the disk is not full and that the correct access has been granted for the data files.
If the data files cannot be written to, the IO engine will stop.
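
The checks above can be sketched as a short script. This is an illustration under assumptions, not a vendor tool: the data directory and file names are taken from the log excerpt in this article and may differ on other installations, the 1 MiB free-space threshold is arbitrary, and the function name is hypothetical. Run it as the user that owns the Dollar Universe processes, since the write test applies to the user running the script.

```python
import os
import shutil


def check_data_file(path, min_free_bytes=1024 * 1024):
    """Check one data file for the causes listed in this article."""
    directory = os.path.dirname(path) or "."
    reachable = os.path.isdir(directory)  # is the file system accessible?
    free = shutil.disk_usage(directory).free if reachable else None
    return {
        "path": path,
        "directory_reachable": reachable,
        "exists": os.path.isfile(path),
        "writable": os.access(path, os.W_OK),  # permission of the current user
        "free_bytes": free,
        # The threshold is an arbitrary assumption; adjust it to the site.
        "low_space": free is not None and free < min_free_bytes,
    }


if __name__ == "__main__":
    # Directory and file names as they appear in the log excerpt.
    data_dir = "/orsyp/universe/exp/data"
    for name in ("u_fmcx50.dta", "u_fmhs50.dta", "u_fmph50.dta", "u_fmfu50.dta"):
        print(check_data_file(os.path.join(data_dir, name)))
```

A file reported with "writable" set to False points to a permissions problem, "directory_reachable" set to False to an inaccessible file system, and "low_space" set to True to a full or nearly full disk.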

#####################

The segmentation fault ("SIGNAL = (11)") should have generated a core file in the exec folder. This core file can be examined with the "gdb" command to further narrow down the root cause.
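
As a starting point for locating the core file, the sketch below lists candidate files in a given directory, newest first. It assumes that core files have names beginning with "core" (for example "core" or "core.23982"), which depends on the operating system configuration; the function name is hypothetical, and the path of the exec folder must be supplied by the reader because it is not given in this article.

```python
from pathlib import Path


def find_core_files(exec_dir):
    """Return files in exec_dir whose names start with 'core', newest first."""
    directory = Path(exec_dir)
    if not directory.is_dir():
        return []
    # Naming assumption: core files are called "core" or "core.<suffix>".
    cores = [
        entry
        for entry in directory.iterdir()
        if entry.is_file() and entry.name.startswith("core")
    ]
    return sorted(cores, key=lambda entry: entry.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    import sys

    # Usage: python find_cores.py <path to the exec folder>
    if len(sys.argv) > 1:
        for core in find_core_files(sys.argv[1]):
            print(core)
```

gdb is normally started with the executable that crashed and the core file as its two arguments, after which the "bt" command prints the backtrace.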

#####################

After verifying the user rights and the file system size, shut down the node, perform an offline reorganization, and then restart the node.

Fix Status: No Fix
 

Additional Information

Workaround :
N/A