DUAS6: Uprocs getting Aborted "Could not submit" randomly


Article ID: 186184


Updated On:


CA Automic Dollar Universe


Some Uprocs fail to be submitted and end in status Aborted, with the action icon "Could not Submit".
The history trace is available.
The Job Log is created, but it belongs to the user root and is empty.
A temporary file containing the execution environment variables remains in the folder $U_TMP_PATH.
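
To check a node for the symptoms described above, a small helper can list empty files owned by a given user. This is only a sketch: the function name find_empty_files_owned_by is made up for illustration and is not part of Dollar Universe, and the directory argument is a placeholder for wherever the node writes its Job Logs.

```shell
# Hypothetical helper (not a Dollar Universe command): list regular files
# below a directory that are empty and owned by the given user.
#   $1 = directory to scan (placeholder: the node's Job Log directory)
#   $2 = owner to match (user name or numeric uid)
find_empty_files_owned_by() {
    find "$1" -type f -user "$2" -size 0
}

# Example (the path is a placeholder):
#   find_empty_files_owned_by /path/to/job/logs root
# Leftover temporary files can be inspected with:
#   ls -l "$U_TMP_PATH"
```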

Relaunching the Uproc usually works, as does launching it 5 minutes later.

Nothing in /var/log/secure or /var/log/messages explains why the "su" command fails to switch from root to the submission system user.



Release : 6.x
OS: Linux/Unix


Unknown system issue, most likely related to the su command failing to switch the user context from root to the submission account user.


Workaround 1:

1. Assign the node to the user that will be the only one allowed to submit Jobs (bjob in this example).

To do so, launch the following as root:

./uxrights -m assign -a bjob
./uxrights -m restrict
su - bjob
cd /path_of_du
. ./unienv.ksh
cd bin

2. Modify the submission account administrator so that it points to the user bjob instead of the current one (in this case duadmin), and do the same for any other account that pointed to the problematic user.

Workaround 2:

This applies if the issue only affects some Uprocs that run at a given time, e.g. 120 Uprocs are launched every day at 00:05 and 2 or 3 of them abort with "Could not Submit".

In this case, the simplest solution is to change the scheduling time of the different Tasks and/or add a random sleep in the Uproc scripts or in U_POST_UPROC, so that many submissions do not occur in the exact same second.
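
A random sleep of this kind could look like the following sketch. The function name random_delay_seconds and the 30-second bound are assumptions chosen for illustration, and the snippet assumes that the local date command supports the +%s format.

```shell
# Hypothetical helper: print a pseudo-random integer in the range 0..max-1.
#   $1 = max (upper bound in seconds, exclusive)
# The seed mixes the shell PID with the current time, so that scripts started
# in the same second are likely to get different delays.
random_delay_seconds() {
    awk -v seed="$(( $$ + $(date +%s) ))" -v max="$1" \
        'BEGIN { srand(seed); print int(rand() * max) }'
}

# Example use at the top of a Uproc script (30 is an arbitrary bound):
#   sleep "$(random_delay_seconds 30)"
```

The bound should stay small compared to the expected run time of the Uprocs, since every launch is delayed by up to that many seconds.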

Otherwise, increase the ulimit -n and ulimit -u values of the impacted user above the default of 1024. This can be done in the user's .profile or in /etc/security/limits.conf.
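
As a rough illustration, the current limits can be displayed from a shell of the impacted user, and the new values can then be set in /etc/security/limits.conf. The account name duadmin is taken from the example above, and the value 4096 is an arbitrary assumption, not a recommended setting.

```shell
# Show the current soft limits of the shell running this snippet.
# (ulimit -u is available in bash and ksh, but not in every sh.)
echo "open files (ulimit -n):     $(ulimit -n)"
echo "user processes (ulimit -u): $(ulimit -u)"

# Example lines for /etc/security/limits.conf (illustrative values only):
#   duadmin  soft  nofile  4096
#   duadmin  hard  nofile  4096
#   duadmin  soft  nproc   4096
#   duadmin  hard  nproc   4096
#
# Alternative in the user's .profile; a non-root user can raise the soft
# limit only up to the current hard limit:
#   ulimit -n 4096
#   ulimit -u 4096
```

Changes in limits.conf take effect for new login sessions only, so the new values should be checked from a fresh session of the user.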