Uproc aborts with error "Unable to connect"

Article ID: 109305

Updated On:

Products

CA Automic Dollar Universe

Issue/Introduction

Uprocs randomly end with the status Aborted. The only message in the Job Log is "Unable to connect". The usual Uproc header, which lists the Uproc, MU, variables and other execution details, is not printed.

This issue is most likely to occur when many Uprocs are launched at the same moment.
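To confirm that the aborts cluster around launch peaks, it can help to list the affected Job Logs and count them per minute. The following is a minimal sketch, not a product tool: the log directory, the `*.log` file pattern, and the use of the file modification time as the abort time are all assumptions to adapt to the node being examined.

```python
import sys
from collections import Counter
from datetime import datetime
from pathlib import Path

MESSAGE = "Unable to connect"  # the message quoted in this article


def find_connect_failures(log_dir, pattern="*.log"):
    """Return the log files under log_dir that contain MESSAGE.

    The directory and the file pattern are assumptions; point them at
    wherever the Job Logs of the node are actually stored.
    """
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        if path.is_file() and MESSAGE in path.read_text(errors="replace"):
            hits.append(path)
    return hits


def failures_per_minute(paths):
    """Count affected logs per minute, using file modification time
    as a stand-in for the time of the abort (an assumption)."""
    counts = Counter()
    for path in paths:
        stamp = datetime.fromtimestamp(path.stat().st_mtime)
        counts[stamp.strftime("%Y-%m-%d %H:%M")] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    affected = find_connect_failures(sys.argv[1])
    for minute, count in sorted(failures_per_minute(affected).items()):
        print(minute, count)
```

Minutes with a high count that line up with scheduled launch peaks would support the diagnosis described here.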

Cause

The uxjobinit program, which is called from u_batch, is momentarily unable to connect to the IO server within the time-out.

Environment

Release: ADUNAS99000-6.0-Automic Dollar Universe-AS
Component:

Resolution

Workaround:
Increase Node Settings > Technical Settings > "Time-out for IO server (seconds)" to 60.
This setting corresponds to the variable U_IO_TIMEOUT in values.xml.

Since this issue often occurs at a particular time when the load on the node peaks, you can also consider setting a maximum number of parallel jobs in DQM. This smooths out the load on the server and helps prevent incidents.
Go to Design > Environment > Batch Queues > Update Queue and set "Maximum Job Limit" to 30 (or 40 or 50, depending on the sizing of the node).
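The effect of a job limit on a launch peak can be illustrated with a toy model. The sketch below does not reproduce how DQM schedules jobs; it assumes every job has the same duration and that waiting jobs start in submission order as soon as a slot frees up. The function name `simulate` and its parameters are hypothetical.

```python
import heapq


def simulate(submit_times, duration, limit=None):
    """Toy queue model: return (peak_running_jobs, time_last_job_ends).

    submit_times: times at which jobs are submitted
    duration:     run time of every job (assumed identical)
    limit:        maximum number of parallel jobs, or None for no limit
    """
    running = []  # min-heap holding the end time of each started job
    peak = 0
    last_end = 0
    for t in sorted(submit_times):
        # Release the slots of jobs that have finished by time t.
        while running and running[0] <= t:
            heapq.heappop(running)
        start = t
        if limit is not None and len(running) >= limit:
            # Queue is full: wait until the earliest running job ends.
            start = heapq.heappop(running)
        end = start + duration
        heapq.heappush(running, end)
        peak = max(peak, len(running))
        last_end = max(last_end, end)
    return peak, last_end


# 100 jobs submitted at the same moment, 10 time units each.
print(simulate([0] * 100, 10))            # (100, 10)
print(simulate([0] * 100, 10, limit=30))  # (30, 40)
```

In this model the limit caps the number of simultaneous launches at 30 at the cost of a longer overall run, which is the trade-off to weigh when choosing between 30, 40 or 50.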