Troubleshooting CAUAJM_W_00043 and CAUAJM_E_00045 messages

Article ID: 38224

Products

Autosys Workload Automation

Issue/Introduction

In some cases, running an AutoSys client command produces the following errors:

CAUAJM_W_00043 The UNIX function getaddrinfo did not respond in a timely fashion for hostname [<AutoSysServer.example.com>:$]. Retrying...
CAUAJM_E_00045 Exceeded wait time for the UNIX function getaddrinfo. Hostname [<AutoSysServer.example.com>:$] was not resolved.
CAUAJM_E_10684 Unable to establish a network connection to [<AutoSysServer.example.com>:9000].
CAUAJM_E_10221 Exhausted list of available Application Server(s). Failing request.

In other cases, the Scheduler log contains messages like:

CAUAJM_W_00043 The UNIX function getaddrinfo did not respond in a timely fashion for hostname [<agent-hostname.example.com>]. Retrying...

 

Resolution

  • When the Scheduler or Application Server needs to resolve a hostname, it calls the getaddrinfo() UNIX function. Each call has a default timeout of 15 seconds. If no result is returned within the timeout period, the CAUAJM_W_00043 message is written to the log of the component making the call, and a second call to getaddrinfo() is made. If the second attempt also times out, the machine is placed offline. It is automatically placed back online once communication with it is restored.

  • Many occurrences of this warning message in the Scheduler log are a good indication of intermittent communication issues between the Scheduler machine and a DNS server. Your system administrator and/or network team should investigate them.

  • Environments that have IP caching disabled are more likely to see these warnings since the call is made every time a hostname needs resolution.

  • The timeout interval is configurable through an environment variable, AS_RESOLVEHOST_TIMEOUT, that the Scheduler and Application Server read on startup. Set the variable in the $AUTOUSER/autosys.sh.<hostname> file, which is read by the startup scripts for both components:

AS_RESOLVEHOST_TIMEOUT=60; export AS_RESOLVEHOST_TIMEOUT

NOTE: The value set for this variable can be from 1 to 120 and is added to the default 15-second timeout. For example, if the value is set to 20, the timeout period increases from 15 to 35 seconds.
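The arithmetic in the note can be checked with a small shell sketch. The function name effective_timeout is hypothetical and is not part of AutoSys; it only applies the rule stated above (a value from 1 to 120, added to the default 15 seconds).

```shell
#!/bin/sh
# Hypothetical helper, not an AutoSys command: it only illustrates how
# AS_RESOLVEHOST_TIMEOUT combines with the default timeout described above.
DEFAULT_TIMEOUT=15   # default getaddrinfo() timeout in seconds, per this article

effective_timeout() {
    value=$1
    # Accept only plain decimal integers without leading zeros.
    case $value in
        ''|*[!0-9]*|0*)
            echo "AS_RESOLVEHOST_TIMEOUT must be an integer from 1 to 120" >&2
            return 1
            ;;
    esac
    # Enforce the documented upper bound.
    if [ "$value" -gt 120 ]; then
        echo "AS_RESOLVEHOST_TIMEOUT must be an integer from 1 to 120" >&2
        return 1
    fi
    # The configured value is added to the default, not substituted for it.
    echo $((DEFAULT_TIMEOUT + value))
}

effective_timeout 20   # prints 35
effective_timeout 60   # prints 75
```

With the setting shown earlier (AS_RESOLVEHOST_TIMEOUT=60), each getaddrinfo() attempt would therefore be allowed 75 seconds rather than 60.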

  • Increasing the timeout reduces the frequency of the warning messages, but it does not address the underlying getaddrinfo performance problem. Your system and network administrators should still investigate it. Slow performance of this function in an environment that does not have IP caching enabled will cause a significant performance hit to the instance.

  • If these errors occur in AutoSys client-side programs (for example, a client that attempts to connect to the server but cannot), ensure that the client host can resolve the server hostname shown in the message. Adjusting the resolv.conf or /etc/hosts files may also help alleviate such issues.
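
One way to check how long name resolution takes on a given host is sketched below. It assumes a system that provides the getent utility with the ahosts database (glibc-based Linux, where that lookup goes through getaddrinfo); other UNIX platforms may need a different lookup command. The function names classify_elapsed and check_resolution are hypothetical, and the 15-second threshold is simply the default timeout described above.

```shell
#!/bin/sh
# Hypothetical diagnostic sketch, not an AutoSys utility.
# Assumes `getent ahosts` is available (glibc-based systems).

# classify_elapsed SECONDS THRESHOLD
# Prints "slow" when the lookup took at least THRESHOLD seconds, else "ok".
classify_elapsed() {
    if [ "$1" -ge "$2" ]; then
        echo slow
    else
        echo ok
    fi
}

# check_resolution HOSTNAME [THRESHOLD]
# Times a single lookup (whole-second granularity) and reports the result.
check_resolution() {
    host=$1
    threshold=${2:-15}
    start=$(date +%s)
    if getent ahosts "$host" >/dev/null 2>&1; then
        status=resolved
    else
        status=unresolved
    fi
    end=$(date +%s)
    elapsed=$((end - start))
    echo "$host: $status in ${elapsed}s ($(classify_elapsed "$elapsed" "$threshold"))"
}

# Example usage (replace with the hostname from the error message):
#   check_resolution AutoSysServer.example.com
```

Running the check several times from the affected Scheduler, Application Server, or client host can help show whether lookups are slow consistently or only intermittently, which is useful information for the network team.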