What to check if event agent on AIX is very slow to start up or shut down.


Article ID: 22811


Updated On:

Products

Compress Data Compression for MVS
Compress Data Compression for Fujitsu
CA Unicenter NSM

Issue/Introduction

The name resolution behavior of AIX systems can impact the start-up and shut-down time of OPR on this type of system.

 

Environment

Release:
Component: UNIEM

Resolution

Enterprise Management (EM) components use the nslookup utility to obtain the fully qualified host name. This name is passed to a command that registers the current state of the EM components whenever they start up or shut down.

If the nslookup utility is not installed, an error message indicating that the utility was not found is generated each time the EM components are started or stopped. In addition, the NSM utilities that register the current state of these components (cauwvset and cauwvstat) fail because the argument list passed to them does not contain a valid hostname.
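A quick way to rule out the missing-utility case is to check whether nslookup is on the PATH of the account that starts the EM components. The following is a minimal sketch; `have_cmd` is a hypothetical helper name, not part of NSM.

```shell
# Hypothetical helper: succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd nslookup; then
    echo "nslookup found: $(command -v nslookup)"
else
    echo "nslookup not found on PATH" >&2
fi
```

Run this as the same user, and with the same environment, that is used to start and stop the components, since the PATH may differ between accounts.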

On AIX, a DNS query waits a total of 75 seconds before failing over to the second nameserver entry in resolv.conf.

This can have a direct impact on the start-up time of an NSM agent running on AIX, especially if the first nameserver is down or not responding.

The 75-second timeout is derived from two values, RES_RETRY and RES_TIMEOUT, as follows:

The DNS client sends a query to the DNS server and waits 5 seconds for an answer. If none arrives, it tries again and waits 10 seconds, then 20 seconds, then 40 seconds. That makes: 5 + 10 + 20 + 40 = 75 seconds.

This behavior can be changed by setting the two environment variables:

# export RES_RETRY=value
# export RES_TIMEOUT=value

The timeout is defined by two values:

       RES_TIMEOUT (default: 5 seconds)
       RES_RETRY (default: 4 attempts)
       On each retry, the previous timeout value is multiplied by 2.
       Example (defaults):
       5 + 10 + 20 + 40 = 75 seconds
       Example:
       export RES_TIMEOUT=10
       export RES_RETRY=2
       10 + 20 = 30 seconds timeout
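The arithmetic above can be sketched as a small shell function, which is handy for trying out candidate values before setting them. `res_total_timeout` is a hypothetical helper written for illustration; it only reproduces the doubling rule described here and does not query the resolver.

```shell
# Hypothetical helper: total seconds spent on one nameserver,
# assuming the timeout doubles on every attempt as described above.
res_total_timeout() {
    timeout=${1:-5}   # RES_TIMEOUT in seconds (default 5)
    retries=${2:-4}   # RES_RETRY, number of attempts (default 4)
    total=0
    i=0
    while [ "$i" -lt "$retries" ]; do
        total=$((total + timeout))
        timeout=$((timeout * 2))
        i=$((i + 1))
    done
    echo "$total"
}

res_total_timeout 5 4    # prints 75 (the defaults)
res_total_timeout 10 2   # prints 30
```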

For these settings to remain in effect after a reboot, add the two variables to /etc/environment.

Alternatively, export the two variables in the /etc/profile.CA file.
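As an illustration, the lines added to /etc/profile.CA might look like the sketch below. The values are simply the example values used earlier, not a recommendation; choose values that suit your DNS environment.

```shell
# Example values only (taken from the example above).
RES_TIMEOUT=10
RES_RETRY=2
export RES_TIMEOUT RES_RETRY
```

If you use /etc/environment instead, note that on AIX this file is generally expected to contain plain NAME=value lines without the export keyword; check your system documentation if in doubt.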