Resolving AutoSys Agent "Missing" Status Due to Hostname Resolution Failures

Article ID: 400573

Products

AutoSys Workload Automation

Issue/Introduction

Multiple AutoSys agent machines may suddenly appear in a "missing" state and fail autoping from the AutoSys Scheduler or Application Server. The agent services on these machines are confirmed to be running and are reachable by IP address (e.g., using telnet <IP_ADDRESS> 7520), but attempts to connect or autoping by hostname fail. This pattern indicates a problem with hostname resolution from the server to the agent.

This typically manifests with autoping errors similar to the following:

CAUAJM_I_50023 AutoPinging Machine [agent_hostname]
CAUAJM_E_10237 The hostname for the AutoSys Agent at [agent_hostname:7520] is either invalid or unreachable over the network
CAUAJM_E_50281 AutoPing from the Scheduler WAS NOT SUCCESSFUL.
CAUAJM_E_50007 Hostname [agent_hostname] is invalid or unreachable over the network.
CAUAJM_E_50283 AutoPing from the Application Server WAS NOT SUCCESSFUL.
CAUAJM_E_50026 ERROR: AutoPing WAS NOT SUCCESSFUL.

The core problem is that the AutoSys server cannot resolve the agent machine hostnames. This is often triggered by problems with DNS services or with the local name resolution configuration on the server, and may follow system events such as a server reboot that affect DNS client settings.
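
As a rough illustration, the two hostname-specific message codes listed above (CAUAJM_E_10237 and CAUAJM_E_50007) can be picked out of captured autoping output. This is only a sketch: the function name is_resolution_failure is hypothetical, and it assumes the output contains the codes exactly as shown in this article.

```shell
# Hypothetical helper: reads autoping output on stdin and returns 0
# if either hostname-related error code from this article is present.
is_resolution_failure() {
  grep -Eq 'CAUAJM_E_(10237|50007)'
}

# Example use on an AutoSys server (not executed here):
#   autoping -m <AGENT_HOSTNAME> 2>&1 | is_resolution_failure \
#     && echo "hostname resolution errors reported"
```

A match only shows that the server reported the hostname as invalid or unreachable; the diagnostic steps in the "Cause" section are still needed to confirm the reason.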

Environment

Product: AutoSys Workload Automation

Components: AutoSys Agent, AutoSys Scheduler, AutoSys Application Server

Cause

The primary cause is a failure of hostname resolution on the AutoSys server(s) that contact the agent machines. Use the following steps to diagnose the issue and confirm whether hostname resolution is at fault:

 

  1. Verify Agent Service and Basic IP Connectivity:

    • On one of the affected agent machines, confirm that the AutoSys agent service is running.
    • From the AutoSys server (Scheduler or Application Server), verify direct IP connectivity to the agent's port (default 7520):
      telnet <AGENT_IP_ADDRESS> 7520
      Or using curl:
      curl -v telnet://<AGENT_IP_ADDRESS>:7520
      A successful connection here indicates the agent is listening and reachable by its IP address, but does not rule out hostname issues.
  2. Test Hostname Connectivity:

    • From the same AutoSys server, attempt to connect using the agent's hostname:
      telnet <AGENT_HOSTNAME> 7520
      Or using curl:
      curl -v telnet://<AGENT_HOSTNAME>:7520
    • Important: If this step fails while IP connectivity (Step 1) succeeds, it strongly indicates a hostname resolution problem on the AutoSys server.
  3. Check AutoSys autoping Utility:

    • Run autoping from the AutoSys server to the affected agent machine:
      autoping -m <AGENT_HOSTNAME>
    • Observe the error messages. If they match those detailed in the "Issue/Introduction" section, this further confirms the server's inability to reach the agent via its hostname due to resolution failures.
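
The comparison in Steps 1 and 2 can be sketched as a small script. It makes several assumptions: a Linux server with bash built with /dev/tcp support, the coreutils timeout command, and the glibc getent command. The function names (can_connect, resolved_ip, verdict, check_agent) are hypothetical and are not part of AutoSys.

```shell
# Return 0 if a TCP connection to host $1, port $2 opens within 5 seconds.
# Uses bash's /dev/tcp redirection, so bash is invoked explicitly.
can_connect() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Print the first address the system resolver returns for hostname $1
# (prints nothing if the name does not resolve).
resolved_ip() {
  getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Turn the two probe results (each "ok" or "fail") into a one-line verdict.
# $1 = result of the IP probe, $2 = result of the hostname probe.
verdict() {
  case "$1/$2" in
    ok/ok)   echo "hostname and IP both reachable" ;;
    ok/fail) echo "hostname resolution problem suspected" ;;
    fail/*)  echo "agent not reachable by IP; check agent service, firewall, or routing" ;;
  esac
}

# Probe agent IP $1 and hostname $2 on port $3 (default 7520) and report.
check_agent() {
  local ip="$1" host="$2" port="${3:-7520}" ip_ok=fail host_ok=fail addr
  can_connect "$ip" "$port" && ip_ok=ok
  can_connect "$host" "$port" && host_ok=ok
  addr=$(resolved_ip "$host")
  echo "resolver returned: ${addr:-nothing}"
  verdict "$ip_ok" "$host_ok"
}

# Example use on an AutoSys server (not executed here):
#   check_agent <AGENT_IP_ADDRESS> <AGENT_HOSTNAME>
```

If the resolver returns an address that differs from the agent's known IP, a stale DNS record or hosts entry is a likely explanation, which is still a name resolution problem rather than an agent problem.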

 

Resolution

If the diagnostic steps in the "Cause" section confirm a hostname resolution problem, follow these steps to rectify it:

  1. Resolve Hostname Resolution Issues:
    The most common cause is an issue with DNS configuration on the AutoSys server or problems with the DNS infrastructure itself.

    • Option A: Correct DNS Configuration (Recommended Primary Solution)
      a. Engage your network or system administrators to investigate and rectify DNS resolution for the affected agent hostnames.
      b. Ensure the AutoSys server's DNS client configuration (e.g., /etc/resolv.conf on Linux/Unix systems) is correct and points to valid, functioning DNS servers.

    • Option B: Update Local /etc/hosts File (Temporary Fix or Specific Use Cases)
      a. If the DNS correction is delayed, or a local override is specifically required, you can add entries for the affected agents to the /etc/hosts file on the AutoSys server(s) as a temporary measure.
      b. Open the /etc/hosts file using a text editor with root privileges (e.g., sudo vi /etc/hosts).
      c. Add entries in the format:
      <AGENT_IP_ADDRESS> <AGENT_HOSTNAME> [OPTIONAL_FQDN_ALIAS_IF_NEEDED]
      d. Save the /etc/hosts file. Changes are typically effective immediately without requiring a service restart.
      e. Important: Relying on /etc/hosts for numerous machines can become a significant maintenance challenge and is generally less scalable than a robust DNS solution. This should be used cautiously and ideally as a temporary measure.

  2. Verify Resolution in AutoSys:

    • After applying the fix (DNS or /etc/hosts), re-run the autoping command from the AutoSys server:
      autoping -m <AGENT_HOSTNAME>
    • The autoping command should now succeed.
    • The agent status in AutoSys monitoring tools (e.g., WCC, autorep -M <AGENT_HOSTNAME>) should change from "missing" to "online" after the system's next polling cycle or agent status check.
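
Before changing anything under Option A or Option B, it can help to see what the server is currently configured with. The following read-only sketch lists the name servers in a resolv.conf-style file and looks up any existing hosts-file entry for an agent. The function names nameservers_in and hosts_entry_for are hypothetical, and the default paths assume a Linux/Unix server as in the examples above.

```shell
# Print the address of each "nameserver" line in a resolv.conf-style file.
# $1 = file to read (defaults to /etc/resolv.conf).
nameservers_in() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Print the IP address mapped to hostname $1 in a hosts-style file, if any.
# $2 = file to read (defaults to /etc/hosts). Comments are ignored and the
# hostname comparison is case-insensitive; only the first match is printed.
hosts_entry_for() {
  awk -v host="$1" '
    { sub(/#.*/, "") }   # strip trailing comments
    NF >= 2 {
      for (i = 2; i <= NF; i++)
        if (tolower($i) == tolower(host)) { print $1; exit }
    }
  ' "${2:-/etc/hosts}"
}

# Example use on an AutoSys server (not executed here):
#   nameservers_in
#   hosts_entry_for <AGENT_HOSTNAME>
```

An empty result from hosts_entry_for means no local override exists yet. A result that differs from the agent's real IP address points to a stale entry that should be corrected rather than duplicated.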