AutoSys / Automic Automation integration (Event Engine) stops working

Article ID: 234071

Updated On:

Products

CA Automic Workload Automation - Automation Engine
CA Automic One Automation

Issue/Introduction

The integration from AutoSys to the Automic event engine (Analytics) stops working every couple of days. The IA agent stops and cannot be restarted; the following error is reported for the IA agent:


U000111113 task will be re-started as soon as host 'IA' is active again.

The error messages in the Kafka log files are:

WARN akka.remote.Remoting - Tried to associate with unreachable remote address [server:port]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters.
Reason: [The remote system has quarantined this system. No further associations to the remote system are possible until this system is restarted.]
WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp:server:port] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
WARN akka.remote.RemoteWatcher - Detected unreachable: [akka.tcp:server:port]
INFO org.apache.flink.runtime.jobmanager.JobManager - Task manager akka.tcp:server:port/user/taskmanager terminated.
INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Kafka Event Queue for Client 100 (1/1) (ca54eb40e1a30cc6b297fcc14914b005) switched from RUNNING to FAILED.

These messages indicate that the address server:port (local to the host itself) may be temporarily blocked, for example by a firewall, or that address resolution has stopped working.
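
To check whether a stopped integration matches this symptom, the log can be searched for the gating, quarantine, and job-failure messages quoted above. The following is only a minimal sketch: the function name `count_akka_gating` is hypothetical, and the log file to pass in depends on your installation. The search patterns are taken from the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: count the symptom messages from this article in a log file.
# Usage: count_akka_gating <log_file>
count_akka_gating() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "cannot read log: $log" >&2
        return 1
    fi

    # grep -c prints the number of matching lines; "|| true" keeps a zero
    # count from being treated as a failure. -i is used for the gating
    # message because the excerpt shows it with both "Address" and "address".
    gated=$(grep -ci 'address is now gated for' "$log" || true)
    quarantined=$(grep -c 'has quarantined this system' "$log" || true)
    failed=$(grep -c 'switched from RUNNING to FAILED' "$log" || true)

    echo "gated=$gated quarantined=$quarantined failed=$failed"
}
```

Non-zero counts for all three values suggest the same pattern as described here. A zero count for `quarantined` would point to a different cause.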

Environment

Release : 2.x

Component : ANALYTICS ON PREMISE

Resolution

A local issue with the TCP/IP stack on the server running AE causes the network connection to be blocked.

The most likely cause is that a firewall, router, load balancer, or other active network device closes the session due to inactivity. This can only be diagnosed by network traffic analysis, which is outside the scope of Support.

Workaround:

  1. Stop the One Automation integration by editing the config.PRD file:
    OneAutomationEvents=0
  2. kill -HUP <event_demon pid>
  3. Stop the Automic services (/etc/init.d/..)
  4. Make sure no Java process is running (ps -aef | grep java); if any are still running, kill them
  5. Start the Automic services (/etc/init.d/..)
  6. Make sure all components have started correctly (check with the Service Manager client command)
  7. Enable the One Automation integration by editing the config.PRD file:
    OneAutomationEvents=1
  8. kill -HUP <event_demon pid>
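
Steps 1 and 7 change the same setting, so the edit can be scripted. The following is a minimal sketch, not a supported tool: the function name `set_one_automation_events` is hypothetical, and it assumes the setting appears on a line of its own in the form `OneAutomationEvents=<value>` in the file passed to it. It keeps a backup copy with a `.bak` suffix and does not send the HUP signal, which still has to be done as in steps 2 and 8.

```shell
#!/bin/sh
# Hypothetical helper: set OneAutomationEvents to 0 or 1 in a config file.
# Usage: set_one_automation_events <config_file> <0|1>
set_one_automation_events() {
    cfg=$1
    val=$2

    # Only the two values used in the workaround are accepted.
    case "$val" in
        0|1) ;;
        *) echo "value must be 0 or 1" >&2; return 2 ;;
    esac

    if [ ! -f "$cfg" ]; then
        echo "config file not found: $cfg" >&2
        return 1
    fi

    # Assumption: the setting already exists on its own line.
    # Stop rather than append it if that is not the case.
    if ! grep -q '^OneAutomationEvents=' "$cfg"; then
        echo "OneAutomationEvents not found in $cfg" >&2
        return 1
    fi

    # Keep a backup, then rewrite the original file from the backup so the
    # original file's ownership and permissions are left unchanged.
    cp -p "$cfg" "$cfg.bak" || return 1
    sed "s/^OneAutomationEvents=.*/OneAutomationEvents=$val/" "$cfg.bak" > "$cfg"
}
```

For example, `set_one_automation_events /path/to/config.PRD 0` would correspond to step 1, where `/path/to/config.PRD` stands for the actual location of the file in your installation.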