Solaris Agent : Filetransfers stuck in "Connecting" status

Article ID: 248312

Products

CA Automic Workload Automation - Automation Engine

Issue/Introduction

Solaris Agent 12.2.10 generates core dumps when it submits a File Transfer, and the FT jobs remain in status "Connecting".
During the test period in which traces were collected, the agent generated 81 cores.

Job trace:

MAIN-THREAD   20211223/082607.848 send_IPC_internal(type=CHANNEL_CLOSE,msg(100a05e50,msgID=15148,addr=100a061f0,len=2024,pos=0,flag=00000000)) -->
MAIN-THREAD   20211223/082607.848 U0009909 TRACE: (internal IPC message)     100a061f0 02024
                  00000000 4348414E 4E454C5F 434C4F53 45000000  >CHANNEL_CLOSE...<
                  00000010 2343434D 534F434B 30303030 30383835  >#CCMSOCK00000885<
                  00000020 00000000 00000000 00000007 00000000  >................<
                  00000030 0000004C 00000375 00000000 00000000  >...L...u........<
                  00000040 2A495043 28414745 4E542900 00000000  >*IPC(AGENT).....<
                  00000050= 00000000 00000000 00000000 00000000  >................<
                  000001A0 00000000 61C4248A 00000000 00000000  >....a.$.........<
                  000001B0= 00000000 00000000 00000000 00000000  >................<
                  000005C0 00000000 00000000 00040000 00004D58  >..............MX<
                  000005D0 00000000 00000000 00000000 00000000  >................<
                  000005E0 00000000 00000000 00000001 01D35FF0  >.............._.<
                  000005F0= 00000000 00000000 00000000 00000000  >................<
                  000007E0 00000000 00000000           >........<
MAIN-THREAD   20211223/082607.849 send_IPC_internal <-- (OK)
MAIN-THREAD   20211223/082607.849 ccm_channel_destroy:   closing socket = 76
MAIN-THREAD   20211223/082607.849 ccm_channel_destroy:   destroy queue lock = 10199b2f8
MAIN-THREAD   20211223/082607.849 ccm_channel_destroy <-- (destroyed)
MAIN-THREAD   20211223/082607.849 ccm_close(ccm=1007a1210) --

Examining a core file with the Solaris modular debugger (mdb) returns the following or similar output:

### Solaris modular debugger
[email protected]:/var/cores/gf0zsxas169t,# file core_gf0zsxas169t_ucxju64_72836_72836_1640244490_20141
core_gf0zsxas169t_ucxju64_72836_72836_1640244490_20141: ELF 64-bit MSB core file SPARCV9 Version 1, from 'ucxju64'
[email protected]:/var/cores/gf0zsxas169t,# mdb core_gf0zsxas169t_ucxju64_72836_72836_1640244490_20141
Loading modules: [ libc.so.1 ld.so.1 ]
ucxju64:core> ::state
mdb: invalid command 'state': unknown dcmd name
ucxju64:core> ::status
debugging core file of ucxju64 (64-bit) from gf0zsxas169t
file: /opt/zones/gf0zsxas169t/root/opt/automic/Agents12.2.10/bin/ucxju64
initial argv: /opt/automic/ServiceManager12.2.10/bin/../../agent/bin/ucxju64 -i/opt/automic/S
threading model: raw lwps
status: process terminated by SIGBUS (Bus Error), addr=ffffffffffffffff
ucxju64:core> ::quit
[email protected]:/var/cores/gf0zsxas169t,#
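
With dozens of cores to review, it can help to pull the termination signal out of each saved `::status` output and check that they all show the same failure. The following is a minimal sketch only: it assumes the mdb output has been captured as text and that the `status:` line has the same shape as in the transcript above. The function name `parse_termination` is hypothetical.

```python
import re

# Shape taken from the transcript above:
#   status: process terminated by SIGBUS (Bus Error), addr=ffffffffffffffff
STATUS_LINE = re.compile(
    r"status: process terminated by (SIG\w+) \((.+?)\), addr=([0-9a-fA-F]+)"
)

def parse_termination(status_output):
    """Return (signal, description, address) from captured ::status text, or None."""
    match = STATUS_LINE.search(status_output)
    if match is None:
        return None
    return match.groups()
```

Running this over the saved output of every core shows quickly whether all of them terminated with SIGBUS at the same address.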

Message sequence from a specific RunID

U00063085 FT '1142116769': File Transfer with partner 'GF0ZSXDB084T' started - sending.
U02003069 Connection 'GF0ZSXDB084T,(s=77,ID=860)' renamed to '*FTX(GF0ZSXDB084T,1142116769)'.
U00063094 FT '1142116769': Agent process 'FTX(1142116769)' with PID='27518' has been initiated.
U00063094 FT '1142116769': Agent process 'FTX(1142116769)' with PID='27518' has been initiated.
U00063095 FT '1142116769': Agent process 'FTX(1142116769)' with PID='27518' is up and running.
U00063095 FT '1142116769': Agent process 'FTX(1142116769)' with PID='27518' is up and running.
U00063087 FT '1142116769': Selection started with filter '/XFERT/home/xfer/CommonReportEngine/DATA/transfer/EHW/21RPTCE895QUENU20211214.XML' ...
U00063016 FT '1142116769': The file '/XFERT/home/xfer/CommonReportEngine/DATA/transfer/EHW/21RPTCE895QUENU20211214.XML' does not exist.
U00063089 FT '1142116769': Files selected: '1'.
U00063018 FT '1142116769': Cannot open file '/XFERT/home/xfer/CommonReportEngine/DATA/transfer/EHW/21RPTCE895QUENU20211214.XML'. Error: 'errno 2, No such file or directory'.
U00011409 FT '1142116769': File Transfer ended abnormally.
U02000007 Connection to Agent '*FTX(GF0ZSXDB084T,1142116769)(s=77,ID=860)' terminated.
U02002040 Disconnected from '*IPC(FTX,1142116769)' (socket handle = 's=79,ID=861').
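
To find other affected RunIDs in an agent log, the messages can be grouped by RunID and filtered on U00011409 ("File Transfer ended abnormally"). This is a rough sketch, not an Automic tool: it assumes the log lines contain the `U<number> FT '<RunID>': <text>` pattern shown in the excerpt above, and the function name `abnormal_transfers` is hypothetical.

```python
import re
from collections import defaultdict

# Pattern taken from the excerpt above, e.g.
#   U00011409 FT '1142116769': File Transfer ended abnormally.
# search() is used so that a leading timestamp, if present, is ignored.
FT_LINE = re.compile(r"(U\d{8}) FT '(\d+)': (.*)$")

def abnormal_transfers(lines):
    """Return {run_id: [(message_number, text), ...]} for RunIDs that logged U00011409."""
    by_run = defaultdict(list)
    for line in lines:
        match = FT_LINE.search(line.strip())
        if match:
            msg_no, run_id, text = match.groups()
            by_run[run_id].append((msg_no, text))
    return {
        run_id: msgs
        for run_id, msgs in by_run.items()
        if any(msg_no == "U00011409" for msg_no, _ in msgs)
    }
```

Lines that carry no `FT '<RunID>'` prefix, such as the U02003069 connection messages, are skipped by this sketch.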

Environment

Component : Solaris System Agent -- Versions 12.3.9, 21.0.3, and earlier service packs

Cause

This is a bug in the Solaris Agent: the agent occasionally generated core dumps during file transfers.

Resolution

This bug is fixed in the following releases:

Hotfix: 12.3.9 HF1 of the Solaris Agent - available.

Service Pack : 21.0.4 - pending.

Additional Information

A workaround is possible: activate trace mode on the affected agent. This keeps the agent stable, but the large volume of trace data it produces risks filling the file system.

For example, the agent process ran stably with these trace flags set:

- tcp/ip=4
- ft_debug=3
- memory=1
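
Because the trace workaround can fill the file system, a periodic free-space check on the trace volume is a sensible safeguard while it is active. The sketch below is illustrative only: the function name `trace_volume_ok`, the 10% threshold, and the directory used are assumptions, not Automic settings.

```python
import shutil

def trace_volume_ok(path, min_free_fraction=0.10):
    """Return True if the file system holding `path` has at least min_free_fraction free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction

if __name__ == "__main__":
    # Hypothetical location: replace with the directory the agent writes its traces to.
    trace_dir = "."
    if not trace_volume_ok(trace_dir):
        print(f"WARNING: less than 10% free on the file system holding {trace_dir}")
```

A check like this could be scheduled to run every few minutes for as long as tracing stays enabled.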