WP crashes with Signal 11 after TIMER_DO01 during Events with old Agents after upgrading to v21

Article ID: 267836

Updated On:

Products

CA Automic One Automation
CA Automic Workload Automation - Automation Engine

Issue/Introduction

After upgrading from 12.x to 21.x, as soon as the v21 WP processes start processing Events that run on old Agents (11.x or 12.x), the WP processes crash with Signal 11 every time after TIMER_DO01, EVENT:

20230420/114608.275 -  Signal 11 (SEGV) at 0x2e with code 1
20230420/114608.275 -  U00009907 Memory view 'Parameter List ' (Address='0x7fffa36cf620', Length='1296')
20230420/114608.275 -            00000000  25255352 30310400 00000000 20202020  >%%SR01......    <
20230420/114608.275 -            00000010= 20202020 20202020 20202020 20202020  >                <
20230420/114608.275 -            00000400  20202020 20202020 20202020 00000100  >            ....<
20230420/114608.275 -            00000410  20000000 E8730200 E08E6200 48081F00  > ...?s..??b.H...<
20230420/114608.275 -            00000420  2A534552 56455220 20202020 20202020  >*SERVER         <
20230420/114608.275 -            00000430  20202020 20202020 20202020 20202020  >                <
20230420/114608.275 -            00000440  4A504558 45435F52 00000000 20202020  >JPEXEC_R....    <
20230420/114608.275 -            00000450= 20202020 20202020 20202020 20202020  >                <
20230420/114608.275 -            00000470  20202020 00000000 50524F44 23575030  >    ....PROD#WP0<
20230420/114608.275 -            00000480  32382020 20202020 20202020 20202020  >28              <
20230420/114608.275 -            00000490  20202020 20202020 4D515750 20202020  >        MQWP    <
20230420/114608.275 -            000004A0  2CD6791E 00000000 20202020 20202020  >,Fy.....        <
20230420/114608.275 -            000004B0= 20202020 20202020 20202020 20202020  >                <
20230420/114608.275 -            00000500  20202020 20202020 01000000 00000000  >        ........<
20230420/114608.275 -  U00009907 Memory view 'Input Memory' (Address='0x16f9730', Length='32')
20230420/114608.275 -            00000000  54494D45 525F444F 30310000 00000000  >TIMER_DO01......<
20230420/114608.275 -            00000010  4556454E 54202020 32409565 DE2A0300  >EVENT   2@•e?*..<
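To check whether a WP log shows this crash pattern, the log can be scanned for a Signal 11 line followed by an 'Input Memory' view whose dump contains TIMER_DO01. The following is a minimal sketch in Python, assuming the log lines have the same timestamp prefix and layout as the excerpt above; the function name `find_timer_crashes` is hypothetical and not part of any Automic tooling.

```python
import re

# Matches the timestamp prefix seen in the excerpt, e.g. "20230420/114608.275 - ".
LINE_RE = re.compile(r"^(\d{8}/\d{6}\.\d{3}) -\s+(.*)$")


def find_timer_crashes(lines):
    """Return timestamps of Signal 11 crashes whose Input Memory dump shows TIMER_DO01."""
    hits = []
    crash_ts = None          # timestamp of the most recent Signal 11 line
    in_input_memory = False  # True while reading rows of the 'Input Memory' view
    for raw in lines:
        match = LINE_RE.match(raw.rstrip("\n"))
        if not match:
            continue
        ts, text = match.groups()
        if text.startswith("Signal 11 (SEGV)"):
            crash_ts = ts
            in_input_memory = False
        elif "Memory view" in text:
            # A new memory view starts; only 'Input Memory' is of interest here.
            in_input_memory = "'Input Memory'" in text
        elif in_input_memory and crash_ts and ">TIMER_DO01" in text:
            hits.append(crash_ts)
            crash_ts = None
    return hits
```

Called with the lines of a WP log file, for example `find_timer_crashes(open(path, errors="replace"))`, it returns one timestamp per matching crash. It only matches the layout shown in this article and is not a general log parser.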

In all cases, the stack trace and the function calls are the same:

20230508/134350.491 - Call stack:
20230508/134350.491 - -----------
20230508/134350.491 - fs-event
20230508/134350.491 - check-event
20230508/134350.491 - timer-event
20230508/134350.491 - v-timer-do
20230508/134350.491 - exec-sr
20230508/134350.491 - JPEXEC_R
20230508/134350.491 -  
20230508/134350.491 - Call history:
20230508/134350.491 - -------------
20230508/134350.491 - fehlerbehandlung-lastmsg
20230508/134350.491 - ft-login-objekt-prufung
20230508/134350.491 - ucuhost-to-host
20230508/134350.491 - host-check
20230508/134350.491 - meld-fehler
20230508/134350.491 - USER-KICK-EH
20230508/134350.491 - fs-event
20230508/134350.491 - db-fehler
20230508/134350.491 - check-event
20230508/134350.491 - loggen
20230508/134350.491 - memory-leak-detection
20230508/134350.491 - db-fehler
20230508/134350.491 - db-clst
20230508/134350.491 - db-fehler
20230508/134350.491 - kal-fehler
20230508/134350.491 - validation-period-pruefen
20230508/134350.491 - memory-leak-detection
20230508/134350.491 - db-fehler
20230508/134350.491 - db-clst
20230508/134350.491 - db-fehler
20230508/134350.491 - db-fehler
20230508/134350.491 - read-eh-by-idnr-for-update
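Because the call stack is identical in every occurrence, it can serve as a signature when triaging crash logs. The sketch below, in Python, extracts the entries between "Call stack:" and "Call history:" and compares them with the stack listed above. It assumes the log layout shown in this article; `extract_call_stack` and `matches_known_crash` are hypothetical helper names.

```python
import re

# Timestamp prefix as it appears in the excerpt, e.g. "20230508/134350.491 - ".
PREFIX_RE = re.compile(r"^\d{8}/\d{6}\.\d{3} - ?")

# Call stack reported in this article, in the order it is logged.
KNOWN_STACK = ["fs-event", "check-event", "timer-event",
               "v-timer-do", "exec-sr", "JPEXEC_R"]


def extract_call_stack(lines):
    """Collect the entries listed under 'Call stack:' up to 'Call history:'."""
    stack = []
    in_stack = False
    for raw in lines:
        text = PREFIX_RE.sub("", raw, count=1).strip()
        if text == "Call stack:":
            in_stack = True
            continue
        if not in_stack:
            continue
        if text.startswith("Call history:"):
            break
        if not text or set(text) == {"-"}:
            continue  # skip blank lines and the dashed separator
        stack.append(text)
    return stack


def matches_known_crash(lines):
    """True if the logged call stack equals the one documented in this article."""
    return extract_call_stack(lines) == KNOWN_STACK
```

An exact match is deliberately strict: a crash with a different stack may have a different cause and should not be assumed to be the issue described here.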

Excerpt from a WP trace file with tcpip=2 and db=4 traces:

20230508/134350.421 - INSERT INTO MQMEM (MQMEM_PK, MQMEM_System, MQMEM_Title, MQMEM_Len, MQMEM_Content, MQMEM_Version, MQMEM_MQSet) VALUES (?, ?, ?, ?, ?, ?, ?)
20230508/134350.421 - UCUDB32 INSR RET 0000 HSTMT: 0x00000008768a40 VALUE: 0x00000000000001 ALL:  0.00027 DB:  0.00018 ODBC:  0.00000 UDB:  0.00008
20230508/134350.421 -   STRT UCUHOST        OPC: 0010  ucuhost-name: YYYY vers=2
20230508/134350.421 -   EXIT UCUHOST        RET: 0000000000 TIME: 0000,00002 RETTEXT='' 
20230508/134350.422 - ActionSend(msgsize=327, SOCKET(s=11,name=UC4IN2#WP003,type=04,host=,add=xxx.xxx.xxx.xxx,port=2461,id=0,netarea=UC4IN2,roles=,nxt=0x153f520)) -->
20230508/134350.422 - U00009909 TRACE: (Send to Server UC4IN2#WP003)                                           0x7ffe05e67a20 000327
                                00000000  30303030 30333237 5543343A 676C6F62  >00000327UC4:glob<
                                00000010  616C3030 314E4154 20202020 20202020  >al001NAT        <
                                00000020  20202020 20202020 20202020 20202020  >                <
                                00000030  F71401B2 B7644C4D 51315750 00000000  >÷..²·dLMQ1WP....<
                                00000040  00000000 00000053 69676E61 6C203131  >.......Signal 11<
                                00000050  20285345 47562920 61742028 6E696C29  > (SEGV) at (nil)<
                                00000060  20776974 6820636F 64652031 00000000  > with code 1....<
                                00000070= 00000000 00000000 00000000 00000000  >................<
                                00000140  00000000 000000                      >.......<
20230508/134350.422 - ActionSend <-- (OK)
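The hex dump in the trace above carries the crash message as plain ASCII. As an illustration, the following Python sketch converts dump rows back into bytes and lists the readable strings they contain. It assumes rows in the unprefixed form shown above (offset, hex words, then the `>...<` text column) and ignores the offsets, so ranges elided from the dump are not reconstructed. The function names are hypothetical.

```python
def decode_rows(rows):
    """Concatenate the hex words of each dump row into one bytes object."""
    data = bytearray()
    for row in rows:
        left = row.split(">", 1)[0]   # drop the ">...<" text column
        tokens = left.split()
        if len(tokens) < 2:
            continue                  # no hex words on this row
        # tokens[0] is the offset; the remaining tokens are hex words.
        data += bytes.fromhex("".join(tokens[1:]))
    return bytes(data)


def printable_strings(data, min_len=4):
    """Return NUL-separated ASCII runs of at least min_len bytes."""
    return [
        chunk.decode("ascii")
        for chunk in data.split(b"\x00")
        if len(chunk) >= min_len and chunk.isascii()
    ]
```

Applied to the rows at offsets 00000040 to 00000060 of the trace, this yields the message that the crashing WP sent to the other server process: "Signal 11 (SEGV) at (nil) with code 1".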

As soon as the related Events are stopped, the issue no longer occurs.

Environment

Release : 21.0.x

Component: Automation Engine

Cause

This appears to be caused by Events that were activated months or years before the issue occurred and, in some cases, by Events running on Agents with older versions or service packs.

Resolution

Workaround:

The following actions can be taken to help mitigate this issue:

  • Make sure the Agents executing the Events are upgraded to a current version, such as 12.3.9
  • Stop the events completely
  • Start the events again
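To decide which Agents to upgrade first, the Agent versions can be compared numerically against a chosen minimum. The following Python sketch assumes a hypothetical inventory that maps Agent names to plain dotted version strings; how that inventory is obtained is outside the scope of this example. The default minimum of 12.3.9 is simply the example version named in the workaround above.

```python
def parse_version(version):
    """Turn '12.3.9' into (12, 3, 9) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def agents_below(inventory, minimum="12.3.9"):
    """Return the sorted names of Agents whose version is lower than `minimum`.

    `inventory` is a hypothetical mapping of Agent name -> version string.
    """
    floor = parse_version(minimum)
    return sorted(
        name for name, version in inventory.items()
        if parse_version(version) < floor
    )
```

For example, `agents_below({"AGENT_A": "11.4.6", "AGENT_B": "12.3.10"})` returns `["AGENT_A"]`. Comparing tuples rather than strings matters here because "12.3.10" would otherwise sort before "12.3.9".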

Solution:

Update to a fix version listed below or a newer version if available.

Fix version:
Component(s): Automation Engine
Automation.Engine 21.0.8 - Available

Additional Information

If the issue still occurs after upgrading the Agents executing the Events, please open a case with Technical Support referencing this article and case AE-32438.

Bug ID: AE-32224

Public description: A problem has been fixed where file events on old agents (version < 12.1) caused the server to crash.

Bug ID: AE-32592

Public description: A problem has been fixed where the Automation Engine could crash when starting a file transfer on a source agent, which was not reachable at that moment.