aefuf process fails to start "no available APPL pgm structure"

Products

Gen
Gen - Run Time Distributed

Issue/Introduction

The Gen 8.6 application comprises multiple funnels (aefuf) connecting to a daemon (aefad). There are also proxy applications which continually poll the server. When the system is taken down for maintenance, it sometimes does not restart correctly because one of the funnel processes fails to start. The funnel log file shows the following error:

02:44:46.810882==>myclose: end
02:44:46.811756==>aefuf: no available APPL pgm structure

Environment

Gen Transaction Enabler.

Cause

The problem is caused by a combination of two factors:

The start up script for the application did not leave any time between starting each funnel process.
Continual polling by the proxy application resulted in runtime errors during start up.

Correcting either one of the above would address the problem. Tests performed with no polling application running always resulted in a clean start up, even with no delay between the start of each component.

The daemon (aefad) log showed the following error condition caused by the polling application:

03:11:38.142846==>aefad: USER: user05, Sock = 5, sock05, connected
03:11:38.144770==>rcvfhdr: FUNN: Sock = 5, prd03, started
03:11:38.150811==>aefad: USER: user06, Sock = 6, sock06, connected
03:11:38.151603==>rcvfhdr: FUNN: Sock = 6, prd03, started
03:11:38.157897==>aefad: USER: user07, Sock = 7, sock07, connected
03:11:38.159649==>rcvfhdr: FUNN: Sock = 7, prd03, started
03:11:38.176549==>aefad: USER: user08, Sock = 8, sock08, connected
03:11:38.178995==>rcvfhdr: FUNN: Sock = 8, prd03, started
03:11:38.189447==>rcvfhdr: USER: user1280, Sock = 1280, ossprd03, lterm001, started
03:11:38.192043==>GetGUIpgm: USER: comsv001, Sock = 1280, prd03, commsrvr, started
03:11:38.196587==>aefad: APPL: SLASRV, TRAN: TSLAMON, Sock = 9, connected
03:11:38.198191==>rcvpgm: APPL: SLASRV, TRAN: TSLAMON, Sock = 9, FHDR header data
?0010AAB4? 12 *.*
03:11:38.198279==>abortpgm: RCVERR: APPL: SLASRV, TRAN: TSLAMON, Sock = 9
03:11:38.198671==>xerrGUIpgm: ABORTPGM: $#XERR#$
?000FAABC? 00000048 00000000 00000000 00242358 45525223 *...H.........$#XERR#*
?000FAAD0? 247c0e00 00000000 00000000 00000000 00000000 *$|..................*
?000FAAE4? 00000000 00000000 00000000 00000000 00000000 *....................*
?000FAAF8? 00373333 00000000 00000000 *.733........*
03:11:38.199511==>delpgm : APPL: SLASRV, TRAN: TSLAMON, Sock = 9 close, pid = 26194
03:11:38.200184==>aefad: USER: user10, Sock = 10, sock10, connected
03:11:38.220964==>rcvpgm: USER: user10, Sock = 10, sock10, remote close
03:11:38.221082==>abortpgm: RCVERR: USER: user10, Sock = 10, sock10
03:11:38.221551==>sigexit: got signal = 18
03:11:38.221979==>abendpgm: APPL: Sock=9, pid=26194, APPL END
03:11:38.222242==>myclose: USER: Sock = 10, localhost, user10, USER END
03:11:38.223492==>aefad: USER: user09, Sock = 9, sock09, connected
03:11:38.223678==>rcvpgm: USER: user09, Sock = 9, sock09, remote close
03:11:38.223737==>abortpgm: RCVERR: USER: user09, Sock = 9, sock09
03:11:38.223944==>myclose: USER: Sock = 9, localhost, user09, USER END

Note the time of the corresponding error in the funnel log relative to the daemon log entries above:

03:11:38.193163==>Starting AEFUF, Advantage(tm) Gen release version: 66
03:11:38.222664==>myclose: end
03:11:38.223169==>aefuf: no available APPL pgm structure
03:11:38.223247==>Slots used: 0

Resolution

Introducing a sleep between the start of each funnel process resolved the problem, for example:

aefuf -i 10032 -c 10030
sleep 2
aefuf -i 10033 -c 10030
sleep 2
aefuf -i 10034 -c 10030
sleep 2
aefuf -i 10042 -c 10031

The script already had a delay between the start of the daemon and the first funnel. The additional delays allow the connection between the daemon and each funnel to be correctly initialised before the next connection is attempted.
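
The same staggered start can also be written as a loop so that the delay is defined in one place. The following is only a sketch, not the script used in this case: the function name start_funnels and the variables FUNNEL_CMD and START_DELAY are hypothetical names introduced for illustration, and each argument is assumed to be a pair of the -i and -c values from the example above, separated by a colon.

```shell
#!/bin/sh
# Sketch only: start each funnel in turn, pausing between starts.
# FUNNEL_CMD, START_DELAY and start_funnels are hypothetical names,
# not part of the Gen runtime.
FUNNEL_CMD="${FUNNEL_CMD:-aefuf}"   # command used to start a funnel
START_DELAY="${START_DELAY:-2}"     # seconds to wait between funnel starts

# Each argument is an "i:c" pair holding the values passed to -i and -c.
start_funnels() {
    first=1
    for pair in "$@"; do
        # Sleep before every funnel except the first one.
        if [ "$first" -eq 0 ]; then
            sleep "$START_DELAY"
        fi
        first=0
        i_val="${pair%%:*}"   # text before the colon
        c_val="${pair##*:}"   # text after the colon
        "$FUNNEL_CMD" -i "$i_val" -c "$c_val"
    done
}
```

With these assumptions, calling start_funnels 10032:10030 10033:10030 10034:10030 10042:10031 issues the same four aefuf commands as the example above, with a 2 second pause between each. If 2 seconds proves too short on a busy system, START_DELAY can be raised without editing each line of the script.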