DUAS: Exchanger slowness "Error 600 getting node" or records remain in u_fecd60 or u

Products

CA Automic Dollar Universe

Issue/Introduction

At certain times of the day, errors like the following are written to universe.log:

|ERROR|X|IO |pid=p.t| o_module_exc_cycle_emissi | Error 600 getting node NODE_NAME information 
|ERROR|X|IO |pid=p.t| o_io_cache_data_provider_ | Error getting Node <NODE_NAME>: 600

These errors start to appear after a Node of the same Company has been deleted from the UVMS and/or after the Node of Residence of a given Management Unit has been changed to point to a different node.
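To see which node names are behind these messages, the log can be summarized with standard text tools. The following is only a minimal sketch: the function name list_error600_nodes is hypothetical, the log path is passed as an argument because its location depends on the installation, and the pattern relies solely on the message layout shown above.

```shell
# Count "Error 600 getting node <name> information" messages per node name.
# Usage (the path is an assumption, adapt it to your installation):
#   list_error600_nodes /path/to/universe.log
list_error600_nodes() {
    # Extract the node name from each matching line, then count occurrences.
    sed -n 's/.*Error 600 getting node \([^ ]*\) information.*/\1/p' "$1" |
        sort | uniq -c | sort -rn
}
```

Node names that appear in the output but no longer exist in the Company are likely candidates for the stale records described in the Cause section.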

Another symptom is:

Network exchanges and Launch generation between different nodes take a very long time and may eventually stop Jobs from being launched altogether (Jobs remain in Launch Wait / Event Wait).

Additionally, u_fecl60.dta and u_fecd60.dta can contain records that are never purged, even after a multi-node session with an internode uproc dependency has been launched and has run correctly.

Environment

Component: Dollar Universe

Version: 6.x and 7.x

Cause

Dollar Universe stores the Awaited Job Events in the file u_fmev60.dta of the concerned area.

If the old node is decommissioned or replaced but the Uproc in Event Wait is not cancelled, the entries (XA lines in u_fmev60.dta) remain valid. Every time the Uproc is executed, its Job Event is sent towards the old Node and populates the exchange files (u_fecl60 and u_fecd60); because this node cannot be found, the error message is logged.

If nodes are being upgraded from version 5 to version 6, we advise reinitializing u_fmev50 / u_fecd50 / u_fecl50 before upgrading so that old records are not imported into version 6.

Resolution

The error message appears only once per execution: the exchanger then deletes the records related to non-existent nodes, and only the last execution is stored in the exchanger data files.
To get rid of the messages completely, reinitialize both the Awaited Job Events data file (u_fmev60) and the Exchanger data files (u_fecl60 and u_fecd60).

We advise performing this operation if XA lines dating from a long time ago are still stored in the u_fmev60.dta file.

Caution: all Uprocs in Event Wait that are waiting for a Uproc Execution on the impacted node should be cancelled, as the Job Event will not be transmitted after this operation.
Apply either Method 1 or Method 2. Use Method 1 if the file u_fmev60.dta cannot be reinitialized because many remote nodes are waiting for Uproc Executions.

Method 1: manual purge of the file u_fmev60.dta + reinitialization of exchanger data files

Procedure to manually purge the u_fmev60.dta:
1) Stop the node: unistop

2) Back up the u_fmev60.dta file of the concerned area

3) Edit u_fmev60.dta and delete the lines starting with XA (X for area EXP; otherwise replace X with S, I or A) that contain the string " 201YMM", where YMM corresponds to a date older than the date range you keep in your history.

4) Reinitialize the exchanger data files of the concerned areas (replace X with S, I or A if another area is impacted):
uxrazfic u_fecd60 X
uxrazfic u_fecl60 X

5) Run an offline reorg: unireorg

6) Start the node: unistart
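Step 3 of Method 1 can be done with a filter instead of a manual edit. The sketch below is an illustration only: the function name purge_old_xa is hypothetical, and it assumes nothing about the record layout beyond what step 3 states (lines start with the area letter followed by A and contain a space followed by a year-month string such as 201801). It writes to a separate output file so that the result can be compared with the backup before it replaces the original, with the node stopped.

```shell
# Copy SRC to DST, dropping the lines that start with "<area>A" and contain
# " <YYYYMM>" for any of the year-month values given as extra arguments.
# Usage: purge_old_xa SRC DST AREA YYYYMM [YYYYMM ...]
purge_old_xa() {
    src=$1; dst=$2; area=$3
    shift 3
    awk -v prefix="${area}A" -v months="$*" '
        BEGIN { n = split(months, m, " ") }
        {
            drop = 0
            # Only lines beginning with the area prefix (for example XA) qualify.
            if (index($0, prefix) == 1) {
                for (i = 1; i <= n; i++)
                    if (index($0, " " m[i]) > 0) { drop = 1; break }
            }
            if (!drop) print
        }' "$src" > "$dst"
}

# Example (file names are placeholders):
#   purge_old_xa u_fmev60.dta u_fmev60.dta.new X 201801 201802
#   diff u_fmev60.dta u_fmev60.dta.new    # review what was removed
```

Lines that do not start with the area prefix are always kept, so records of other types are left untouched.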

Method 2: reinitialization of awaited job events + exchanger data files

1) Stop the node: unistop

2) Back up the u_fmev60.dta file of the concerned area

3) Reinitialize the following data files of the concerned areas (replace X with S, I or A if another area is impacted):
uxrazfic u_fmev60 X
uxrazfic u_fecd60 X
uxrazfic u_fecl60 X

4) Run an offline reorg: unireorg

5) Start the node: unistart
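The Method 2 steps can be chained in a small wrapper so that a failing step stops the sequence. This is a sketch under stated assumptions, not a supported script: the function name reinit_method2, the RUN variable and the backup arguments are hypothetical, and the commands are assumed to be available in the PATH of a loaded Dollar Universe environment. By default RUN is "echo", so the commands are only printed.

```shell
# Dry run by default: commands are printed, not executed.
# Set RUN to an empty string to execute them for real.
RUN=${RUN-echo}

# Usage: reinit_method2 AREA FMEV_FILE BACKUP_DIR
#   AREA       area letter: X, S, I or A
#   FMEV_FILE  path to u_fmev60.dta of that area (depends on the installation)
#   BACKUP_DIR directory that receives the backup copy
reinit_method2() {
    area=$1; fmev_file=$2; backup_dir=$3
    case $area in
        X|S|I|A) ;;
        *) echo "unknown area: $area" >&2; return 1 ;;
    esac
    $RUN unistop || return 1                            # 1) stop the node
    $RUN cp -p "$fmev_file" "$backup_dir/" || return 1  # 2) back up u_fmev60.dta
    for f in u_fmev60 u_fecd60 u_fecl60; do             # 3) reinitialize the files
        $RUN uxrazfic "$f" "$area" || return 1
    done
    $RUN unireorg || return 1                           # 4) offline reorg
    $RUN unistart                                       # 5) start the node
}
```

Running it first in dry-run mode shows the exact command sequence for the chosen area before anything is changed.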

Additional Information

Since version 6.10.41, there is no longer any need to purge u_fmev60.dta manually: the cause of the non-purged records has been fixed, and a tool (uxpurevt) has been delivered to purge the file u_fmev60.dta properly.

Please check this article or an example here for the details.