Errors in the Universe log of node1:
|ERROR|X|IO |pid=pid.tid| owls_api_exchanger_create | Error 200 connecting to node <node2> port 10700
Example case:
On node1 there was a uproc with a dependency on a uproc of node2. That uproc has since been removed, and nothing else on node1 references node2.
Node1 still tries to connect to node2, which no longer exists.
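To see which remote nodes the exchanger is failing to reach, the log can be filtered on the message shown above. The following is a minimal sketch, assuming the log lines have exactly the layout of the sample error line; the function name list_unreachable_nodes is hypothetical and the log file location depends on your installation.

```shell
# Reads Universe log lines on stdin and prints each distinct
# "node port" pair reported by the exchanger "Error 200" message.
# Assumption: lines follow the layout of the sample error line;
# angle brackets around the node name are accepted but not required.
list_unreachable_nodes() {
    sed -n 's/.*owls_api_exchanger_create.*Error 200 connecting to node <\{0,1\}\([^> ]*\)>\{0,1\} port \([0-9][0-9]*\).*/\1 \2/p' \
        | sort -u
}
```

It would be called by redirecting the Universe log of node1 into the function, for instance list_unreachable_nodes < universe.log (the file name here is a placeholder).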
Release : 6.x
Component : DOLLAR UNIVERSE
The error messages appear because records for the other node still exist in the exchanger data files u_fecl60, u_fecd60 and u_fmev60.
To get rid of the messages, reinitialize the files u_fecl60 and u_fecd60; the errors should then disappear.
Apply the following procedure to reinitialize the corrupted exchanger data files:
- Shut down Dollar Universe and check that all processes are down:
unistop
- Backup the exchanger data files of the impacted area (u_fecd60.dta, u_fecl60.dta)
- Reinitialize the exchanger data files of the area by executing the following commands from the bin directory (replace X with A, S, or I if the issue appears in a different area):
uxrazfic u_fecd60 X
uxrazfic u_fecl60 X
- Apply an offline reorganisation:
unireorg
- Restart Dollar Universe:
unistart
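The procedure above can be sketched as a small helper that only prints the command sequence for a given area, so it can be reviewed before anything is run. This is an illustration, not a vendor script: the function name print_reinit_plan and its DATA_DIR and BACKUP_DIR arguments are hypothetical, the use of cp for the backup is an assumption, and the check that all processes are down is left as a manual step.

```shell
# Print, without executing anything, the command sequence of the
# reinitialization procedure for one area.
# Usage: print_reinit_plan AREA DATA_DIR BACKUP_DIR
#   AREA       area letter: X, A, S or I
#   DATA_DIR   directory holding the .dta files of that area (site-specific)
#   BACKUP_DIR directory that receives the backup copies (hypothetical)
print_reinit_plan() {
    area=$1
    data_dir=$2
    backup_dir=$3
    case $area in
        X|A|S|I) ;;
        *) echo "unknown area: $area (expected X, A, S or I)" >&2; return 1 ;;
    esac
    cat <<EOF
unistop
# check manually that all Dollar Universe processes are down
cp $data_dir/u_fecd60.dta $data_dir/u_fecl60.dta $backup_dir/
uxrazfic u_fecd60 $area
uxrazfic u_fecl60 $area
unireorg
unistart
EOF
}
```

For example, print_reinit_plan X followed by the data directory of the area and a backup directory prints the seven steps in order; an unknown area letter is rejected with a non-zero return code.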
Additionally, it is always a good idea to check that the file u_fmev60.dta does not hold a large number of old Awaited event records, as these may also impact exchanger and launcher performance.
Since version 6.10.71 you can do that with the command uxpurevt.
For example, to check whether you have event waits older than November 1st, 2021, run:
uxpurevt exp simulate 20211101
Example of the expected output when the situation is OK:
$UNI_DIR_EXEC/uxpurevt exp simulate 20211101
*** Simulation flag found. No records will be deleted. ***
Processing /apps/du/600/TST600/data/exp/u_fmev60 ...
******************************************************
Simulation Output
******************************************************
Event type R (Realised) read = 1
Event type R (Realised) deleted = 0
Event type A (Awaited) read = 0
Event type A (Awaited) deleted = 0
Total events R & A read = 1
Total events R & A deleted = 0
duration: 0 s
If the value is other than 0, you have records whose latest modification date is earlier than the specified date, so they are candidates for purging.
In this case, you should launch the command without the simulate flag to purge these records.
You should launch this command while the node is stopped, followed by a reorganization.
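If the simulation is run regularly, the Awaited counters can be pulled out of its output automatically. The following is a minimal sketch, assuming the output keeps the exact layout of the sample above; the function name awaited_counters is hypothetical, and which counter to act on should be confirmed against your own simulation output.

```shell
# Reads uxpurevt simulation output on stdin and prints the two
# Awaited counters as "<read> <deleted>".
# Assumption: the counter lines match the sample output layout.
awaited_counters() {
    awk -F'= *' '
        /^Event type A \(Awaited\) read/    { r = $2 + 0 }
        /^Event type A \(Awaited\) deleted/ { d = $2 + 0 }
        END { print r + 0, d + 0 }
    '
}
```

It can be fed directly from the simulation, for instance $UNI_DIR_EXEC/uxpurevt exp simulate 20211101 | awaited_counters. With the sample output shown earlier it prints both counters as 0.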
Example (replace the date and the area being affected according to your case):