Description: During a prolonged network outage, universe.log is flooded with exchanger error messages. After the network recovers, cross-node sessions start to run. However, the jobs that should run on remote nodes only start 20 minutes later.
Environment
OS: All
Cause
Root Cause: The delay occurs because the exchanger data files are filled with pending requests. During the network outage, DUAS keeps working and tries to send requests to the remote nodes. Because these requests cannot be delivered, more and more of them accumulate in the exchanger data files.
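As a rough illustration of why the delay can persist for a while after the network recovers, the sketch below models the accumulated requests as a simple queue. It is not DUAS code. The function name, the rates, and the assumption that pending requests are drained at a fixed rate while new requests keep arriving are all hypothetical, chosen only to show the arithmetic.

```python
def backlog_drain_minutes(outage_minutes, arrival_per_min, drain_per_min):
    """Estimate how long a backlog takes to clear after an outage.

    Hypothetical model, not actual DUAS behavior:
    - requests arrive at a constant rate during and after the outage
    - nothing is delivered while the network is down
    - after recovery, requests are delivered at a constant, higher rate
    """
    if drain_per_min <= arrival_per_min:
        raise ValueError("drain rate must exceed arrival rate for the backlog to clear")
    # Requests that piled up while the network was down.
    backlog = outage_minutes * arrival_per_min
    # After recovery the backlog shrinks by (drain - arrival) each minute.
    return backlog / (drain_per_min - arrival_per_min)


# Made-up figures: 60-minute outage, 5 requests/min generated, 20 requests/min delivered.
print(backlog_drain_minutes(60, 5, 20))  # 20.0
```

With these made-up figures, a 60-minute outage leaves 300 pending requests, which take 20 minutes to clear. Actual values depend on the environment and on how many requests were generated during the outage.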
Resolution
This kind of delay should dissipate once the network is stable.