There are two prerequisites before you can identify potential bottlenecks in the NES_DISTRIBUTION phase.
- You must identify the MD5 for the artifact file. If you do not have an easier way to find it, the following KB article may help: How To: Identify How Long The Artifact Retrieval Process Took
- Note: Artifact repositories usually expose this information as a property of the artifact.
- You must have the logs folder from all of your execution servers.
- If your Management Server, Execution Servers, and/or Agents run on Linux, consider using the Nolio RA Collect Logs Scripts available on the Nolio Release Automation Community.
Once you have the MD5 and the logs from your Execution Servers, begin by searching the logs for:
GETTER_<MD5_of_artifact_file> got chunk,
./nes_serverA/nimi.log.1: 2020-08-12 07:38:14,441 [FileTransferWorker-2848] DEBUG (com.nolio.nimi.filetransfer.impl.AbsFileTransferWorker:322) - GETTER_C70EC4482650B95173A5EC9479827203 got chunk, [10,383,360] out of [310,248,458] - [3 %].
The messages matched by the search string above show the chunks of data received by the NES. From the timestamps on these messages you can determine how long it took the NES to receive all of the chunks.
If you identify a server that is experiencing a slow transfer, you can confirm which remote server was supplying the file by searching for:
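The timing of the chunk messages can be worked out by hand, but a small script may help when there are many of them. The following is a minimal sketch, not a supported tool: the regular expression is modeled only on the sample "got chunk" line shown above, the helper names parse_chunk_line and summarize_transfer are hypothetical, and the two bracketed numbers are assumed to be the amount received so far and the total size.

```python
import re
from datetime import datetime

# Pattern modeled on the sample "got chunk" line in this article. Other
# Release Automation versions may format the message differently.
CHUNK_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*?"
    r"GETTER_(?P<md5>[0-9A-Fa-f]+) got chunk, "
    r"\[(?P<received>[\d,]+)\] out of \[(?P<total>[\d,]+)\]"
)


def parse_chunk_line(line):
    """Return (timestamp, md5, received, total), or None if no match."""
    match = CHUNK_RE.search(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
    # The counters use thousands separators, e.g. "10,383,360".
    received = int(match.group("received").replace(",", ""))
    total = int(match.group("total").replace(",", ""))
    return timestamp, match.group("md5").upper(), received, total


def summarize_transfer(lines, md5):
    """Summarize the chunk messages for one artifact MD5 (hypothetical helper)."""
    chunks = []
    for line in lines:
        parsed = parse_chunk_line(line)
        if parsed is not None and parsed[1] == md5.upper():
            chunks.append(parsed)
    if not chunks:
        return None
    # Sort by timestamp so rotated log files can be read in any order.
    chunks.sort(key=lambda chunk: chunk[0])
    first, last = chunks[0], chunks[-1]
    return {
        "first_chunk": first[0],
        "last_chunk": last[0],
        "elapsed_seconds": (last[0] - first[0]).total_seconds(),
        "received": last[2],
        "total": last[3],
    }
```

To use it, pass the lines of one NES's nimi logs together with the artifact MD5. A large elapsed_seconds value compared with other Execution Servers points to the server worth investigating. Keep in mind that the same artifact may be distributed more than once, so restrict the input to the timeframe of the deployment in question.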
GETTER_<MD5_of_artifact_file> got the route
./nimi.log.1: 2020-08-12 12:35:33,953 [DiscoveryWorker-18210] DEBUG (com.nolio.nimi.filetransfer.impl.AbsFileTransferWorker:490) - GETTER_C70EC4482650B95173A5EC9479827203 got the route. The real source is not directly accessible, will use [nid:es_ServerB] instead.
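If many servers are involved, the supplying node can be pulled out of the "got the route" messages with a short script. This is a minimal sketch under the assumption that the message always follows the wording of the sample line above ("will use [...] instead"); the helper name find_route is hypothetical, and messages with a different wording are returned with no node rather than guessed at.

```python
import re

# Pattern modeled on the sample "got the route" line in this article.
ROUTE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*?"
    r"GETTER_(?P<md5>[0-9A-Fa-f]+) got the route\.(?P<rest>.*)"
)
# Assumed wording for an indirect route, taken from the same sample line.
NODE_RE = re.compile(r"will use \[(?P<node>[^\]]+)\] instead")


def find_route(line):
    """Return timestamp, MD5 and supplying node for a route message, or None."""
    match = ROUTE_RE.search(line)
    if match is None:
        return None
    node = NODE_RE.search(match.group("rest"))
    return {
        "timestamp": match.group("ts"),
        "md5": match.group("md5").upper(),
        # None when the message does not name a substitute node.
        "via": node.group("node") if node else None,
    }
```

Applied to the sample line above, the "via" field holds nid:es_ServerB, which is the node that supplied the file to the server that wrote the log.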
Once you have identified a server (ServerA) with a slower than expected transfer, the remote server (ServerB) that supplied the file, and the timeframe of the transfer, you have the details you need to work with your network team. They can use this information to look for network errors, congestion, bandwidth limits, or other problems between the two servers that might explain the slow transfer.