Configure splitting of big files and versioning of big files
1) Harvest seems to split files into chunks at check-in when the file size is larger than 500 megabytes.
Can this size be configured?
2) Harvest creates a new version of a big file, even when the file has not changed. This is a waste of space in the database.
Will this be changed?
Yes, this can be configured using the -maxblobsize parameter in HServer.arg.
For example, with the chunk size set to 300 MB:
If you have a large file of about 800 MB, it will be split into 3 chunks:
Size of chunk 1 = 300 MB
Size of chunk 2 = 300 MB
Size of chunk 3 = 200 MB
Total size will be 800 MB
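
The arithmetic above can be sketched as follows. This is only an illustration of the splitting rule as described in this answer, not Harvest's actual implementation; the function name split_into_chunks is hypothetical, and sizes are treated as whole megabytes for simplicity.

```python
def split_into_chunks(file_size_mb, chunk_size_mb):
    """Return the list of chunk sizes (in MB) for a file of the given size.

    Hypothetical helper for illustration only: full chunks of
    chunk_size_mb, followed by one smaller chunk for any remainder.
    """
    if file_size_mb < 0 or chunk_size_mb <= 0:
        raise ValueError("file size must be >= 0 and chunk size must be > 0")
    full_chunks, remainder = divmod(file_size_mb, chunk_size_mb)
    sizes = [chunk_size_mb] * full_chunks
    if remainder:
        sizes.append(remainder)
    return sizes


# An 800 MB file with a 300 MB chunk size, as in the example above.
chunks = split_into_chunks(800, 300)
print(chunks)       # [300, 300, 200]
print(sum(chunks))  # 800
```

The last chunk simply carries whatever is left after the full-size chunks, so the chunk sizes always add up to the original file size.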
The -maxblobsize parameter works for sizes less than 500 MB.
500 MB is the chunk size that was chosen in the design for large files.
Even though a large file is split into chunks, this does not affect any functionality in any manner.
Splitting was introduced for better response times, easier data handling, and to keep the user informed of the progress.
If 2 GB were processed in one go, the wait time would be very long for the end user, and database and application responsiveness and performance would suffer.
Splitting into chunks makes all of the activities involved easier.
In our QA cycle, we have tested large files of up to 12 GB in size.
There is no way to check in files of 500 MB to 2 GB in size without splitting them.
[a] This is by design. There is no delta comparison for large files, which are binary data most of the time.
[b] Large files are generally stored as artifacts, such as images, huge digital data files, or builds. As per the design considerations, large files are not expected to be checked out frequently.
[c] There is no provision to change or edit large files manually in Harvest.
The change needs to be made through tools that generate a new large file.
[d] If you need to change a large file, then after check-out the new file needs to be manually placed in the checked-out location and then checked in to the repository to create the next version of the large file.
[e] If there is no change, checking out a large file is a costly transaction and should be avoided wherever possible.
[f] It is correct that a new version is created even when there is no change. This is by design: there is no delta calculation for binary files, and the new version keeps track of the check-out activity.
[g] If you have a requirement to check out large files regularly even when there is no change, we recommend moving them to another folder that is not part of the frequent check-out process.