From: James T. <zak...@ma...> - 2016-05-04 21:04:58
> On 4 May 2016, at 21:48, wki...@gm... wrote:
>
> i think this is what i've been missing... i have been dreading testing with and
> without terrasync because of updated files not being seen by the "database"...
> that meant that something had to be redone which generally meant downloading all
> again with whatever tool was used to create the database (svn, terramaster,
> etc)... when running with terrasync, the missing or lack of updating the
> .terrasync_cache files meant, again, downloading all again... with terrasync, it
> means a whole lot of flying if one wants the whole world scenery... that will
> lead to scenery not being updated until the next time it is overflown...
>
> are these SHA1 hashes calculated on the fly as needed?
>
> how will we know if the SHA1 hashes are the same for a file? are the hashes
> stored in a file easily accessed so they can be compared without having to
> download and then throw away?
>
> is it only the data stored in the file that is used in the calculation or do
> other file attributes also come into play (time, date, size)?
>
> are the files going to be stamped with the same datetime stamp as they have on
> the server?
>
> are SHA1 hash clashes going to be a problem?

The SHA1 hashes are what’s in the .dirindex files, both on the server and in the
local tree. We do not synchronise time-stamps with the server copies because
doing so complicates (makes much less efficient) mirroring on the server side.
I am fairly confident hash clashes are not an issue - if the Git folks don’t
consider them a problem, neither do we.

We explicitly do not rely on any custom or particular HTTP headers, to maximise
the chance of ‘just working’ with different HTTP providers and CDN options in
the future. The .dirindex files are plain text and hopefully trivial to follow
for anyone who cares to look at them; they are generated on the server side by
some scripts Torsten wrote.

The code uses basically the same logic as Git to avoid recalculating SHA hashes
for the entire repository: if any of various stat() fields change, most
importantly size or mod-time, we re-compute the hash for that file on disk.
Whenever a file’s hash on disk differs from what the .dirindex file says it
should be, we download the file. That is pretty much a complete description of
the synchronisation model.

Hence it is safe to copy files into the tree using any tool, or to modify them -
this changes the stat() data, which causes the code to recompute the SHA hash,
which triggers a re-download of the server-side copy if the SHA does not match.
Note there is no provision for modified (dirty) files inside the tree; they will
always be overwritten when next checked.
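To make that concrete, the per-file decision is roughly the following. This is
a Python sketch of the model just described, not the actual SimGear C++ code;
the function names and the stat-cache layout are invented for the illustration.

import hashlib
import os

def sha1_of(path):
    # Hash only the file contents - the same value the .dirindex entry carries.
    h = hashlib.sha1()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            h.update(chunk)
    return h.hexdigest()

def local_hash(path, stat_cache):
    # Git-style shortcut: only re-hash when size or mod-time has changed.
    st = os.stat(path)
    cached = stat_cache.get(path)
    if cached and cached['size'] == st.st_size and cached['mtime'] == st.st_mtime:
        return cached['sha1']
    digest = sha1_of(path)
    stat_cache[path] = {'size': st.st_size, 'mtime': st.st_mtime, 'sha1': digest}
    return digest

def needs_download(path, expected_sha1, stat_cache):
    # Fetch when the file is missing or its on-disk hash disagrees with .dirindex.
    if not os.path.exists(path):
        return True
    # A locally modified (dirty) file changes size/mtime, gets re-hashed,
    # fails the comparison and is therefore overwritten on the next sync.
    return local_hash(path, stat_cache) != expected_sha1

The point is that only the content hash decides whether a file is fetched, so
it does not matter which tool put the file there.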
> if possible, it would be nice to keep the SVN server in operation so that folks
> can easily update all the tiles they have in a few minutes without the sim
> having to do it... this will keep terramaster working since it is also a SVN
> tool... it can also update the .terrasync_cache files whereas the raw command
> line SVN cannot…
>
> it would be nice to reclaim 85Gig of space that my .svn directory is taking up
> right now... the loss of the mega-ultra-easy updating is something else entirely...

Mixing ‘real’ SVN (command line) with the built-in SVN-based terrasync has not
worked in years. You might persuade the two systems to use the same location,
but they will perpetually confuse each other. The entire reason to avoid using
real SVN, aside from the deployment issues of libsvn, was to avoid the space
occupied by the .svn directory; if you use the built-in SVN terrasync, you don’t
pay the 85Gig cost.

Making a stand-alone tool which updates an entire repository over the HTTP
protocol is trivial - indeed, I have already written about 80% of such a tool,
and it is committed to SimGear. A small number of additional command-line
options should make it do precisely what you need in terms of updating a series
of directories. If you want help improving the tool, please just ask.

Kind regards,
James
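PS: since the question of a stand-alone updater keeps coming up, here is a very
rough sketch of what its core loop can look like, again in Python rather than
the SimGear C++ code. Treat it purely as an illustration: the URL layout and the
.dirindex line format assumed in parse_dirindex() are simplifications (the
authoritative format is whatever Torsten’s server-side scripts emit), and the
needs_download parameter stands for a content-hash check like the one sketched
earlier in this mail, with its stat cache already bound in.

import os
import urllib.request

def fetch(url):
    with urllib.request.urlopen(url) as r:
        return r.read()

def parse_dirindex(text):
    # Assumed simplified entry format: 'type:name:sha1' per line,
    # where type is 'd' (sub-directory) or 'f' (file).
    entries = []
    for line in text.splitlines():
        parts = line.strip().split(':')
        if len(parts) >= 3 and parts[0] in ('d', 'f'):
            entries.append((parts[0], parts[1], parts[2]))
    return entries

def sync_tree(server, local_root, rel_path, needs_download):
    # Mirror one directory: fetch its .dirindex, download any file whose
    # hash check fails, then recurse into sub-directories.
    # needs_download: callable(path, expected_sha1) -> bool
    index = fetch('%s/%s/.dirindex' % (server, rel_path)).decode('utf-8')
    local_dir = os.path.join(local_root, rel_path)
    os.makedirs(local_dir, exist_ok=True)
    for kind, name, sha1 in parse_dirindex(index):
        if kind == 'd':
            sync_tree(server, local_root, rel_path + '/' + name, needs_download)
        elif needs_download(os.path.join(local_dir, name), sha1):
            data = fetch('%s/%s/%s' % (server, rel_path, name))
            with open(os.path.join(local_dir, name), 'wb') as out:
                out.write(data)

Wrapping something like this in a handful of command-line options (server URL,
target directory, which top-level directories to refresh) is the small
remaining piece I mentioned above.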