From: Les M. <les...@gm...> - 2009-12-07 19:09:07
Harald Amtmann wrote:
> So, for anyone who cares (doesn't seem to be anyone on this list who noticed), I found this post from 2006 stating and analyzing my exact problem:
>
> http://www.topology.org/linux/backuppc.html
> On this site, search for "Design flaw: Avoidable re-transmission of massive amounts of data."

It's documented behavior, so not a surprise.

> 5. Now I make a second incremental back-up of home and home1. Since I have already backed up these two modules, I expect them both to be very quick. But this does not happen. In fact, all of home1 is sent in full over the LAN, which in my case takes about 10 hours. This is a real nuisance. This problem occurs even if I have this in the config.pl file on server1:
> $Conf{IncrFill} = 1;

You have the wrong expectations. Do you have a reasonably current version, and did you read the section on $Conf{IncrLevels} in http://backuppc.sourceforge.net/faq/BackupPC.html? You can also just do full runs instead of incrementals - they take a long time because the target has to read the files to verify the block checksums, but they don't use a lot of bandwidth.

> The cure for this design flaw is very easy indeed, and it would save me several days of saturated LAN bandwidth when I make back-ups. It's very sad that the authors did not design the software correctly. Here is how the software design flaw can be fixed.
>
> 1. When an rsync file-system module module1 is to be transmitted from client1 to server1, first transmit the hash (e.g. MD5) of each file from client1 to server1. This can be done (a) on a file by file basis, (b) for all the files in module1 at the same time, or (c) in bundles of say, a few hundred or thousand hashes at a time.

The rsync binary on the target isn't going to do that.

> 2. The BackupPC server server1 matches the received file hashes with the global hash table of all files on server1, both full back-up files and incremental back-up files.

Aside from not matching rsync, the file hashes have expected collisions that can only be resolved by a full data comparison. And there's no reason to expect all of the files in the pool to have been collected with an rsync transfer method.

-- 
Les Mikesell
les...@gm...
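
For reference, a multi-level incremental setup along the lines of the $Conf{IncrLevels} section mentioned above might look like the fragment below. This is only a sketch, assuming a BackupPC 3.x config.pl; the particular level values are illustrative and are not taken from this thread.

```perl
# config.pl fragment (sketch; the level values are illustrative assumptions)
# A full backup is level 0. An incremental of level N sends the files that
# changed since the most recent backup of a lower level.
$Conf{IncrLevels} = [1, 2, 3, 4, 5, 6];
```

With a single level ([1]), every incremental is taken relative to the last full, so anything that was not in that full gets sent again each time. With increasing levels, the second incremental is taken relative to the first one, and so on, which should avoid re-sending data that an earlier incremental already picked up.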