From: Jeffrey J. K. <bac...@ko...> - 2009-09-02 15:41:51
Les Mikesell wrote at about 10:14:05 -0500 on Wednesday, September 2, 2009:
> Pieter Wuille wrote:
> >
> > To overcome this issue, I wrote a Perl/FUSE filesystem that allows you to
> > "mount" a block device (or real file) as a directory containing files
> > part0001.img, part0002.img, ..., each representing 1 GiB of data of the
> > original device:
> >
> > https://svn.ulyssis.org/repos/sipa/backuppc-fuse/devfiles.pl
> >
> > This directory can be rsynced in the normal way with an "ordinary"
> > directory on an offsite backup. In case a restore is necessary, doing
> > 'ssh remote "cat /backup/part*.img" >/dev/sdXY' (or equivalent) suffices.
> > Although devfiles.pl has (limited) write support, rsyncing *to* the
> > resulting directory is not yet possible - maybe I can try to get this
> > working if people have a need for it. That would allow restoration by
> > simply rsyncing in the opposite direction.
> > Doing the synchronisation in chunks of 1 GiB prevents rsync from
> > searching too far, and splitting into multiple files allows some
> > parallelism (the sender transmits data to the receiver while the
> > receiver already checksums the next file; this is heavily limited by
> > disk I/O, however).
>
> Thanks for posting this. I've considered a very similar approach using
> a VMware .vmx image file, using the options to pre-allocate the space
> and segment it into chunks, as an intermediate that would be directly
> usable by a VMware guest. I'm glad to hear that the rsync logistics
> would be practical.
>
> > In our case, the BackupPC pool is stored on an XFS filesystem on an LVM
> > volume, allowing an xfs_freeze / sync / snapshot / xfs_freeze -u
> > sequence, and using devfiles.pl on the snapshot. Instead of
> > freeze+unfreeze, a BackupPC stop/umount + mount/BackupPC start is also
> > possible. If no system for making snapshots is available, you would
> > need to suspend BackupPC during the whole synchronisation.
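[The chunked view that devfiles.pl exposes through FUSE can be illustrated with a plain splitter/joiner sketch (hypothetical standalone code, not the FUSE version - it copies data rather than mounting it, so it needs scratch space, but the resulting part files rsync the same way, and join_image mirrors the 'cat /backup/part*.img >/dev/sdXY' restore):]

```python
import os

CHUNK = 1 << 30  # 1 GiB, matching devfiles.pl's part size


def split_image(src, dstdir, chunk=CHUNK):
    """Write src out as dstdir/part0001.img, part0002.img, ...
    each holding `chunk` bytes of the original image (last one may be short)."""
    os.makedirs(dstdir, exist_ok=True)
    n = 1
    with open(src, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            with open(os.path.join(dstdir, "part%04d.img" % n), "wb") as out:
                out.write(data)
            n += 1


def join_image(srcdir, dst):
    """Reassemble the parts in name order, like: cat part*.img > /dev/sdXY"""
    with open(dst, "wb") as out:
        for name in sorted(os.listdir(srcdir)):
            with open(os.path.join(srcdir, name), "rb") as f:
                out.write(f.read())
```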
> > In fact, the BackupPC volume is already encrypted on our backup server
> > itself, allowing very cheap encrypted offsite backups (simply not
> > sending the keyfile to the remote side is enough...).
> >
> > The result: the offsite backup of our 400 GiB pool, containing 350 GiB
> > of data, of which about 2 GiB changes daily, is synchronised 5 times a
> > week in 12-15 hours, requiring very little bandwidth. This seems mostly
> > limited by the slow disk I/O on the receiver side (25 MiB/s).
> >
> > Hope you find this interesting/useful,
>
> The one thing that would bother me about this approach is that you would
> have a fairly long window of time while the remote filesystem chunks are
> being updated. While rsync normally creates a copy of an individual
> file and does not delete the original until the copy is complete, a
> mismatched set of filesystem chunks would likely not be usable. Since
> disasters always happen at the worst possible time, I'd want to be sure
> you could recover from losing the primary filesystem (site?) in the
> middle of a remote copy. This might be done by keeping a 2nd copy of
> the files at the remote location, keeping them on an LVM volume with a
> snapshot taken before each update, or perhaps catting them together onto
> a removable device for fast access after the chunks update.

You could also try using rsync with --link-dest, which would create a 2nd
copy that hard-links the chunk files that are unchanged and only copies in
the changed ones. With luck, some of the chunks will be the same, saving
you some storage vs. a full 2nd copy.
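[The effect of --link-dest on a directory of chunk files can be sketched in Python (a hypothetical helper, not rsync itself, and it compares whole files rather than using rsync's checksums): parts identical to the previous copy are hard-linked, so only the changed 1 GiB parts cost extra storage, and a consistent older generation survives a crash mid-transfer:]

```python
import filecmp
import os
import shutil


def snapshot_chunks(newdir, prevdir, outdir):
    """Rough equivalent of: rsync -a --link-dest=prevdir newdir/ outdir/
    Hard-link part files that match the previous generation; copy the rest."""
    os.makedirs(outdir, exist_ok=True)
    for name in sorted(os.listdir(newdir)):
        new = os.path.join(newdir, name)
        prev = os.path.join(prevdir, name)
        out = os.path.join(outdir, name)
        if os.path.exists(prev) and filecmp.cmp(prev, new, shallow=False):
            os.link(prev, out)      # unchanged chunk: no extra storage used
        else:
            shutil.copy2(new, out)  # changed or new chunk: real copy
```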