From: Les M. <les...@gm...> - 2013-03-07 16:04:08
On Thu, Mar 7, 2013 at 8:15 AM, Holger Parplies <wb...@pa...> wrote:
> It's a bit more than one "part in the code". *New pool entries* are created
> by BackupPC_link, which would then be essentially unnecessary. That part is
> simple enough to turn off. But there's really a rather complex strategy to
> link to *existing pool entries*. In fact, without pooling there is not much
> point in using the Perl rsync implementation, for instance (well, maybe the
> attrib files, but then again, maybe we could get rid of them as well, if we
> don't use pooling).

The Perl rsync understands the local compression - which may also be better
handled by the file system. Clearly the snapshots of a growing logfile could
be stored more efficiently with a block-level scheme - but BackupPC's checksum
caching might be a win for non-changing files in terms of processing
efficiency.

> It really sounds like a major redesign of BackupPC if you want to gain all
> the benefits you can. Sort of like halfway to 4.0 :). Basically, you end up
> with just the BackupPC scheduler, rsync (or tar or just about anything you
> can put into a command line) for transport, and ZFS for storage. Personally,
> I'd probably get rid of the attrib files (leaving plain file system
> snapshots easily accessible with all known tools and subject to kernel
> permission checking) and the whole web interface ;-).

If anyone is designing for the future, I think it makes sense to split out all
of the dedup and compression operations, since odds are good that future
filesystems will handle this well and your backup system won't be a special
case. Keeping 'real' filesystem attributes is more of a problem, since the
system hosting the backups may not have the same user base as the targets, the
filesystem may not be capable of holding the same attributes, and even if
those were not a problem it would mean the backup system would have to run as
root to have full access.
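For what it's worth, that split can already be experimented with on ZFS today
by letting the dataset do the compression and dedup. A rough sketch (the pool
and dataset names "tank/backuppc" are made up - adjust for your setup, and
note that ZFS dedup needs a lot of RAM for its dedup table):

```shell
# Create a dataset for the backup store and let the filesystem
# handle compression and block-level deduplication:
zfs create tank/backuppc
zfs set compression=lz4 tank/backuppc
zfs set dedup=on tank/backuppc

# Take a cheap read-only snapshot after each backup run:
zfs snapshot tank/backuppc@$(date +%Y%m%d)

# See how well it's working:
zfs get compressratio tank/backuppc
zpool list -o name,size,alloc,dedup tank
```

With that in place, a plain rsync (or tar) into the dataset plus a snapshot
per run gets you most of what BackupPC's pooling provides, at the block level
instead of the file level.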
> Most others will want to be able to browse backups through the web
> interface, which probably entails keeping attrib files (and having all
> files be owned by the backuppc user, just like the current situation).
> Then again, 'fakeroot' emulates root-type file system semantics through a
> preloaded library.

That's interesting - it would be nice to have a user-level abstraction where a
non-admin web user could access things with approximately the permissions he
would have on the source host.

> Maybe this idea could be adapted for BackupPC to use stock tools for
> transport and get attrib files (and backuppc file ownership) just the same.
>
> ZFS is an interesting topic these days. It's probably best to gain some
> BackupPC community experience with ZFS first, before contemplating changing
> BackupPC to take the most advantage. Even with BackupPC pooling in place,
> significant gains seem possible.

Hmmm, maybe something even more extreme for the future would be to work out a
way to have snapshots of virtual-machine images updated with block-level file
pooling. Then, assuming appropriate network connectivity, you'd have the
option of firing up the VM as an instant replacement instead of
rebuilding/restoring a failed host.

--
Les Mikesell
les...@gm...