From: Holger P. <wb...@pa...> - 2009-07-08 13:05:34
Hi,

Tino Schwarze wrote on 2009-07-08 10:11:43 +0200 [Re: [BackupPC-users] Multiple backuppc server]:

> On Tue, Jul 07, 2009 at 01:50:56PM +0100, Andy Brown wrote:
> > We've started to setup a large multiple server backuppc environment
> > [...] We've got a large 2TB nas at the back of it with gig
> > connectivity. Filesystem is LVM on top of OCFS2 so we have multiple
> > front-end servers with read/write. [...] The actual top/backup
> > location is shared on the main nas store. [...]
> >
> > Can anyone see any pitfalls with this?
>
> You will run into lots of troubles [...]

Actually, I'm not sure you will. I'd expect subtle corruption which you
won't notice until it's too late - things like single files in backups
containing the wrong contents. There might be more obvious problems,
like garbled status.pl contents, which might make BackupPC crash or
display (and use) incorrect values. I wouldn't be surprised if each
BackupPC instance removed the information for hosts it doesn't know
about (from status.pl, not from the host directories). Random things
may or may not happen; you might even be lucky and simply get away with
it. Race conditions are accidents waiting to happen, although they may
turn out not to. I don't know, and it doesn't seem important to me
either - are you doing backups for the odd chance of them being
correct?

> BackupPC is not designed to support multiple instances accessing the
> same storage. There are processes like BackupPC_nightly which need to
> have exclusive access to the pool (e.g. no parallel BackupPC_link
> running).

While you might be able to rule that out with some clever scheduling
(and some luck), there's no sane way to prevent more than one instance
of BackupPC_link from running.
> > The only strange thing I've noticed is with the trashClean process,
> > it seems to be trying to clean things that the other server is
> > creating/working on and failing with "Can't read
> > /var/lib/backuppc/trash/xxxx/home/blah/thing/file: No such file or
> > directory". It doesn't seem a major thing so I'm ignoring it for
> > now!

This is one example of a race condition: two trashClean processes are
simultaneously trying to delete the same tree. Each file can only be
deleted once, so each trashClean will fail for an arbitrary subset of
the files (and log it, since that is unexpected). Obviously, running
multiple trashClean processes on the same file system is a waste of
resources, i.e. it slows things down considerably compared to a single
instance. Multiple BackupPC_nightly instances would be far more
wasteful still.

> What are you trying to accomplish by using multiple BackupPC
> instances?

This is an important question. You are probably assuming that your NAS
can handle more I/O than one server could generate. You are very likely
wrong. Concurrent write access to a file system needs to be
synchronized; your cluster file system might do that, but it does so at
a cost. Your bottleneck is very likely to be disk seeks rather than raw
disk I/O bandwidth. Concurrent independent access to your disk(s) will
make that even worse (which is why concurrent backups under one
BackupPC server are usually limited to a small number), and I'd expect
your cluster FS to make that bottleneck tighter than it already is.

So, aside from not working - because BackupPC is not designed to
support it - you will probably achieve the opposite of what you want.
But that's just my guess. You've got it running - what do your
measurements say?

Regards,
Holger
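P.S. For the curious, the trashClean error is the classic unlink race
in miniature. Here's a hypothetical sketch (plain Python, not BackupPC
code) of two deleters racing over the same tree - every file gets
deleted exactly once, and the losing worker sees the same "No such file
or directory" failure:

```python
import os
import tempfile
import threading

def delete_tree(root, errors):
    """Delete every file under root, recording 'No such file'
    failures instead of crashing - just like trashClean logs them."""
    for dirpath, _, filenames in os.walk(root, topdown=False):
        for name in filenames:
            try:
                os.unlink(os.path.join(dirpath, name))
            except FileNotFoundError:
                # The other deleter got there first.
                errors.append(name)

# Build a small fake trash tree.
trash = tempfile.mkdtemp()
for i in range(200):
    with open(os.path.join(trash, "file%03d" % i), "w") as f:
        f.write("x")

# Two "trashClean" workers race over the same tree.
errors = []
workers = [threading.Thread(target=delete_tree, args=(trash, errors))
           for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# The tree ends up empty either way; the errors list holds the
# files the slower worker lost the race on (possibly none).
remaining = os.listdir(trash)
```

The end result looks fine (the trash is gone), which is why the errors
seem harmless - but the same unsynchronized access pattern applied to
the pool is what corrupts backups.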