Resend to the mailing list...
On Wed, Jan 18, 2012 at 03:08:35AM +0000, Graham Keeling wrote:
> On Mon, Jan 16, 2012 at 10:25:19PM -0500, Ben Clay wrote:
> > Graham-
> Sorry for the delay in replying.
> > I see on the FAQ that storage data deduplication has been implemented in
> > versions > 1.2.2. Is this within a single client's backup, or across
> > clients? I.e., if client A has file X and client B has file X, does burp
> > store just one copy? Or does it only dedup if client A has two copies of X
> > stored in multiple places?
> It stores a copy for each client, then the 'bedup' process runs (if you
> configure it), finds the duplicates (which can be in the same client, or across
> different clients), and hardlinks them to save space. You can
> deduplicate all clients in this way, or you can put them into groups so that
> only similar clients get deduplicated together.
> You cannot hardlink files across partitions, so if you have two clients on
> different partitions, it won't deduplicate them.
> Reading on to the end, I would assume that this means that you won't be
> able to deduplicate across your per-client qcow2 images. But you can inside
> each individual one.
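[As a minimal shell illustration of the hardlinking described above -- this mimics what bedup does, it is not burp's actual code. Two "clients" hold an identical file; hardlinking collapses the two copies into one inode, so the data is stored once:]

```shell
#!/bin/sh
# Illustration only: two client directories with an identical file.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/clientA" "$tmp/clientB"
echo "same payload" > "$tmp/clientA/fileX"
cp "$tmp/clientA/fileX" "$tmp/clientB/fileX"

# Before dedup: two distinct inodes, each with link count 1.
stat -c '%i %h' "$tmp/clientA/fileX" "$tmp/clientB/fileX"

# "Dedup": replace clientB's copy with a hardlink to clientA's copy.
ln -f "$tmp/clientA/fileX" "$tmp/clientB/fileX"

# After dedup: one inode, link count 2 -- the data exists once on disk.
stat -c '%i %h' "$tmp/clientA/fileX" "$tmp/clientB/fileX"
rm -rf "$tmp"
```

[This also shows why the cross-partition restriction exists: `ln` fails with "Invalid cross-device link" if source and destination are on different filesystems, since a hardlink is just another directory entry for the same inode.]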
> > Second and related, in the backup directory structure on the server, I see
> > one top-level directory for each client. Are all files for each client
> > contained within its own top-level subdir (ie, no cross links)?
> All the files for each client are contained within its own top level directory
> until you run the deduplicator, at which point, files in one client top level
> directory may be hardlinked to those in another client top level directory.
> Even then, each top level client directory is still self-contained (due to
> the nature of hardlinks).
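[The "self-contained" point can be seen in the shell: removing one client's whole tree does not damage the other's, because the shared inode survives as long as any link to it remains. Again, an illustration, not burp code:]

```shell
#!/bin/sh
# Illustration: hardlinked client directories stay self-contained.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/clientA" "$tmp/clientB"
echo "shared data" > "$tmp/clientA/fileX"
ln "$tmp/clientA/fileX" "$tmp/clientB/fileX"   # deduplicated copy

rm -rf "$tmp/clientA"        # delete one client entirely
cat "$tmp/clientB/fileX"     # still readable: the inode has one link left
rm -rf "$tmp"
```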
> > Third, is it possible to get the client name passed to the pre and post
> > scripts? I see the examples in burp-server.conf with static args, but I'd
> > like to know if I can get the current client after it authenticates.
> I assume that you are talking about server side scripts.
> (Reading on to the end confirms that you are).
> It appears to be an oversight that I have not given the server side script
> the client name as a 'reserved' argument. I will fix that soon.
> > I am using greyhole (http://www.greyhole.net/) to do JBOD concatenation and
> > would like to use burp on top of it, but it's not playing nice with burp's
> > soft and hard links (since greyhole relies completely on soft links of its
> > own). I was thinking about creating a qcow2 image file for each client, and
> > using server_script_pre and server_script_post to mount / unmount the given
> > client's image when it connects. I.e.:
> > backup_dir
> > |-- clientA -> on-demand mountpoint for /greyhole/share/path/clientA.qcow2
> > |-- clientB -> on-demand mountpoint for /greyhole/share/path/clientB.qcow2
> > |-- clientC -> on-demand mountpoint for /greyhole/share/path/clientC.qcow2
> > The on-demand mount / unmount is necessary for greyhole to perform its own
> > balance/duplication tasks, since it risks corruption if it moves open files.
> > Qcow2 will grow in size as burp adds more files, up to some max which I
> > would set very high (i.e. 16 TB for ext4 with 4 KB blocks).
> > I can trivially script this up (I can even provide a how-to if you're
> > interested) but I need the above questions answered RE: burp server.
> What is the reason for having individual qcow2 images instead of one big
> one? Is it to restrict the space that each client uses?
> Give me a day or so and I can fix passing the client name to the server
> script, and then you should be able to try this out, if you are still
> interested. It is a very simple change, but I need to find some spare time.