From: dan <dan...@gm...> - 2008-12-09 15:53:43
Just playing devil's advocate here, as this conversation has already chosen its direction: would it be reasonable to compare two filesystems based on their journal? I am assuming that essentially all BackupPC installations are on a journaled filesystem or could be upgraded to one. Wouldn't replaying that journal to a tar file, sending the tar file to the remote host, and then replaying the journal there be pretty efficient? I have no experience with this; I just wanted to throw another concept out there.

On Mon, Dec 8, 2008 at 10:49 PM, Jeffrey J. Kosowsky <bac...@ko...> wrote:

> Holger Parplies wrote at about 04:10:17 +0100 on Tuesday, December 9, 2008:
> > Hi,
> >
> > Jeffrey J. Kosowsky wrote on 2008-12-08 09:37:16 -0500 [Re: [BackupPC-users] Advice on creating duplicate backup server]:
> > >
> > > It just hit me that, given the known architecture of the pool and cpool directories, shouldn't it be possible to come up with a scheme that works better than either rsync (which can choke on too many hard links) or 'dd' (which has no notion of incrementals and requires you to resize the filesystem, etc.)?
> >
> > yes, that hit someone on the list several years ago (I don't remember the name, sorry). I implemented the idea he sketched (well, more or less; there's some work left to make it really useful).
> >
> > > My thought is as follows:
> > > 1. First, recurse through the pc directory to create a list of files/paths and the corresponding pool links.
> > >    Note that finding the pool links can be done in one of several ways:
> > >    - Method 1: Create a sorted list of pool files (which should be significantly shorter than the list of all files due to the nature of pooling, and therefore require less memory than rsync) and then look up the links.
> >
> > Wrong. You need one entry per inode that points to an arbitrary path (the first one you copy). Every file(*) is in the pool, meaning a list of all pool files is exactly what you need. A different way to look at it: every file with a link count > 1 is a pooled file, and it's these files that cause rsync & co. problems, not single-link files. (Well, yes, rsync pre-3 needed a complete list of all files.)
>
> OK. I had assumed (wrongly) that rsync needed to keep track of each file that is hard-linked, not just one copy.
> Still, there are some savings in knowing that you can find your one copy in the pool and don't have to look through the pc tree at all.
>
> > (*) Files that are not in the pool:
> >     1.) 0-byte files. They take up no file system blocks, so pooling them saves only inodes. Not pooling them makes things simpler.
> >     2.) log files (they get appended to; that would make pooling somewhat difficult; besides, what chance is there of a pool hit?), and the backups files (including backups.old).
> >     attrib files *are* pooled, contrary to popular belief, and that makes sense, because they are often identical with the same attrib file from the previous backup(s).
>
> Yes. I am aware of this from the routines I wrote to check/fix pool consistency and missing links to the pool.
>
> > The algorithm I implemented is somewhat similar:
> > 1.) Walk pool/, cpool/ and pc/, printing information on the files and directories to a file (which will be quite large; by default I put it on the destination pool FS, because there should be large amounts of space there).
> > 2.) Sort the file with the 'sort' command. The lines in the file are designed such that they will be sorted into a meaningful order:
> >     - directories first, so I can create them and subsequently not worry about whether the place I want to copy/link a file to already exists or not
> >     - files next, sorted by inode number, with the (c)pool file preceding its pc/ links
> >     The consequence is that I get all references to one inode on adjacent lines. The first time, I copy the file. For the repetitions, I link to the first copy. All I need to keep in memory is something like one line from the file list, one "previous inode number", and one "file name of previous inode".
> >     'sort' handles huge files quite nicely, but it seems to create large (amounts of) files under /tmp, possibly under $TMPDIR if you set that (not sure). You need to make sure you've got the space, but if you're copying a multi-GB/TB pool, you probably have. My guess is that the necessary amount of space roughly equals the size of the file I'm sorting.
> > 3.) Walk the sorted file, line by line, creating directories, copying files (with File::Copy::cp, but I plan to change that to PoolWrite, so I can add (part of) one pool to an existing second pool, or to something that communicates over TCP/IP, so I can copy to a different machine), and linking files (with the Perl function link()).
> >     In theory, a pool could also be compressed or uncompressed on the fly (uncompressed for copying to zfs, for instance).
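In outline, the three passes might look something like this. This is a minimal sketch of the idea, not Holger's actual code: the list-file format, the hard-coded paths, and the local destination are all invented, and permissions, ownership, special files, and error handling are glossed over. Whichever name of an inode sorts first becomes the link source for the rest.

    #!/usr/bin/perl
    # Sketch of the sort-and-link copy (invented list format; the real
    # tool would stream to a remote machine instead of a local $dst).
    use strict;
    use warnings;
    use File::Copy qw(cp);
    use File::Find;

    my ($src, $dst) = ('/var/lib/backuppc', '/mnt/newpool');  # assumed paths

    # Pass 1: one line per directory entry. 'd' lines sort before 'f'
    # lines, and zero-padded inode numbers group all names of one inode.
    open my $list, '>', "$dst/filelist" or die "filelist: $!";
    find({ no_chdir => 1, wanted => sub {
        my $rel = substr($File::Find::name, length($src) + 1);
        if    (-d $_) { print {$list} "d $rel\n" }
        elsif (-f _)  { printf {$list} "f %020d %s\n", (lstat $_)[1], $rel }
    }}, map { "$src/$_" } qw(pool cpool pc));
    close $list;

    # Pass 2: let the external sort do the heavy lifting.
    $ENV{LC_ALL} = 'C';    # byte-order sort, independent of locale
    system('sort', '-o', "$dst/filelist.sorted", "$dst/filelist") == 0
        or die "sort failed";

    # Pass 3: copy the first name seen for each inode, hard-link the
    # rest. Only one (inode, name) pair is ever held in memory.
    my ($prev_ino, $first_name) = (-1, '');
    open my $in, '<', "$dst/filelist.sorted" or die "sorted list: $!";
    while (<$in>) {
        chomp;
        if (/^d (.*)/) {
            mkdir "$dst/$1";
        } elsif (/^f (\d+) (.*)/) {
            my ($ino, $name) = ($1, $2);
            if ($ino == $prev_ino) {
                link "$dst/$first_name", "$dst/$name" or warn "link $name: $!";
            } else {
                cp("$src/$name", "$dst/$name") or warn "cp $name: $!";
                ($prev_ino, $first_name) = ($ino, $name);
            }
        }
    }
    close $in;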
> Yes... I was thinking very similarly, though you have clearly fleshed it out and coded it in more detail. I would love to see the code when it's stable, and I can offer feedback or testing if that is helpful.
>
> > Once again, because people seem to be determined to miss the point: it's *not* processing by sorted inode numbers in order to save disk seeks that is the point, it's the fact that the 'link' system call takes two paths
> >
> >     link $source_path, $dest_path;    # to use Perl notation
> >
> > while the 'stat' system call gives you only an inode number. To link a filename to a previously copied inode, you need to know the name you copied it to. A general-purpose tool can't know when it will need the information, so it needs to keep information on all inodes with link count > 1 it has encountered. You can keep a mapping of inode_number -> file_name in memory for a few thousand files, but not for hundreds of millions. By sorting the list by inode number, I can be sure that I'll never need the info for one inode again once I've reached the next inode, so I only have to keep info for one file in memory, regardless of how many I'm copying. The difficult part is now the 'sort', but, again, the 'sort' command is good at handling huge files - probably without limit to the file size.
>
> Exactly.
>
> > So, what's the problem? Well, I used it once, because I needed to copy a pool. It seemed to Work For Me (tm), but I'm unsure how to verify the result, aside from randomly looking at a few files and hoping the 99.99999% I didn't look at are ok too.
>
> Why not use rsync to test it (with the -n flag for a dry run, -v to see the changes it would have made, and --delete to check both directions)?
> And if your pool is too big, just test it on a smaller example. Unlike rsync, the method you and I are discussing should be pretty scalable, so if it works on a reasonably complex 10GB pool, it should also work on a TB one... (hopefully)
>
> > It's far from complete. Its usefulness is limited as long as I can't copy to a remote machine. It only handles the pool/, cpool/ and pc/ directories; the rest needs to be copied by hand. There is debug output for cases I hope not to encounter but which may be present in other people's pools. I think I'm still missing a chown() or two.
>
> What more is there beyond pool/cpool/pc, other than Trash, which you presumably don't need to sync, and .ssh, which is small and should rarely change?
>
> > Let's see if I can find the figures. My pool was 103 GB, 10 million directory entries pointing to 4 million inodes. Copy from local disk to iSCSI target over a shared 100 Mbit network in 10:45 hours. Of this time, 15 minutes were spent examining cpool (pool was empty), 73 minutes examining pc/, 165 seconds sorting the file list (1.1 GB), and 9:14 hours copying/linking.
> > rsync might have worked for this pool, but I didn't test. I would be very curious how this scales to a 3TB pool though ;-).
> >
> > Question (to a tar expert named Craig or otherwise):
> > Is it possible to create a tar stream with this structure (i.e. lots of directories, then file 1/2/3/123whatever with content, then several links in different directories under pc/ to this file, then the next pool file and so on), or does a tar need to be sorted by directories?
> >
> > If it *is* possible, creating a tar stream instead of copying/linking would not be difficult, and then you could run
> >
> >     BackupPC_copyPool ... | ssh hostname tar xpf -
> >
> > (or via netcat, or even store the result on tape). Even merging into an existing pool could be split into a BackupPC_mergePool script which takes a tar stream and does whatever is necessary.
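For what it's worth, the tar format does seem to allow exactly that structure: a hard-link entry (typeflag '1') simply names the path of an earlier entry, and GNU tar creates missing parent directories on extraction. A toy illustration with Perl's Archive::Tar, using invented pool and pc/ paths:

    #!/usr/bin/perl
    # Toy answer to the tar-structure question: one pool file with
    # content, followed by zero-length hard-link entries under pc/.
    use strict;
    use warnings;
    use Archive::Tar;
    use Archive::Tar::Constant qw(FILE HARDLINK);

    my $pool_name = 'cpool/1/2/3/123456789abcdef0123456789abcdef0';
    my $tar = Archive::Tar->new;

    # Store the real content exactly once ...
    $tar->add_data($pool_name, "compressed file contents", { type => FILE });

    # ... then emit one hard-link entry per pc/ reference to it.
    for my $ref ('pc/hostA/0/f%2f/fetc/fmotd', 'pc/hostB/3/f%2f/fetc/fmotd') {
        $tar->add_data($ref, '', { type => HARDLINK, linkname => $pool_name });
    }

    $tar->write('pool-slice.tar');
    # 'tar xpf pool-slice.tar' re-creates the hard links, as long as
    # each link entry appears after the entry it names, as it does here.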
> > > - Method 2: Calculate the md5sum file path of the file to determine where it is in the pool. Where necessary, determine among chain duplicates.
> >
> > That's basically what BackupPC_tarPCCopy does.
>
> So why not use this? Or is it too slow, since you need to create the path md5sum?
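The path part is cheap: a digest maps to its pool location by using its first three hex digits as directory levels, which is what MD52Path in BackupPC's Lib.pm does. A one-function illustration:

    #!/usr/bin/perl
    # The digest-to-path mapping of the 3.x pool layout: the first three
    # hex digits of the digest become directory levels (re-implemented
    # here for illustration; the real code is MD52Path in Lib.pm).
    use strict;
    use warnings;

    sub digest2path {
        my ($digest, $pool_dir) = @_;      # $digest: 32 hex characters
        my ($d1, $d2, $d3) = split //, $digest;
        return "$pool_dir/$d1/$d2/$d3/$digest";
    }

    print digest2path('123456789abcdef0123456789abcdef0', 'cpool'), "\n";
    # prints: cpool/1/2/3/123456789abcdef0123456789abcdef0

The slow part is producing the digest in the first place: it is computed over the uncompressed contents (the file's length plus the leading portion of its data; see File2MD5 in Lib.pm), so each compressed pool file has to be uncompressed before it can be hashed.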
> > > - Method 3: Not possible yet, but would be possible if the md5sum file paths were appended to compressed backups. This would add very little to the storage, but it would allow you to very easily determine the right link. If so, then you could just read the link path from the file.
> >
> > I believe this would speed up BackupPC_tarPCCopy by many orders of magnitude.
> >
> > > 2. Then rsync *just* the pool -- this should be no problem since, by definition, there are no hard links within the pool itself.
> > >
> > > 3. Finally, run through the list generated in #1 to create the new pc directory by creating the necessary links (and for files with no hard links, just copy/rsync them).
> >
> > See BackupPC_tarPCCopy.
> >
> > > The above could also easily be adapted to allow for "incremental" syncing. Specifically, in #1, you would use rsync to just generate a list of *changed* files in the pc directory. In #2, you would continue to use rsync to just sync *changed* pool entries. In #3, you would only act on the shortened incremental sync list generated in #1.
> >
> > While you can limit your pc/ directory traversal to only a subset of all backups of a host (or all hosts, if you give a start date, for example), I don't quite see how syncing the pool should work. Remember that pool files with hash collisions may be renumbered. Is this supposed to be limited to re-creating an identical pool? Even then, renaming a pool file does not affect the pc/ links to it. Overwriting it with different content does. You would need to re-establish the correct links for existing backups too, or figure out how the source pool was changed and replicate the changes to the destination pool (rm foo_2 foo_3; mv foo_4 foo_2). This can be done, but not with rsync, as far as I can tell.
>
> Ahhhh... I see your point. Rsync won't realize that links have been renumbered, but will instead treat them as new files.
>
> Maybe do the following:
> - Use rsync with the dry-run flag to identify pool files that have changed (based on size, name, time), then sort the list by md5sum path.
> Then work through the list as follows:
> - If it is a new md5sum path, then just copy it over to the backup pool.
> - If it is an existing md5sum path, then we need to deal with the following subcases:
>   - a new hash collision
>   - renumbering
>   - a new/changed rsync checksum or compression flag (but the same file content)
>   - a deleted pool entry
> To separate the cases, I would calculate the full md5sum of the content. Then renumber, then copy over new additions and delete deletions. Then check for changes to compression flags & rsync checksums.
>
> The key is that this can all be decoupled from the pc tree, since the link itself is unchanged for any entry in the pc tree that stays.
>
> After this incremental sync of the pool, you then proceed to incrementally back up the pc tree. In general, this is nicely structured, since a backup is either added in total or deleted in total (though this gets more complicated, and would require a file-by-file examination of changes (or rsync), if you had used a routine like my BackupPC_deleteFile to modify or delete individual files from past backups).
> - If a backup has been deleted, then simply 'rm -rf' it (along with its logs) from the pc tree.
> - If a backup has been added, then use the methods described above to link the new files into the already-synchronized pool.
>
> Note that a potentially better idea would be to restructure the pool naming convention to get rid of the need to renumber chains. Instead of giving hash collisions an ordinal subscript, just give each pool entry a name composed of the concatenation of the path md5sum plus another random 32-digit number. Then you would never have to renumber, since there would be essentially a zero percent chance of collision. In that case you would be able to do a simple rsync on the pool.
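A sketch of that naming idea (hypothetical; this is not how BackupPC names pool files today):

    # Hypothetical collision-free pool name: the existing partial-content
    # hash plus a random suffix, so chains never need renumbering and
    # names stay stable under rsync.
    use strict;
    use warnings;

    sub new_pool_name {
        my ($md5_path_digest) = @_;        # the existing 32-hex-digit hash
        # 32 random hex digits; rand() is fine for illustration, though a
        # real implementation would want a stronger source of randomness.
        my $suffix = join '', map { sprintf '%04x', int rand 0x10000 } 1 .. 8;
        return $md5_path_digest . '_' . $suffix;
    }

Since a name, once assigned, would never change, two files with colliding partial hashes simply coexist under different suffixes, and a plain rsync of the pool never sees a rename.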
> > > The more I think about it, the more I LIKE the idea of appending the md5sum file paths to compressed pool files (Method #3).
> >
> > Yes, but the question is how often this information is needed. We're going to a lot of trouble to *eliminate* redundancy. Adding redundancy for a case every 100th user is going to encounter once in his life may not be warranted. Then again, it's a fixed amount per file and probably not enough to worry about ...
>
> It could be made optional whether or not to append this additional information (similar to how rsync checksums are optional).
>
> More generally, since the latest rsync protocol uses md5sums anyway (instead of md4), why not just bite the bullet and start using the full md5sum as the pool file name, since we will need to calculate it anyway for protocol-30 rsync? This would get rid of hash collisions (for all intents and purposes) and also give us a built-in checksum to use against file corruption (or changes). If the md5sum is then stored similarly to how the rsync file digests are now stored, we won't need to add any storage, and you could then use my Method #3.
>
> Hash collisions are really a PITA, since they introduce a bit of awkwardness. Getting rid of them would simplify the BackupPC_link algorithm and would eliminate the need for pool renumbering.
>
> Along these lines, I really wish that BackupPC didn't wait for the second backup of a file to store the rsync digest information, since it would be nice to have the checksums always there in case you ever need to verify file integrity.
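For illustration, if a full-content md5sum were available for every pool file, a verify pass would reduce to something like this. pool_file_md5() is a made-up helper name; BackupPC::FileZIO is the module BackupPC itself uses to read compressed pool files:

    #!/usr/bin/perl
    # Hypothetical integrity check: recompute the full-content md5sum of
    # a (possibly compressed) pool file and compare it to a stored digest.
    use strict;
    use warnings;
    use Digest::MD5;
    use BackupPC::FileZIO;

    sub pool_file_md5 {
        my ($path, $compress) = @_;
        my $f = BackupPC::FileZIO->open($path, 0, $compress)
            or die "can't open $path";
        my ($buf, $md5) = ('', Digest::MD5->new);
        while ($f->read(\$buf, 65536) > 0) {
            $md5->add($buf);
        }
        $f->close();
        return $md5->hexdigest;   # compare against the stored digest
    }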