Craig Barratt wrote:
I'm currently perusing the sources to see if there is a way to work
around that.  Has anyone already investigated this?  Am I headed down a

Yes you are.

When rsync transfers the file list it doesn't contain file checksums
(unless you specify --checksum, which File::RsyncP doesn't support).
Also, the rsync whole-file checksum is different to the BackupPC
pool checksum, so it isn't useful for trying to find files in the

Yeah, I thought about that after I sent the email.  The checksum/hash values don't necessarily use the same algorithm.

It would therefore require a lookup table/cache mapping the rsync whole-file checksum to the pool checksum, which I've seen references to...

But that caching must be only for the life of that backup job?

One could store the rsync checksum in the pool file, but you would still need to generate a quick lookup table.  Or an alternate file hierarchy keyed by rsync checksums (yuck!).  Maybe a Berkeley DB using tied hashes?  But you'd want a way to remove entries for trashed pool items, and you'd need to handle write contention, because I'd assume you'd want a shared table.  Even if all that is possible, is there a way to hook into the RsyncP module and say, "Oh, hey, I have a file that matches that whole-file checksum right here"?
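
To make the shared-table idea a bit more concrete, here is a rough sketch of what I have in mind. It is Python with SQLite rather than Perl with Berkeley DB, purely for illustration, and it is not anything BackupPC or File::RsyncP actually provides. The table name rsync_to_pool and the functions open_table, remember and lookup are all made up. It also assumes the rsync whole-file checksum for a given file is the same from one run to the next, which I have not verified.

```python
import os
import sqlite3


def open_table(db_path):
    """Open (or create) the shared rsync-checksum -> pool-file table.

    Hypothetical schema: keyed on (rsync checksum, file size).
    """
    # timeout makes a writer wait for a lock instead of failing immediately,
    # which is a simple way to cope with several backup jobs writing at once.
    conn = sqlite3.connect(db_path, timeout=30)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rsync_to_pool ("
        " rsync_sum TEXT NOT NULL,"
        " size INTEGER NOT NULL,"
        " pool_path TEXT NOT NULL,"
        " PRIMARY KEY (rsync_sum, size))"
    )
    conn.commit()
    return conn


def remember(conn, rsync_sum, size, pool_path):
    """Record that a pool file has this rsync whole-file checksum."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT OR REPLACE INTO rsync_to_pool VALUES (?, ?, ?)",
            (rsync_sum, size, pool_path),
        )


def lookup(conn, rsync_sum, size):
    """Return the pool path for a checksum, or None if unknown or stale."""
    row = conn.execute(
        "SELECT pool_path FROM rsync_to_pool WHERE rsync_sum = ? AND size = ?",
        (rsync_sum, size),
    ).fetchone()
    if row is None:
        return None
    pool_path = row[0]
    if not os.path.exists(pool_path):
        # The pool file has been removed since it was recorded: drop the entry.
        with conn:
            conn.execute(
                "DELETE FROM rsync_to_pool WHERE rsync_sum = ? AND size = ?",
                (rsync_sum, size),
            )
        return None
    return pool_path
```

Stale entries are cleaned up lazily on lookup, so no separate sweep is needed when pool items are trashed. A hit would still have to be confirmed against the real pool file before being trusted, and none of this answers the question of whether RsyncP can be told about the match.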