Craig Barratt wrote:
Yeah, I thought about that after I sent the email. The checksum/hash
values don't necessarily use the same algorithm.
I'm currently perusing the sources to see if there is a way to work
around that. Has anyone already investigated this? Am I headed down a
Yes you are.
When rsync transfers the file list it doesn't contain file checksums
(unless you specify --checksum, which File::RsyncP doesn't support).
Also, the rsync whole-file checksum is different to the BackupPC
pool checksum, so it isn't useful for trying to find files in the pool.
It would therefore require an rsync whole file checksum => pool
checksum lookup table/cache. Which I've seen references to...
But that caching must be only for the life of that backup job?
One could store the rsync checksum in the pool file, but you still need
to generate a quick lookup table. Or an alternate file hierarchy using
rsync checksums (yuck!). Maybe a Berkeley DB using tied hashes? But
you'd want a way to remove trashed pool items and you'd need to handle
write contention, because I'd assume you'd want a shared table. Even
if all that is possible, is there a way to hook into the RsyncP
module so it can say, "Oh, hey, I have a file that matches that
whole-file checksum right here"?
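
To make the shared-table idea a bit more concrete, here is a rough sketch in Python (not Perl, and not anything that exists in BackupPC or File::RsyncP). It assumes a Unix host, treats both digests as opaque byte strings, and uses the standard dbm module in place of a tied Berkeley DB hash. The ChecksumMap class, its method names, and the lock-file scheme are all made up for illustration. A single flock-guarded lock file serializes access, which is the crudest possible answer to write contention, and prune() is one way to drop entries whose pool file has been trashed.

```python
import dbm
import fcntl
import os
import tempfile
from contextlib import contextmanager


class ChecksumMap:
    """Hypothetical persistent map: rsync whole-file digest -> pool digest."""

    def __init__(self, path):
        self.path = path
        self.lock_path = path + ".lock"

    @contextmanager
    def _locked_db(self):
        # One exclusive lock for readers and writers alike: simple, not fast.
        # The lock is released when the lock file is closed.
        with open(self.lock_path, "w") as lock:
            fcntl.flock(lock, fcntl.LOCK_EX)
            db = dbm.open(self.path, "c")  # create the table if missing
            try:
                yield db
            finally:
                db.close()

    def record(self, rsync_digest, pool_digest):
        """Remember that a file with this rsync digest lives at this pool digest."""
        with self._locked_db() as db:
            db[rsync_digest] = pool_digest

    def lookup(self, rsync_digest):
        """Return the pool digest for an rsync digest, or None if unknown."""
        with self._locked_db() as db:
            return db[rsync_digest] if rsync_digest in db else None

    def prune(self, pool_exists):
        """Drop entries whose pool file is gone; pool_exists is a caller-supplied check."""
        removed = 0
        with self._locked_db() as db:
            for key in list(db.keys()):
                if not pool_exists(db[key]):
                    del db[key]
                    removed += 1
        return removed


if __name__ == "__main__":
    table = ChecksumMap(os.path.join(tempfile.mkdtemp(), "rsync2pool"))
    table.record(b"rsync-digest-1", b"pool-digest-1")  # placeholder digests
    print(table.lookup(b"rsync-digest-1"))
    print(table.prune(lambda pool_digest: False))  # pretend every pool file is gone
```

Because the table persists on disk, it would outlive a single backup job, but the sketch says nothing about the harder question of how the transfer code would consult it.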