File level deduplication
Brought to you by:
aventin
Hi,
It would be nice to have file-level deduplication. Maybe you could compute an MD5 signature of every file the first time it is read. Then, if a file is later renamed or moved to a different folder, it would not need to be backed up again, since its content has not changed.
Another option is to use the file size and date to decide whether an MD5 signature needs to be computed at all.
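To make the idea concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under my own assumptions, not how Areca works internally: the function names (`scan`, `files_to_store`) and the index layout are made up for this example. The index maps each relative path to its size, modification time and MD5; a file is only re-hashed when its size or date differs from the previous run, and a file only needs to be stored when its content was not already present in the previous backup under any name.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scan(root, previous=None):
    """Build {relative_path: (size, mtime_ns, md5)} for all files under root.

    `previous` is the index from the last run (hypothetical format).
    If a path has the same size and date as before, its old MD5 is reused
    instead of reading the file again.
    """
    previous = previous or {}
    index = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            st = os.stat(full)
            old = previous.get(rel)
            if old and old[0] == st.st_size and old[1] == st.st_mtime_ns:
                digest = old[2]          # size/date unchanged: skip hashing
            else:
                digest = md5_of(full)    # new or changed: hash the content
            index[rel] = (st.st_size, st.st_mtime_ns, digest)
    return index


def files_to_store(previous, current):
    """Return paths whose content (size + MD5) was not in the previous index.

    Renamed or moved files have a known (size, MD5) pair, so they are skipped.
    """
    known = {(size, digest) for size, _mtime, digest in previous.values()}
    return sorted(
        path
        for path, (size, _mtime, digest) in current.items()
        if (size, digest) not in known
    )
```

With this, a second run after moving `a/x.txt` to `b/y.txt` would report nothing to store for that file, because its size and MD5 are already known. The backup would still have to record the new path, of course.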
Many thanks
Oliver
What about using a target drive with a deduplicating filesystem such as Opendedup, lessfs or ZFS? That said, Bacula >5 also uses file-based deduplication, and BackupPC does something similar using hard links.
Would it be safe to run fslint over an Areca archive as an out-of-band deduplication solution? That would find duplicates at the file level and hard-link them together, regardless of filesystem. Since the contents of files in archive folders (/backup/data/folder/1234567890/) aren't modified (only deleted), I THINK it should be safe?
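For reference, this is roughly the kind of operation I mean, as a minimal Python sketch. It is not fslint's actual implementation, and `hardlink_duplicates` is a name I made up for the example. It groups files by size, confirms duplicates by MD5 plus a byte-for-byte comparison, and replaces each duplicate with a hard link to the first copy. It assumes everything under the root lives on one filesystem and that nothing writes to the files while it runs.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace duplicate files under root with hard links; return the count."""
    # Group by size first so that only plausible duplicates get hashed.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                by_size[os.path.getsize(full)].append(full)

    linked = 0
    for size, paths in by_size.items():
        if size == 0 or len(paths) < 2:
            continue
        first_seen = {}  # md5 -> first path seen with that content
        for path in sorted(paths):
            keeper = first_seen.setdefault(file_md5(path), path)
            if keeper == path:
                continue
            a, b = os.stat(keeper), os.stat(path)
            if a.st_dev != b.st_dev or a.st_ino == b.st_ino:
                continue  # other filesystem, or already the same inode
            if not filecmp.cmp(keeper, path, shallow=False):
                continue  # same MD5 but different bytes: leave untouched
            # Link under a temporary name, then rename over the duplicate,
            # so the path never disappears if something fails halfway.
            tmp = path + ".dedup-tmp"
            os.link(keeper, tmp)
            os.replace(tmp, path)
            linked += 1
    return linked
```

The catch is that hard-linked files share one inode: if anything ever modified one of them in place, every linked copy would change, and permissions and timestamps are shared as well. Deleting one link leaves the others intact, which is why I suspect it is safe as long as the archive files really are only ever deleted.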