[Dar-discussions] Binary delta in 2.6.0
From: gulikoza <gul...@us...> - 2018-12-25 20:43:44
Hi Denis,
Congratulations on the dar 2.6 release :-)
Thank you for all your hard work all these years!
I really enjoy working with dar, especially because of your attention to
detail, which makes dar a very versatile tool.
Unfortunately I did not have time to test the binary delta feature before
the final 2.6.0 release, but it arrives just in time for my yearly full
backups, so I can create them with signatures now and start testing binary
deltas in production, haha ;-)
Just to let you know, there is an issue compiling dar 2.6 on RHEL 7 due to
gcc 4.8.5:
In file included from i_archive.hpp:35:0,
from archive.cpp:31:
erreurs.hpp:110:15: error: function 'libdar::Egeneric::niveau&
libdar::Egeneric::niveau::operator=(libdar::Egeneric::niveau&&)' defaulted
on its first declaration with an exception-specification that differs from
the implicit declaration 'libdar::Egeneric::niveau&
libdar::Egeneric::niveau::operator=(libdar::Egeneric::niveau&&)'
niveau & operator = (niveau && ref) noexcept = default;
.
(^ and so on, basically all the prototypes with noexcept = default)
I understand that gcc 4.8.5 has very limited (buggy) support for C++11,
but unfortunately I guess this means dar 2.6 will not be available in
EPEL-7.
It should probably also be stated in the documentation that gcc 4.8 is no
longer supported (if this was the intention).
I was able to compile it with devtoolset-7 (gcc 7.3.1) and create an RPM
(along with a statically linked libthreadar). The resulting RPM seems to work
fine for now on vanilla CentOS 7 (without the newer gcc and libs installed).
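For illustration, here is a minimal sketch of one possible workaround for the error above. It is not dar's actual code or fix: `niveau_sketch` and its two string members are hypothetical stand-ins for `libdar::Egeneric::niveau`. The assumption is that gcc 4.8.5 rejects `noexcept = default` because the implicit move assignment it derives from the members is not `noexcept` with that toolchain's standard library. Leaving the exception specification off the defaulted declaration lets the compiler deduce it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for libdar::Egeneric::niveau; the member names
// are assumptions made for this sketch only.
struct niveau_sketch
{
    std::string lieu;
    std::string objet;

    niveau_sketch() = default;
    niveau_sketch(const niveau_sketch & ref) = default;
    niveau_sketch(niveau_sketch && ref) = default;
    niveau_sketch & operator = (const niveau_sketch & ref) = default;

    // No explicit "noexcept" here: the compiler derives the exception
    // specification from the members, so the defaulted declaration cannot
    // disagree with the implicit one.
    niveau_sketch & operator = (niveau_sketch && ref) = default;
};

// Small self-check: move assignment still transfers both members.
void check_move_assignment()
{
    niveau_sketch a;
    a.lieu = "archive.cpp";
    a.objet = "example";

    niveau_sketch b;
    b = std::move(a);

    assert(b.lieu == "archive.cpp");
    assert(b.objet == "example");
}
```

The trade-off is that the operator is no longer guaranteed `noexcept` on every toolchain. On compilers where the members' move assignment is `noexcept`, the defaulted operator still is.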
Looking forward, I see that dar is using RS_DEFAULT_BLOCK_LEN, which is just
2048 bytes. If the signatures are switched to Blake2, the signature will be
36 bytes (256-bit blake2 + 4 bytes crc32) for each 2k block. This might be a
bit too much for larger files. Instead of having (yet another) command line
option for that, dar might try to optimize the block_len based on file size.
I don't see the need for 2k blocks on multi-gigabyte files, where 8k or 16k
blocks would be just fine and the signatures a lot smaller. I have a proof
of concept patch for that already: for instance, with a 2GB file the blake2
catalog size is reduced from 36MB to 9MB with 8k blocks (the default dar
catalog with md4 signatures would be around 18MB for the same file). I can
send the patch to github if needed ;-)
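The arithmetic behind these figures can be sketched as follows. This is not the proof of concept patch mentioned above. The function names, the size thresholds and the 16k cap in `pick_block_len` are all hypothetical, chosen only so that a 2 GiB file lands on 8k blocks as in the example. "2GB" is read as 2 GiB and "MB" as MiB.

```cpp
#include <cassert>
#include <cstdint>

// Number of blocks needed to cover file_size bytes (the last block may be
// partial, hence the rounding up).
std::uint64_t block_count(std::uint64_t file_size, std::uint64_t block_len)
{
    assert(block_len > 0);
    return (file_size + block_len - 1) / block_len;
}

// Total signature size in bytes, given the per-block signature size
// (36 bytes in the message: 32-byte blake2 plus 4-byte weak checksum).
std::uint64_t signature_bytes(std::uint64_t file_size,
                              std::uint64_t block_len,
                              std::uint64_t bytes_per_block)
{
    return block_count(file_size, block_len) * bytes_per_block;
}

// Hypothetical heuristic: start at 2048 bytes (the RS_DEFAULT_BLOCK_LEN
// value quoted in the message) and double the block length each time the
// file size crosses a threshold, up to an assumed cap of 16384 bytes.
// The thresholds (256 MiB, 1 GiB, 4 GiB) are made up for illustration.
std::uint64_t pick_block_len(std::uint64_t file_size)
{
    const std::uint64_t max_len = 16384;
    std::uint64_t len = 2048;
    std::uint64_t limit = 256ULL << 20; // 256 MiB

    while (file_size >= limit && len < max_len)
    {
        len *= 2;
        limit *= 4;
    }
    return len;
}
```

With these definitions, `signature_bytes(2ULL << 30, 2048, 36)` gives 37748736 bytes (36 MiB) and `signature_bytes(2ULL << 30, 8192, 36)` gives 9437184 bytes (9 MiB), matching the figures quoted above.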
And to nitpick a bit, the man page does not mention [Delta] among the [data]
displayed fields. :-)
Thanks again and Happy Holidays!
Regards,
gulikoza