Re: Windows Rsyncrypto
From: Shachar S. <rsy...@sh...> - 2005-07-26 12:15:43
Barry James wrote:

>Shachar
>
>Thanks for the quick and comprehensive response. Will look out for the new
>download.

It took two months from the point of first reporting. I'm afraid I cannot boast "quick" on this one :-(.

>Am I right in thinking that, given the nature of compression (or gzip
>compression at least) that repeating blocks are somewhere between unlikely
>and impossible?

Unlikely - yes. Impossible - not at all. With "rsyncable", gzip only sees about 30KB back into the file. If two repetitions are further apart than that, gzip will not spot them. Needless to say, this also means that the repeated data will likely (eventually) get compressed into the same compressed stream. That is, after all, what rsyncable compression and encryption is all about.

> (and if not, would it not be a useful and simple matter to
>design a 'top-up' compression algorithm that eliminates repeating blocks by
>replacing them with a pointer - as I believe gzip does. This could even be
>recursive?).

No, it would not. Pointing out repetitions is what the Lempel-Ziv algorithm that gzip employs does. For both performance and compression optimization considerations, the size of the back window that gzip looks at is limited. If you design a revised Lempel-Ziv that scans the whole file, you will both have terrible performance AND hurt your compression ratio. The reason for the apparent illogic is that storing the pointer takes space too. The bigger the potential pointer needs to be, the greater the space we need to reserve to store it.

I can think of a compromise, where only blocks that come at the start of a gzip "block" are compared, and only if they are large enough (say, 10KB) will they be referenced. This is, however, somewhat in the far future at the moment. It will, however, eliminate many (but not all) of the attacks currently available on the rsyncrypto model.

>>It works excellently, so long as you let rsyncrypto run it.
>
>Is this because the blocks used for encryption and compression need to be
>aligned to get maximum rsyncability?

Not aligned with one another, no. It's because the blocks have to be aligned with the same blocks prior to the change. Gzip only does that if you tell it "rsyncable".

>>rsyncrypto runs gzip with the "rsyncable" flag. If you compress the
>>files prior to encryption using this flag, you will make the compression
>>ratio slightly lower (for all sane files). If, however, you run gzip
>>without "rsyncable", or compress in any other way, you will totally
>>destroy the rsync friendliness of the system. In such a case, you are
>>much better off just encrypting with gpg.
>
>So am I right in thinking that 'sane' files encrypted using the --rsyncable
>flag will do well - but not as well as those where rsyncrypto controls the
>compression because it is able to align the block boundaries - or is it that
>the file will be double compressed? Or something else?

It's the double compression. There is no difference between running "gzip --rsyncable" and what rsyncrypto does. In fact, you can use the dummy gzip available inside the sources to eliminate the compression altogether, and then you can run gzip --rsyncable yourself, and you will get the exact same behavior.

Shachar

-- 
Shachar Shemesh
Lingnu Open Systems Consulting
http://www.lingnu.com/
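
The point about the limited back window can be checked with a small sketch. It is an illustration only, and rests on some assumptions: it uses Python's zlib module (the same DEFLATE scheme gzip uses, whose back-references reach at most 32 KB), and the block and gap sizes are arbitrary choices, not anything taken from rsyncrypto. A random 8 KB block is repeated once, first with a small gap and then with a gap larger than the window.

```python
import os
import zlib

# An incompressible 8 KB block that appears twice in each test input.
# The sizes used here are arbitrary, chosen only for illustration.
BLOCK = os.urandom(8 * 1024)

def bytes_saved(gap: int) -> int:
    """Bytes saved by zlib on BLOCK + <gap random bytes> + BLOCK."""
    raw = BLOCK + os.urandom(gap) + BLOCK
    return len(raw) - len(zlib.compress(raw, 9))

# Second copy starts 12 KB after the first: within the 32 KB window.
near = bytes_saved(4 * 1024)
# Second copy starts 72 KB after the first: beyond the window.
far = bytes_saved(64 * 1024)

print(f"repeat 12 KB apart: saved {near} bytes")
print(f"repeat 72 KB apart: saved {far} bytes")
```

In the first case the second copy should shrink to a handful of back-references, so close to 8 KB is saved. In the second case the compressor can no longer see the first copy, so essentially nothing is saved.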