Re: rsyncable patch

Tobias Roth wrote:
> Shachar Shemesh wrote:
>
>   
>>> Will rsyncrypto work without an rsyncable gzip? The patch produces a lot
>>> of rejects, and I can't be bothered to try to fix them all.
>>>       
>> That's strange, as gzip has not issued a new version in quite some time
>> (several years at least).
>>     
>
> I count 10 releases in 2007 alone.
There have been several releases in 2007, indeed, but 10 is nowhere near
the right number. gzip version 1.3.5 entered Debian on Oct 30th, 2002. I
find it hard to believe a time machine was employed.

To be specific, here is what gzip's changelog states:
1.3.12 - Apr 13, 2007
1.3.11 - Feb 5, 2007
1.3.10 - Dec 30, 2006
1.3.9 - Dec 15, 2006
1.3.8 - Dec 8, 2006
1.3.7 - Dec 6, 2006
1.3.6 - Nov 20, 2006
1.3.5 - Sep 30, 2002

I think you'll understand where my error came from. The program was
stagnant for over four years (Sep 2002 to Nov 2006).

The patch in the rsyncrypto repository applies cleanly against 1.3.5. I
have no intention of fixing it for newer versions, mostly because I
plan to switch to zlib (forked inside the rsyncrypto source), which
will rid users of the need to recompile anything at all.
>  This is of the 1.3 branch though,
> while the patch is against 1.2.4. GZip.org also states that 1.2.4
> suffers from a security issue, which might be the reason my system has
> 1.3 by default.
>   
Which, I might add, is still defined as "beta" :-) It also has a "todo"
which states "Add rsync patch". Pity it hasn't happened yet :-(

> Concerning the security issue, while I am not a cryptographer, I wonder
> whether this really has an impact on endurance? Shouldn't the strength
> of a proper encryption algorithm not be independent on the content it
> encrypts?
>   
Yes and no.

First, notice how we define "content" as "the plain text before
compression". In that regard you are actually changing the algorithm
when you skip compression altogether.

Rsyncrypto, by definition, weakens the encryption in order to achieve
rsync friendliness. If that is not what you are after, use pgp. Our #1
vulnerability is the attacker learning:
1. Where the change began
2. How big the change was
3. Something about the plain text, by deducing something from the
point at which the decision function triggered.

We do our best to counter #3 by having the decision function be
relatively meaningless (we leak 1 bit which aggregates about 8KB of data
- hardly a great insight). We still find #3 problematic in cases where
the input file has low entropy, resulting in repetitions making it
into the encrypted file (which is, by definition, what we want from
rsyncrypto, so it's very hard to fix this one). Partial knowledge by
the attacker, combined with low entropy, will also lead to information
leaks.

For #2 we mask the point where the change ended by using a decision
function that aggregates a big area (8KB) and has a high standard
deviation (so the attacker cannot know whether it was a 1 byte change
followed by 12KB until the decision function triggered again, or a 4KB
change followed by 8KB of identical data).
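
To make the idea concrete, here is a rough Python sketch of a
content-defined decision function. It is NOT rsyncrypto's actual
function: the rolling sum, the WINDOW and MODULUS values (scaled far
down from 8KB so the demo stays small) and the name trigger_points are
all assumptions made for illustration. It only shows that a one byte
change can move trigger points near the change, while everything
beyond the window lines up again.

```python
import random

WINDOW = 64      # hypothetical: bytes the decision function looks back over
MODULUS = 512    # hypothetical: gives a trigger roughly every 512 bytes

def trigger_points(data: bytes) -> list[int]:
    """Offsets just past each byte where the rolling sum hits the trigger."""
    points = []
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte that left the window
        if i + 1 >= WINDOW and rolling % MODULUS == 0:
            points.append(i + 1)
    return points

rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(64 * 1024))

CHANGE_AT = 20000
modified = bytearray(original)
modified[CHANGE_AT] ^= 0xFF  # flip one byte
modified = bytes(modified)

before = trigger_points(original)
after = trigger_points(modified)

# Triggers can only move while the changed byte is inside the window.
resync = CHANGE_AT + WINDOW
moved = sorted(set(before) ^ set(after))
print("triggers that moved:", moved)
print("next trigger past the window:",
      next((p for p in after if p > resync), None))
```

The distance from the change to the next trigger depends on the data,
not on the size of the change, which is the masking described for #2.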

For #1, there is really nothing we can do without radical changes. One
radical change has been suggested to me, and I haven't had time to look
at it, but I believe a change that manages to mask #1 will have a huge
impact on encryption/decryption performance (currently O(n) and one
pass).

Compression prior to encryption makes our lives much better:
1. Gzip compresses in blocks. When a change happens, the entire block
changes, from its very beginning. This further masks where the change
started.
2. An attacker now has to add the gzip block size entropy to
rsyncrypto's, reducing her knowledge about the change.
3. Repeats in the plain text get compressed away, thus reducing the
problems associated with this point. Also, the output of gzip always has
high entropy, thus ensuring that less can be learned about the plain
text from the knowledge that the decision function triggered at that
location. Partial knowledge of the data can still let the attacker
deduce things about the plain text, but that knowledge must be a lot
less partial now (the attacker needs to know, more or less, all bytes
but 1) and needs to span more data (8KB of compressed data, as opposed
to 8KB of uncompressed data before).
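
The entropy claim in point 3 is easy to check roughly. The sketch below
uses Python's zlib module as a stand-in for gzip (both use deflate; that
substitution, the made-up log-like sample data and the byte_entropy
helper are assumptions for illustration). Byte-level Shannon entropy is
only a crude proxy, but it shows repetitive input turning into output
that looks much closer to random.

```python
import math
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (8.0 is the maximum)."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Hypothetical low-entropy input: many near-identical log lines.
plain = b"".join(
    b"user%05d logged in from 10.0.0.%d\n" % (i, i % 250)
    for i in range(20000)
)
packed = zlib.compress(plain, 9)

print("plain:  %d bytes, %.2f bits/byte" % (len(plain), byte_entropy(plain)))
print("packed: %d bytes, %.2f bits/byte" % (len(packed), byte_entropy(packed)))
```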

In other words, compression prior to encryption has been, more or less,
a part of the design all along. Removing it makes rsyncrypto a different
algorithm, with different weaknesses.
> Thanks a lot,
> Tobias
>   
Shachar