The specification says the following: If the value of dictionary size in properties is smaller than (1 << 12), the LZMA decoder must set the dictionary size variable to (1 << 12).
The C++ code follows this, but the C# and Java versions keep a separate check field, which is not clamped to the minimum size. This only affects the rep0 >= m_DictionarySizeCheck check.
The bug is in the decoder's SetDictionarySize.
For the C# version, I think the function should look like this:
- Remove the m_DictionarySizeCheck field (I don't really see why it needs to be separate)
    void SetDictionarySize(uint dictionarySize)
    {
        if (m_DictionarySize != dictionarySize)
        {
            // Clamp to the minimum of (1 << 12) required by the specification.
            m_DictionarySize = Math.Max(dictionarySize, (1 << 12));
            m_OutWindow.Create(m_DictionarySize);
        }
    }
I'm thinking the m_DictionarySizeCheck = Math.Max(m_DictionarySize, 1); line was supposed to be m_DictionarySizeCheck = Math.Max(m_DictionarySize, 1 << 12);
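To make the difference concrete, here is a minimal Java sketch of the two ways of computing the check limit. It is not the SDK code: the class name, the helper methods, and the sample values (a declared dictionary size of 1024 and a distance of 2000) are made up for illustration. Only the two Math.max expressions and the rep0 >= check comparison come from the discussion above.

```java
// Hypothetical demo class; not part of the LZMA SDK.
class DictionarySizeCheckDemo {
    // Minimum dictionary size from the specification text quoted above.
    static final int MIN_DICT_SIZE = 1 << 12;

    // Limit as described for the current C#/Java decoders.
    static int currentCheck(int dictionarySize) {
        return Math.max(dictionarySize, 1);
    }

    // Limit with the proposed fix applied.
    static int proposedCheck(int dictionarySize) {
        return Math.max(dictionarySize, MIN_DICT_SIZE);
    }

    // Mirrors the rep0 >= check comparison: true means the distance is rejected.
    static boolean rejects(int rep0, int check) {
        return rep0 >= check;
    }

    public static void main(String[] args) {
        int declared = 1024; // assumed dictionary size from the properties
        int rep0 = 2000;     // assumed match distance read from the stream
        System.out.println(rejects(rep0, currentCheck(declared)));  // prints "true"
        System.out.println(rejects(rep0, proposedCheck(declared))); // prints "false"
    }
}
```

With these sample values the current limit rejects the distance while the clamped limit accepts it, which is the only behavioural difference being discussed.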
The C# and Java code of LZMA is old (2005-2008) and has not been updated since then.
The (1 << 12) limit was chosen later for the C code (as a speed optimization) and was then added to the LZMA specification. I'm not sure that I want to change the old C# and Java code now.
The window is created correctly with the limit in mind; it's just that, looking at the code, the rep0 check can fail.
Is it possible for it to fail in practice?
If the encoder writes the correct value to the properties, then all decoders (C, C#, Java) must work the same way for non-corrupted streams (as I suppose).
If a dict size is always over 1<<12 in the compressed stream, then yeah.
A smaller dictionary is not a problem either.
All decoders will unpack correct streams.
The difference is for corrupted streams only.
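A rough way to see why only corrupted streams are affected: the two limits can disagree only on distances that are at least the declared dictionary size, and a correct encoder is assumed never to emit those. The Java sketch below computes that range of disagreeing distances. The class and method names are hypothetical and the decoder itself is not modelled.

```java
// Hypothetical helper; illustrates the range of distances on which
// a limit of max(size, 1) and a limit of max(size, 1 << 12) disagree.
class CheckDisagreementRange {
    static final int MIN_DICT_SIZE = 1 << 12;

    // Returns {first, last} disagreeing distance, or null when the limits coincide.
    static int[] disagreement(int declaredDictSize) {
        int strict = Math.max(declaredDictSize, 1);              // C#/Java style limit
        int clamped = Math.max(declaredDictSize, MIN_DICT_SIZE); // limit with the 4096 minimum
        if (strict == clamped) {
            return null;
        }
        // Distances in [strict, clamped - 1] are rejected by one limit and accepted by the other.
        return new int[] { strict, clamped - 1 };
    }

    static String describe(int declaredDictSize) {
        int[] range = disagreement(declaredDictSize);
        return range == null ? "none" : range[0] + ".." + range[1];
    }

    public static void main(String[] args) {
        System.out.println(describe(1024));    // prints "1024..4095"
        System.out.println(describe(1 << 16)); // prints "none"
    }
}
```

For a declared size of 1024, every disagreeing distance is 1024 or larger, so a stream that stays within its declared dictionary gets the same verdict from both limits.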