I downloaded the LZMA SDK (lzma918) and I have a few questions:
1. Do all implementations have the same level of functionality and/or the same algorithms?
2. After a quick look, I saw a class lzma2Encode in the C++ code, while the Java code seems to have only one algorithm?
3. If they aren't all at the same level, I expect the C++ version is the most advanced. Correct?
Thanks for your answers.
1) The latest LZMA SDK is 9.22.
2) I now update only the C and C++ code. The C# and Java versions work, but that code is slower than the new optimized C/C++ code, and the C# and Java versions implement only the LZMA method.
Is the C# version being maintained at all?
I'd love to have a pure C# implementation of LZMA2. I'm happy to take a crack at doing it myself if that's what it would take.
From what I understand they are contributions (the C# one being a translation of the Java one) and not actively developed, as you can see from the missing LZMA2 encoding.
I've been working on retranslating the C source into C# myself, but I can tell you it's a *lot* of work and it's easy to make mistakes - and then it's hard to figure out where your translation differs from the original. I'm currently on my second attempt and I'm about to scrap it because I don't have good unit tests. I'll soon start a third with structural unit tests against the C implementation running in sync, so I know all intermediate states match between the implementations.
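To make the "running in sync" idea concrete, here is a minimal sketch in Python (chosen only for brevity). It assumes each implementation can be wrapped so that it yields a snapshot of its internal state after every step. The names reference_steps, translated_steps and first_divergence are hypothetical, and the toy checksum merely stands in for real encoder state; nothing here comes from the LZMA SDK.

```python
from itertools import zip_longest

ENDED = "<ended>"  # marker used when one implementation stops early


def first_divergence(reference_states, candidate_states):
    """Walk both state sequences in lockstep.

    Returns (step, reference_state, candidate_state) for the first
    mismatch, or None if every intermediate state matches.
    """
    pairs = zip_longest(reference_states, candidate_states, fillvalue=ENDED)
    for step, (ref, cand) in enumerate(pairs):
        if ref != cand:
            return step, ref, cand
    return None


def reference_steps(data):
    """Toy stand-in for the original code: a 32-bit rolling checksum."""
    state = 0
    for byte in data:
        state = (state * 31 + byte) & 0xFFFFFFFF
        yield state


def translated_steps(data):
    """Toy stand-in for a translation with a deliberate typo in the mask."""
    state = 0
    for byte in data:
        state = (state * 31 + byte) & 0xFFFFFFF  # one F missing on purpose
        yield state


if __name__ == "__main__":
    result = first_divergence(reference_steps(b"LZMA2 test"),
                              translated_steps(b"LZMA2 test"))
    print(result)
```

The point of the design is that the test reports the first step at which the states differ, rather than only saying that the final output is wrong.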
What would be interesting is if there are tests for the original C source and if those could be made public, so they can be used to test other implementations as well.
Also, for what it's worth, I do have a translation of LZMA2 to C# which partially (mostly?) works; so if you prefer to work on existing code instead of starting from scratch I can share that to give you a head start.
The main problem with LZMA2 is that it's multithreaded, making it hard to write unit tests which compare two implementations. (In my first attempt I wrote unit tests which only tested functionality and output, but then I was totally lost when it failed because there was no way to figure out why exactly the output differed.)
Just remove multi-threaded code, and use single-thread version.
You can start with the decoding code. It's much simpler than the encoder and it's single-threaded.
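If you start with the decoder, one possible way to get test vectors is to produce raw LZMA2 streams with an implementation you already trust and check that your decoder returns the original bytes. The sketch below uses Python's standard lzma module, which wraps liblzma from XZ Utils rather than the LZMA SDK, so treat it as an assumption that its raw LZMA2 output is a suitable reference for your purposes. make_vector, reference_decode and check_decoder are hypothetical helper names.

```python
import lzma

# Raw LZMA2 stream, no .xz container; preset 6 is an arbitrary choice.
FILTERS = [{"id": lzma.FILTER_LZMA2, "preset": 6}]


def make_vector(payload):
    """Compress payload into a raw LZMA2 stream with the trusted encoder."""
    return lzma.compress(payload, format=lzma.FORMAT_RAW, filters=FILTERS)


def reference_decode(blob):
    """Decode a raw LZMA2 stream with the trusted decoder."""
    return lzma.decompress(blob, format=lzma.FORMAT_RAW, filters=FILTERS)


def check_decoder(decode, payloads):
    """Return the indexes of payloads that decode() fails to reproduce.

    decode is the decoder under test: a callable taking the compressed
    bytes and returning the decompressed bytes.
    """
    failures = []
    for index, payload in enumerate(payloads):
        if decode(make_vector(payload)) != payload:
            failures.append(index)
    return failures


if __name__ == "__main__":
    payloads = [b"a", b"abc" * 1000, bytes(range(256)) * 16]
    # Sanity check: the reference decoder must pass its own vectors.
    print(check_decoder(reference_decode, payloads))
```

A decoder written in another language could be plugged in as decode through a small wrapper that runs it as a subprocess.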
It's not a problem of code complexity, it's a problem of detecting typos and other silly mistakes which easily happen when translating a large codebase. Like I said, I have already translated the LZMA2 algorithms to C#. The only reason I'm scrapping my translation is that I made a mistake and can't properly unit-test the error code paths, but I want full confidence that my translation is correct. It's much less error-prone to start from scratch than to modify the translation without proper unit tests in place.