Porting onto powerpc embedded target

SOMS
2011-11-23 – 2013-05-30
  • SOMS
    2011-11-23

    Question to Author/Experts,

    The utility works great on Windows and Linux.  I was trying to port it onto target platforms like PowerPC with a memory constraint of 2 MB RAM.  I think it's failing to compress.

    What is the minimum memory requirement?
    I read that you need just 4 kilobytes for the compression dictionary,
    and that the decompression algorithm needs no dictionary memory.

    Is this library suitable for memory-constrained target boards, e.g., 100 MHz with 2 MB RAM?

    Any quick thoughts or suggestions? 

    Appreciate your sincere response.

    Regards,
    azss

     
  • Lasse Collin
    2011-11-24

    Do you need compression, or is decompression support enough?

    If decompression is enough, XZ Embedded is a good choice. The code is 8-20 KiB. The decompressor needs about 30 KiB memory to decompress complete .xz streams from one memory buffer into another memory buffer. If you cannot keep both input and output in RAM at once, it will need 30 KiB + dictionary size bytes of memory.
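
    A minimal single-call sketch of the xz.h API (example code, not from XZ Embedded itself; the function and buffer names are only illustrative):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include "xz.h" // from XZ Embedded

    // Decompress a complete .xz stream when both the input and the
    // output fit in RAM at once (XZ_SINGLE mode, no separate dictionary).
    static bool decompress_whole(const uint8_t *in, size_t in_size,
                                 uint8_t *out, size_t out_size)
    {
        struct xz_buf b = {
            .in = in, .in_pos = 0, .in_size = in_size,
            .out = out, .out_pos = 0, .out_size = out_size,
        };

        xz_crc32_init(); // needed once if built with XZ_INTERNAL_CRC32

        struct xz_dec *s = xz_dec_init(XZ_SINGLE, 0);
        if (s == NULL)
            return false;

        enum xz_ret ret = xz_dec_run(s, &b); // one call does everything
        xz_dec_end(s);
        return ret == XZ_STREAM_END; // b.out_pos is the uncompressed size
    }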

    LZMA SDK from 7-zip.org is also a good choice. It has similar memory requirements.

    If you also need compression, both liblzma from XZ Utils and LZMA SDK can work with 2 MiB RAM. liblzma code is a bit big but you can reduce it a little with configure options. Still, LZMA SDK will probably have smaller code size. Test compression with 64 KiB dictionary size and HC4 match finder, or 32 KiB dictionary and BT4 match finder. Those should work with 2 MiB RAM.

    It is possible to reduce compressor memory usage by about 200 KiB if you switch to fast mode (LZMA_MODE_FAST in liblzma) *and* edit the source code so that certain structures don't get allocated. Both liblzma and LZMA SDK allocate the structures needed for the normal mode even when compressing in fast mode. Since the fast mode doesn't need those structures, omitting them saves RAM. But it takes some extra work to comment them and their initializations out.
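
    As a rough sketch, those settings map onto lzma_options_lzma like this (example code; the struct is then passed to the encoder through an LZMA2 filter chain):

    lzma_options_lzma opt;
    if (lzma_lzma_preset(&opt, LZMA_PRESET_DEFAULT)) {
        // Handle error
    }
    opt.dict_size = 1 << 16;   // 64 KiB dictionary
    opt.mf = LZMA_MF_HC4;      // HC4 match finder (or LZMA_MF_BT4 with 32 KiB)
    opt.mode = LZMA_MODE_FAST; // the fast mode mentioned above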

     
  • SOMS
    2011-11-24

    Larhzu, thank you for your response.
    I would like to do both, i.e., compression and decompression, on the target.
    I will try your suggestions.

     
  • Lasse Collin
    2011-11-24

    Try also HC3, BT2, and BT3 match finders. They use less RAM. For example, BT2 and 64 KiB dictionary should work while the same dictionary with BT4 might use too much.

     
  • SOMS
    2011-12-07

    I am trying to read a compressed file and would like to read exactly 200 uncompressed bytes at a time in a loop:

    stream.next_out = mybuff;
    stream.avail_out = mylen;
    stream.next_in = cmpressbufferpointer;
    stream.avail_in = cmpressbufferlen;

    I observed that once we call lzma_code(&stream, LZMA_RUN), avail_out is set to 0, which makes sense, as it finished decoding.
    What is the correct way to read a compressed file?  I mean, how do I calculate the output buffer size required, or can we limit the number of uncompressed bytes?  Currently my output buffer length is 7 times bigger than the compressed buffer length.

    I have another question: what is the correct way to set the dictionary size and the filter?  I am using three APIs from the library to compress and uncompress files, i.e., lzma_easy_encoder, lzma_code, and lzma_end.

    Temporarily I had inserted one line of code to set the dictionary size right after lzma_encoder_init inside the lzma_easy_encoder function.  Performance is better, as by default it uses an 8 MB dictionary.  Now I see it takes about 1.5 MB for encoding and 800 KB for decoding.

    Any suggestions?

    Thanks,
    AZSS

     
  • SOMS
    2011-12-07

    I tried to uncompress a 400 KB file, and I see the decoder needs approx. 10 MB to uncompress it. Will setting the dictionary size help?

    Any suggestions on this and the above question?

    Regards,
    AZSS
    allocated.. 88
    allocated.. 1400
    allocated.. 320
    allocated.. 112
    allocated.. 216
    allocated.. 4272
    allocated.. 184
    allocated.. 28352
    allocated.. 8388608

     
  • Lasse Collin
    2011-12-08

    I'm not sure if I understood your first question. If you want to get 200 uncompressed bytes at a time, set avail_out to 200 and next_out to point to a buffer of at least 200 bytes. Set those variables again and call lzma_code again to get more data.
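
    A minimal sketch of such a loop (the buffer and variable names are examples, not from your code):

    uint8_t outbuf[200];
    lzma_ret ret;

    do {
        strm.next_out = outbuf;
        strm.avail_out = sizeof(outbuf);

        ret = lzma_code(&strm, LZMA_RUN);
        if (ret != LZMA_OK && ret != LZMA_STREAM_END)
            break; // Handle the error

        size_t got = sizeof(outbuf) - strm.avail_out;
        // ... use the `got` bytes now in outbuf ...

        // If strm.avail_in reaches 0 before LZMA_STREAM_END, refill
        // strm.next_in and strm.avail_in with more compressed data.
    } while (ret != LZMA_STREAM_END);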

    With lzma_easy_encoder you can affect the dictionary size indirectly by setting a different preset. If you want to set the dictionary size directly, you need to use lzma_stream_encoder:

    lzma_options_lzma opt; // from lzma/lzma.h
    if (lzma_lzma_preset(&opt, LZMA_PRESET_DEFAULT)) {
        // Handle error
    }
    opt.dict_size = 512 * 1024;

    lzma_filter filters[2]; // from lzma/filter.h
    filters[0].id = LZMA_FILTER_LZMA2;
    filters[0].options = &opt;
    filters[1].id = LZMA_VLI_UNKNOWN; // terminates the filter array

    lzma_stream strm = LZMA_STREAM_INIT;
    if (lzma_stream_encoder(&strm, filters, LZMA_CHECK_CRC32) != LZMA_OK) {
        // Handle error
    }

    Then continue with lzma_code like you did before.

    The LZMA2 decompressor memory usage depends on the dictionary size that was used when compressing the file. So yes, setting the dictionary size will reduce the memory usage, but you need to do it when compressing the file. Using a dictionary bigger than the uncompressed size of the file is a waste of memory.

     
  • SOMS
    2011-12-14

    Could you please comment on the following code?
    1) When the pointer i is NULL, why are you freeing it? (i == NULL, yet lzma_free(i, ...))
    2) When a pointer is NULL, how can you access its members, as in index_stream_end?
    3) What is the value of allocator? Some compilers don't like calls through a structure member like allocator->free(buf, alloc). Does it have any serious impact if we use just free(ptr) instead of allocator->free(ptr)?
    Appreciate your comments.
    Regards,
    AZSS
    Regards,
    AZSS
    extern LZMA_API(lzma_index *)
    lzma_index_init(lzma_allocator *allocator)
    {
        lzma_index *i = index_init_plain(allocator);
        index_stream *s = index_stream_init(0, 0, 1, 0, allocator);
        if (i == NULL || s == NULL) {
            index_stream_end(s, allocator);
            lzma_free(i, allocator);
        }

        index_tree_append(&i->streams, &s->node);

     
  • Lasse Collin
    2011-12-15

    That function had a bug. It was fixed a few months ago, but after the 5.0.3 release. The fix will be in 5.0.4.

    The allocator simply allows replacing malloc() and free() with custom versions. It's documented in base.h starting at line 325. Normally the allocator is NULL, and then malloc() and free() will be used. I copied the idea of custom allocators from the zlib API. I think it was a mistake, but the custom allocator support cannot be removed without breaking the API and ABI.
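
    A sketch of a pass-through custom allocator (example names; see base.h for the exact documentation):

    #include <stdlib.h>
    #include <lzma.h>

    // liblzma always sets nmemb to 1; the argument exists only for
    // zlib-style compatibility, so multiplying is safe here.
    static void *my_alloc(void *opaque, size_t nmemb, size_t size)
    {
        (void)opaque;
        return malloc(nmemb * size);
    }

    static void my_free(void *opaque, void *ptr)
    {
        (void)opaque;
        free(ptr);
    }

    static lzma_allocator my_allocator = {
        .alloc = &my_alloc,
        .free = &my_free,
        .opaque = NULL,
    };

    // Pass &my_allocator where a lzma_allocator * is accepted, or NULL
    // to get plain malloc() and free().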

     
  • SOMS
    2011-12-30

    Question:
    I had set the dictionary to 16 KiB, and I am not setting the filter option.
    Compression and uncompression work perfectly on the host.
    On the target, the uncompression runs but produces the following errors. The uncompressed file is in good shape with no data loss.  To my understanding, the buffers are not updating correctly.  Any quick thoughts, something like flushing?
    lzma_code error: 9
    lzma_code error: 11

    Regards,
    AZSS

     
  • Lasse Collin
    2012-01-04

    I don't have many ideas. Do you use identical code on both systems? Did you use the normal configure && make to build liblzma for the target system, or did you create a custom build system (like a custom makefile)?

     
  • SOMS
    2012-01-05

    Yes.  Both makefiles are identical, and both are custom makefiles.

     
  • SOMS
    2012-01-05

    Question:
    Is there any API that tells how much memory is required to uncompress, just by reading the header?
    I guess when a file is compressed with a huge dictionary, uncompression might create an equally big dictionary?

     
  • Lasse Collin
    2012-01-10

    It might be an endianness issue. config.h must define WORDS_BIGENDIAN on a big endian system.
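
    In a custom build there is no configure script to detect this, so it is just a hand-written define in config.h:

    /* config.h fragment for a big endian target */
    #define WORDS_BIGENDIAN 1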

    There's no good API to get file information like decompressor memory requirements or uncompressed size. It should be added. Other people have asked for it too. "xz -lvv foo.xz" can show that information, but it needs more work to make such code fit into liblzma.

    There's a fairly simple hack (with some limitations) that you can use though:

    #include <stdio.h>
    #include <lzma.h>

    uint8_t in[4096];
    size_t in_size = fread(in, 1, sizeof(in), input_stream);
    if (in_size == 0)
        handle_error();

    size_t in_pos = 0;
    size_t out_pos = 0;
    uint64_t mem = 96 << 10; // 96 KiB memory usage limit
    lzma_ret ret = lzma_stream_buffer_decode(&mem, 0, NULL,
            in, &in_pos, in_size, NULL, &out_pos, 0);

    If it returns LZMA_OK, it is a .xz file with no data inside. If it returns LZMA_BUF_ERROR, the memory requirement is less than 96 KiB. If it returns LZMA_MEMLIMIT_ERROR, the memory requirement is over 96 KiB and has been stored in the mem variable.

    One limitation of this hack is that it checks only the first .xz Block. Usually that doesn't matter because if there are multiple Blocks, they tend to have the same memory requirements in practice.

    It would be better to do this with the multi-call API (lzma_stream_decoder and lzma_code). It would work with concatenated empty streams (which is an unusual but possible situation), and it would keep working if future .xz format and liblzma versions get metadata support, which could push the relevant header beyond the first 4096 bytes. I used lzma_stream_buffer_decode above because it made the example simpler and might be enough for you.
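
    A sketch of that multi-call variant (example code, built on the same idea: a small memlimit makes the decoder report the real requirement):

    lzma_stream strm = LZMA_STREAM_INIT;
    if (lzma_stream_decoder(&strm, 96 << 10, 0) != LZMA_OK) {
        // Handle error
    }

    strm.next_in = in;        // the header bytes read earlier
    strm.avail_in = in_size;
    strm.next_out = NULL;     // parsing the headers produces no output
    strm.avail_out = 0;

    lzma_ret r = lzma_code(&strm, LZMA_RUN);
    if (r == LZMA_MEMLIMIT_ERROR) {
        // lzma_memusage() tells how much memory is actually needed.
        uint64_t needed = lzma_memusage(&strm);
        // ...
    }
    lzma_end(&strm);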

    Note that liblzma isn't very accurate with memory usage calculation. I have tried to make it so that the real usage is unlikely to exceed the calculated value.

    Decompressor memory usage depends on the dictionary size. If you compress with 8 MiB dictionary, the decompressor will allocate 8 MiB dictionary too.

     
  • SOMS
    2012-01-20

    I still get LZMA_DATA_ERROR.  It fails inside block_decoder.c.
    The uncompressed file looks good, and every time it happens in the final Block.
    The file was compressed on the Intel host using the following command:
    xz -z --lzma2=dict=16KiB --check=crc32
    In another case, compression is done on the target by setting the filter to LZMA_FILTER_LZMA2 and the dictionary to 16 KiB, and it fails under the following lines:

    if (lzma_check_is_supported(coder->block->check)
            && memcmp(coder->block->raw_check,
                coder->check.buffer.u8,
                check_size) != 0)

    I think I am missing some bit setting in the config file.
    FYI: I had already taken care of the above suggestion on endianness.
    Any suggestions?

     
  • Lasse Collin
    2012-01-20

    That's exactly where I thought an endianness issue would appear if the endianness has been set incorrectly. Did you try compressing on the target system and then decompress that file on x86 system?

    The only other setting that could have some effect is TUKLIB_FAST_UNALIGNED_ACCESS. It should be fine on both x86 and PowerPC, although I have heard that there are some (older, possibly non-standard) embedded PowerPC CPUs that don't support unaligned access.

    You could try to print (as hex) the first check_size bytes of coder->block->raw_check and see how they differ between PowerPC and x86 when decompressing the same file.
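
    For example (a debugging sketch, pasted just before the memcmp; add <stdio.h> if block_decoder.c doesn't already have it):

    for (size_t j = 0; j < check_size; ++j)
        fprintf(stderr, "%02X", coder->block->raw_check[j]);
    fprintf(stderr, " vs ");
    for (size_t j = 0; j < check_size; ++j)
        fprintf(stderr, "%02X", coder->check.buffer.u8[j]);
    fprintf(stderr, "\n");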

    If you still don't find the problem, try disabling compiler optimizations. If that helps, there might be a bug in the compiler but it is also possible that there's a bug in XZ Utils.

     
  • SOMS
    2012-01-23

    Question:
    When I tried to print the data (I guess it's the footer block), it looks to have a byte-swapping issue.

    if (lzma_check_is_supported(coder->block->check)
            && memcmp(coder->block->raw_check,
                coder->check.buffer.u8,
                check_size) != 0)
        return LZMA_DATA_ERROR;

    x86 (debug output)

    (gdb) p coder->block->raw_check
    $1 = "pÕ\233\200", '\000' <repeats 59 times>
    (gdb) p coder->check.buffer.u8
    $2 = "pÕ\233\200", '\000' <repeats 59 times>
    (gdb) p check_size

    PPC (debug output)
    (gdb) p coder->block->raw_check
    $14 = "\200\233Õp", '\000' <repeats 59 times>
    (gdb) p coder->check.buffer.u8
    $15 = "####", '\000' <repeats 59 times>

    Now I would like to know what could cause this even after taking care of the target flags related to big endian during compression, like WORDS_BIGENDIAN in config.h.  What other flags are related to encoding?  I see the same problem if I encode on x86 and decode on the PPC (big endian) target, or vice versa.

    Appreciate your comments/suggestions.

    Regards,
    AZSS

     
  • SOMS
    2012-01-26

    Please ignore my request.  I found the issue.
    My apologies: in my config I needed to set SIZE_MAX to the correct value, the 32-bit maximum. That has taken care of it.
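
    For anyone hitting the same thing, a sketch of the kind of define involved (a reconstruction, not the exact config used here):

    /* config.h fragment for a 32-bit target whose toolchain lacks a
     * proper SIZE_MAX definition */
    #ifndef SIZE_MAX
    #    define SIZE_MAX 4294967295U /* UINT32_MAX */
    #endif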