Ralf Brown

Tool to recover data from corrupted ZIP archives and DEFLATE-compressed streams from other files and disk images.


Discussion

  • Hunter

    Hunter - 2019-06-22

    My laptop shut down in the middle of a Git commit and corrupted a handful of files in the repository. I read some of your research and believe this software could be very helpful for recovering some of these files, but I am having problems compiling it. I know this is a few years old, but would you be willing to explain some of the source code to help with these compilation errors?

     
    • Ralf Brown

      Ralf Brown - 2019-06-22

      Sure, I can try to help you get the program compiled. What is your
      environment (OS, compiler, etc)?


       
      • Hunter

        Hunter - 2019-06-23

        OS --
        Windows 10 Pro
        v 1803
        build 17134.829

        compiler --
        gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)

        Ubuntu bash --
        Description: Ubuntu 18.04.2 LTS
        Release: 18.04
        Codename: bionic
        kernel/version: 4.4.0-17134-Microsoft #706-Microsoft Mon Apr 01 18:13:00 PST 2019

        I went ahead and made a few changes based on make compiler errors:
        - Added explicit casts (char and unsigned char as necessary) in frctype2.C, frhtml.C, and frreader.C due to type conversion errors
        - Added #include <unistd.h> to resolve undeclared methods in frlocate.C, frmem.C, and frcpfile.C due to -Wnarrowing
        - Changed false return values to nullptr return values in frclient.C due to type conversion errors

        With these band-aids, ziprec compiled successfully (I also generated a .lang file using mklang). However, I'm not sure they were all appropriate as the program is causing segmentation fault errors. Any suggestions on alternate fixes for these compiler errors or direction on how to diagnose the segfaults?
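
For readers hitting the same errors, here is a minimal sketch of the two kinds of change described in the list above: an explicit cast to satisfy the C++11 narrowing rules, and `nullptr` in place of `false` in a function that returns a pointer. The names `table` and `findEntry` are hypothetical and do not come from the FramepaC sources; the actual lines that needed changing are not shown in this thread.

```cpp
#include <cassert>

// Hypothetical examples; none of these names exist in FramepaC or ziprec.

// C++11 and later reject narrowing conversions inside braces, e.g.
//     unsigned char table[] = { -1 } ;   // error under -Wnarrowing
// An explicit cast states the intended wrap-around and yields the same
// value (255) that the old implicit conversion produced.
static const unsigned char table[] = { static_cast<unsigned char>(-1), 0x41 } ;

// Older code sometimes wrote "return false ;" in a pointer-returning
// function. Since C++11, false is no longer a null pointer constant,
// so the return value has to be spelled nullptr (or 0).
const char* findEntry(bool present)
{
    static const char entry[] = "entry" ;
    if (!present)
        return nullptr ;
    return entry ;
}
```

Both changes only make an existing conversion explicit, so on their own they would not normally be expected to change what the program does at run time.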

         
        • Ralf Brown

          Ralf Brown - 2019-06-25

          The segfault in mklang turns out to be due to a new optimization based
          on the assumption that 'this' will never be NULL (which my code
          violates when the linked list is empty). That breaks the implicit
          guard of the while loop in WordList::eraseList()....

          Adding guards around those calls in main() in mklang.C fixes the
          segfault. You can also compile with "make DEBUG=1" as a workaround,
          since that disables the problematic optimization.
          e.g.

              if (frequencies)
              {
                  frequencies->eraseList() ;
                  frequencies = nullptr ;
              }
              if (words)
              {
                  words->eraseList() ;
                  words = nullptr ;
              }
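
As background on why the guard has to sit at the call site: calling a member function through a null pointer is undefined behavior in C++, so a compiler is allowed to assume `this` is non-null and drop any check that depends on it. The sketch below shows one way to keep the "empty list is a null pointer" convention without relying on `this`, by doing the erase in a free function that takes the head pointer. `Node`, `push`, and this `eraseList` are hypothetical stand-ins, not the real `WordList` code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a singly linked list node; not the real WordList.
struct Node
{
    int   value ;
    Node* next ;
} ;

// Prepend a value and return the new head.
Node* push(Node* head, int value)
{
    return new Node{value, head} ;
}

// Free-function erase: the null test is on an ordinary pointer argument,
// not on 'this', so the compiler cannot assume it away. An empty list
// (head == nullptr) simply skips the loop.
std::size_t eraseList(Node*& head)
{
    std::size_t freed = 0 ;
    while (head != nullptr)
    {
        Node* next = head->next ;
        delete head ;
        head = next ;
        ++freed ;
    }
    assert(head == nullptr) ;   // the caller's pointer is left cleared
    return freed ;
}
```

With GCC, the option `-fno-delete-null-pointer-checks` is another way to switch off this class of optimization, though whether that matches what "make DEBUG=1" does in this project's makefile is an assumption.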

           
  • ggrcrsh

    ggrcrsh - 2024-08-03

    Hi!
    I cut off the start of a .gz file and tried to recover it using

    ./ziprec -g file.gz
    

    I get a segfault.
    The same happens with the whole, uncut file.
    It also happens with both truncated and whole .zip files, in the same reverse function as with the .gz file.

    Stack trace from gdb here:

    #0  0x000055555557e792 in Fr::BitReverser::reverse<unsigned int> (numbits=2, N=3) at ./framepac/framepac/bits.h:48
            mask = <optimized out>
            high = <optimized out>
            extra = <optimized out>
            mask = <optimized out>
            high = <optimized out>
            extra = <optimized out>
    #1  BitPointer::nextBitsReversed (this=this@entry=0x7fffffffcd60, num_bits=2) at bits.C:117
            bits = 3
    #2  0x0000555555583768 in HuffmanTree::nextSymbol (this=<optimized out>, ptr=..., str_end=..., symbol=@0x7fffffff6ee6: 0) at huffman.C:182
            next_bits = <optimized out>
            sym = <optimized out>
    #3  0x000055555557c680 in HuffSymbolTable::nextSymbol (symbol=@0x7fffffff6ee6: 0, str_end=..., pos=..., this=0x7fffffff6f80) at symtab.C:297
    No locals.
    #4  decode_bit_lengths (lit_count=lit_count@entry=262, lit_lengths=..., dist_count=dist_count@entry=26, dist_lengths=..., bit_tab=bit_tab@entry=0x7fffffff6f80, pos=..., str_end=...) at symtab.C:106
            bit_length = 0
            copy_count = <optimized out>
            len = <optimized out>
            i = <optimized out>
            lengths = 0x7fffffff8e40
            prev_length = 0
            count = 288
            adj = 0
    #5  0x000055555557dd87 in HuffSymbolTable::build (pos=..., str_end=..., deflate64=deflate64@entry=false) at symtab.C:265
            num_lit_codes = 262
            num_dist_codes = 26
            num_len_codes = <optimized out>
            bit_lengths = {m_counts = {10, 0, 2, 2, 0, 1, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0}, m_symbols = {0 <repeats 240 times>, 3, 12, 0 <repeats 238 times>, 16, 18, 0 <repeats 238 times>, 6, 9, 17,
                0 <repeats 237 times>, 9, 10, 0 <repeats 238 times>, 8, 6, 0 <repeats 238 times>, 7, 4, 0 <repeats 239 times>, 11, 17, 10, 14, 0 <repeats 2155 times>}}
            lengths = {7, 0, 0, 0, 0, 0, 3, 6, 5, 3, 0, 7, 0, 0, 0, 0, 2, 7, 2}
            bit_tab = {static allocator = 0x555555903f10, m_lengthtable = 0x7fffffff7000, m_codetree = {m_item = 0x555555909420}, m_distancetree = {m_item = 0x5555559093b0}, m_eod = {m_value = 0,
                m_length = 0 '\000'}, m_deflate64 = false}
            lit_lengths = {m_counts = {0 <repeats 16 times>}, m_symbols = {0 <repeats 3840 times>}}
            dist_lengths = {m_counts = {0 <repeats 16 times>}, m_symbols = {0 <repeats 2604 times>, 52645, 63484, 32767, 0, 0, 0, 0, 0, 32768, 2, 0, 0, 32736, 2, 0, 0, 32736, 2, 0, 0, 4096, 0, 0, 0, 0, 0, 0,
                0, 1, 0, 0, 0, 32768, 2, 0, 0, 53248, 27, 0, 0, 49985, 27, 0, 0, 49985, 27, 0, 0, 4096, 0, 0, 0, 32768, 2, 0, 0, 5, 0, 0, 0, 53248, 27, 0, 0, 20480, 33, 0, 0, 18724, 33, 0, 0, 18724, 33, 0,
                0, 4096, 0, 0, 0, 53248, 27, 0, 0, 1, 0, 0, 0, 24576, 33, 0, 0, 49152, 33, 0, 0, 47240, 33, 0, 0, 36432, 34, 0, 0, 4096, 0, 0, 0, 20480, 33, 0, 0, 3, 0 <repeats 263 times>, 2318, 63485,
                32767, 0, 53312, 63487, 32767, 0, 51376, 65535, 32767, 0, 64, 63360, 32767, 0, 51982, 63484, 32767, 0, 6, 0, 4, 0, 64, 0, 0, 0, 64, 0, 0, 0, 64, 0, 0, 0, 784, 0, 0, 0, 784, 0, 0, 0, 8, 0, 0,
                0, 3, 0, 4, 0, 15920, 30, 0, 0, 15920, 30, 0, 0, 15920, 30, 0, 0, 28, 0, 0...}}
            symtab = {m_item = 0x0}
    #6  0x0000555555586a55 in valid_packet (deflate64=false, exact_bit=<optimized out>, final_packet=true, str_end=..., str_start=..., pos=...) at inflate.C:792
            position = {m_byteptr = 0x7ffff7ffa28e "_\275\361D\375\002MN\225\064_\005", m_bitnumber = 4 '\004'}
            symtab = {m_item = 0x0}
            valid = <optimized out>
            hdr = <optimized out>
            is_last = <optimized out>
            hdr = <optimized out>
            is_last = <optimized out>
            position = <optimized out>
            symtab = <optimized out>
            valid_EOD = <optimized out>
            valid = <optimized out>
            byte_offset = <optimized out>
            bit_number = <optimized out>
            position = <optimized out>
            symtab = <optimized out>
            valid = <optimized out>
            valid_EOD = <optimized out>
            byte_offset = <optimized out>
            bit_number = <optimized out>
    #7  find_packet_start (deflate64=false, exact_bit=<optimized out>, final=true, base_offset=0, str_end=..., str_start=..., str_pos=...) at inflate.C:999
            pos = {m_byteptr = 0x7ffff7ffa288 "-\371t\275ހ_\275\361D\375\002MN\225\064_\005", m_bitnumber = 0 '\000'}
            start = {m_byteptr = 0x7ffff7ffa000 "\200\224\370\202\270 E\325ƞث8\273aw]\327G\227\272%(mP\222\n\345\310\377\343G1㴨\244TT\342b\257f\236\337{\363\261\036\217\241\337'\210\005!\232\350\060\241\062z\375\363\207\305\\\207\240SO\030\217\241TM\bѕ\204\306U\036\251\366\311Q(\270\252|\022\243s\362M,\214\315%\rW\333D\251\260\360\244\263\003\276\244<\240\064\063\202\306\071&z\322$*&0\354%\201K^\300>\"3mSB\377\374\027T\315\033\320\305hB!b\352\r\331,H\271\r\330g\206\071}T\243\302d\204\312f\344\017$q\024\n\343\343\211R\f\f\\\214ɋ\b\355\347\001\251\257B_՜\206*7ӈ"..., m_bitnumber = 0 '\000'}
            pos = <optimized out>
            start = <optimized out>
            ptype = <optimized out>
            offset = <optimized out>
            bit_number = <optimized out>
    #8  locate_packets (str_end=..., str_end=..., deflate64=false, base_offset=0, str_start=...) at inflate.C:1253
            ptype = <optimized out>
            packets = <optimized out>
            str_pos = {m_byteptr = 0x7ffff7ffa291 "D\375\002MN\225\064_\005", m_bitnumber = 4 '\004'}
            curr_end = {m_byteptr = 0x7ffff7ffa294 "MN\225\064_\005", m_bitnumber = 0 '\000'}
            exact_bit = <optimized out>
            packets = <optimized out>
            str_pos = <optimized out>
            curr_end = <optimized out>
            exact_bit = <optimized out>
            ptype = <optimized out>
    #9  recover_stream (params=..., fileinfo=0x7fffffffd650, outfp=..., outfile=0x555555909190 "./gzipdata-00000000.dat", stream_start=0x7ffff7ffa000 "\200\224\370\202\270 E\325ƞث8\273aw]\327G\227\272%(mP\222\n\345\310\377\343G1㴨\244TT\342b\257f\236\337{\363\261\036\217\241\337'\210\005!\232\350\060\241\062z\375\363\207\305\\\207\240SO\030\217\241TM\bѕ\204\306U\036\251\366\311Q(\270\252|\022\243s\362M,\214\315%\rW\333D\251\260\360\244\263\003\276\244<\240\064\063\202\306\071&z\322$*&0\354%\201K^\300>\"3mSB\377\374\027T\315\033\320\305hB!b\352\r\331,H\271\r\330g\206\071}T\243\302d\204\312f\344\017$q\024\n\343\343\211R\f\f\\\214ɋ\b\355\347\001\251\257B_՜\206*7ӈ"..., stream_end=<optimized out>, base_offset=0, known_start=false, deflate64=false, known_end=true) at inflate.C:1619
            timer = {<Fr::TimerBase> = {<No data fields>}, m_start_time = {tv_sec = 0, tv_nsec = 11879910}}
            packet_start = {m_byteptr = 0x7ffff7ffa294 "MN\225\064_\005", m_bitnumber = 0 '\000'}
            last_packet_header = {m_byteptr = <optimized out>, m_bitnumber = 0 '\000'}
            packet_list = 0x0
            success = <optimized out>
            str_start = {m_byteptr = 0x7ffff7ffa000 "\200\224\370\202\270 E\325ƞث8\273aw]\327G\227\272%(mP\222\n\345\310\377\343G1㴨\244TT\342b\257f\236\337{\363\261\036\217\241\337'\210\005!\232\350\060\241\062z\375\363\207\305\\\207\240SO\030\217\241TM\bѕ\204\306U\036\251\366\311Q(\270\252|\022\243s\362M,\214\315%\rW\333D\251\260\360\244\263\003\276\244<\240\064\063\202\306\071&z\322$*&0\354%\201K^\300>\"3mSB\377\374\027T\315\033\320\305hB!b\352\r\331,H\271\r\330g\206\071}T\243\302d\204\312f\344\017$q\024\n\343\343\211R\f\f\\\214ɋ\b\355\347\001\251\257B_՜\206*7ӈ"..., m_bitnumber = 0 '\000'}
            num_packets = <optimized out>
            have_corruption = <optimized out>
            wf = <optimized out>
            fmt = <optimized out>
            decode_buffer = {m_buffer = {m_items = 0x7ffff7ffa28e}, m_filebuffer = {m_items = 0xb54504}, m_context_flags = {m_items = 0x1e}, m_replacements = {m_items = 0x7ffff7fc3950}, m_wildcardcounts = {m_items = 0x7ffff7ffdaf0 <_rtld_global+2736>}, m_infp = {m_file = 0x7fffffffce28, m_tempname = {m_items = 0x7fffffffce24 "\377\177"}, m_finalname = {m_items = 0x7ffff7fbc680 "\340\342\377\367\377\177"}, m_errcode = -134465200, m_piped = 255, m_complete = 127, m_keep_backup = false}, m_outfp = {m_file = 0x85bdb5ef, m_tempname = {m_items = 0x7fffffffce24 "\377\177"}, m_finalname = {m_items = 0x216f6d7 <error: Cannot access memory at address 0x216f6d7>}, m_errcode = -134494536, m_piped = 255, m_complete = 127, m_keep_backup = false}, m_filename = 0x7ffff7fc72c5 <_dl_map_object_deps+2693> "J\307", <incomplete sequence \353>, m_backingfile = {m_items = 0x7fff00000000 <error: Cannot access memory at address 0x7fff00000000>}, m_bufptr = 4152483226, m_refwindow = 32767, m_numreplacements = 140737345768456, m_numbytes = 93824996118928, m_loadedbytes = 3458858338816659880, m_datastart = 1181888791992304128, m_highest_replaced = 1435537808, m_discontinuities = 21845, m_format = WFMT_DecodedByte, m_unknown = 0 '\000', m_deflate64 = false, m_prev_correct = false, m_show_errors = false}
    #10 0x0000555555588af5 in recover_stream (start_sig=start_sig@entry=0x0, end_sig=end_sig@entry=0x7ffff71fefd8, params=..., fileinfo=0x7fffffffd650, filename_hint=filename_hint@entry=0x0, original_size_hint=<optimized out>, known_start=false, deflate64=false, known_end=true) at ./framepac/framepac/smartptr.h:130
            reference_filename = {m_items = 0x0}
            outfp = {m_file = 0x555555908f90, m_tempname = {m_items = 0x0}, m_finalname = {m_items = 0x0}, m_errcode = 0, m_piped = false, m_complete = false, m_keep_backup = true}
            buffer_start = 0x7ffff7ffa000 "\200\224\370\202\270 E\325ƞث8\273aw]\327G\227\272%(mP\222\n\345\310\377\343G1㴨\244TT\342b\257f\236\337{\363\261\036\217\241\337'\210\005!\232\350\060\241\062z\375\363\207\305\\\207\240SO\030\217\241TM\bѕ\204\306U\036\251\366\311Q(\270\252|\022\243s\362M,\214\315%\rW\333D\251\260\360\244\263\003\276\244<\240\064\063\202\306\071&z\322$*&0\354%\201K^\300>\"3mSB\377\374\027T\315\033\320\305hB!b\352\r\331,H\271\r\330g\206\071}T\243\302d\204\312f\344\017$q\024\n\343\343\211R\f\f\\\214ɋ\b\355\347\001\251\257B_՜\206*7ӈ"...
            end_offset = <optimized out>
            start_offset = <optimized out>
            filename = {m_items = 0x555555909190 "./gzipdata-00000000.dat"}
            default_filename = {m_items = 0x0}
            reconst_filename = {m_items = 0x0}
            output_directory = <optimized out>
            success = false
            is_uncompressed = <optimized out>
            using_stdin = <optimized out>
    #11 0x000055555557270c in recover_gzip_span (prev=0x0, curr=0x7ffff71fefd8, params=..., fileinfo=<optimized out>, known_start=<optimized out>) at recover.C:1888
            original_size_hint = <optimized out>
            known_end = <optimized out>
            buffer_start = <optimized out>
            filename_hint = {m_items = 0x0}
    #12 0x0000555555576e7a in recover_files (fileinfo=0x7fffffffd650, params=..., locations=0x7ffff71fefd8) at recover.C:2108
            recovered = <optimized out>
            curr = <optimized out>
            prev = <optimized out>
            success = <optimized out>
            deflate64 = false
            prev = <optimized out>
            success = <optimized out>
            deflate64 = <optimized out>
            curr = <optimized out>
            recovered = <optimized out>
            sig = <optimized out>
            known_end = <optimized out>
            st = <optimized out>
    #13 process_file_data (params=..., fileinfo=0x7fffffffd650, seqnum=@0x7fffffffd55c: 0) at recover.C:2274
            central = <optimized out>
            output_dir = {m_items = 0x555555909170 "."}
            multiples = <optimized out>
            input_file = <optimized out>
            timer = {<Fr::TimerBase> = {<No data fields>}, m_start_time = {tv_sec = 0, tv_nsec = 3844310}}
            success = false
            buffer_start = 0x7ffff7ffa000 "\200\224\370\202\270 E\325ƞث8\273aw]\327G\227\272%(mP\222\n\345\310\377\343G1㴨\244TT\342b\257f\236\337{\363\261\036\217\241\337'\210\005!\232\350\060\241\062z\375\363\207\305\\\207\240SO\030\217\241TM\bѕ\204\306U\036\251\366\311Q(\270\252|\022\243s\362M,\214\315%\rW\333D\251\260\360\244\263\003\276\244<\240\064\063\202\306\071&z\322$*&0\354%\201K^\300>\"3mSB\377\374\027T\315\033\320\305hB!b\352\r\331,H\271\r\330g\206\071}T\243\302d\204\312f\344\017$q\024\n\343\343\211R\f\f\\\214ɋ\b\355\347\001\251\257B_՜\206*7ӈ"...
            buffer_end = <optimized out>
            signatures = 0x7ffff71fefd8
            file_format = <optimized out>
    #14 0x0000555555578916 in recover_file (zipfp=..., params=..., fileinfo=0x7fffffffd650, seqnum=@0x7fffffffd55c: 0) at recover.C:2315
            success = false
            datalen = <optimized out>
            memory_mapped = <optimized out>
            filedata = <optimized out>
    #15 0x0000555555578ce5 in recover_file (params=..., fileinfo=fileinfo@entry=0x7fffffffd650) at recover.C:2349
            zipfp = {<Fr::CFile> = {m_file = 0x555555908d90, m_tempname = {m_items = 0x0}, m_finalname = {m_items = 0x0}, m_errcode = 0, m_piped = false, m_complete = false, m_keep_backup = true}, <No data fields>}
            success = false
            seqnum = 0
            filename = <optimized out>
    #16 0x000055555555e224 in main (argc=2, argv=0x7fffffffd800) at ziprec.C:548
            input_file = 0x7fffffffdc8e "tetete.gz"
            format = <optimized out>
            wordmodel = <optimized out>
            fileinfo = {m_langid = 0x0, m_lengthmodel = 0x0, m_wordmodel = 0x0, m_filename = 0x7fffffffdc8e "tetete.gz", m_output_dir = 0x555555909170 ".", m_orig_output_dir = 0x5555555b9001 ".", m_format = FF_gzip, m_bufferstart = 0x7ffff7ffa000 "\200\224\370\202\270 E\325ƞث8\273aw]\327G\227\272%(mP\222\n\345\310\377\343G1㴨\244TT\342b\257f\236\337{\363\261\036\217\241\337'\210\005!\232\350\060\241\062z\375\363\207\305\\\207\240SO\030\217\241TM\bѕ\204\306U\036\251\366\311Q(\270\252|\022\243s\362M,\214\315%\rW\333D\251\260\360\244\263\003\276\244<\240\064\063\202\306\071&z\322$*&0\354%\201K^\300>\"3mSB\377\374\027T\315\033\320\305hB!b\352\r\331,H\271\r\330g\206\071}T\243\302d\204\312f\344\017$q\024\n\343\343\211R\f\f\\\214ɋ\b\355\347\001\251\257B_՜\206*7ӈ"..., m_bufferend = 0x7ffff7ffa29c "", m_stdin = false}
            argv0 = <optimized out>
            output_directory = <optimized out>
            file_format = FF_gzip
            gzip_by_extension = false
            langid = {m_item = 0x0}
            lenmodel = {m_item = 0x0}
            params = {scan_range_start = 0, scan_range_end = 18446744073709551615, test_mode_skip = 1, test_mode_offset = 0, reconstruction_iterations = 1, write_format = WFMT_PlainText, base_name = 0x5555555b6e60 "gzipdata", junk_paths = false, force_overwrite = false, exclude_PDFs = false, test_mode = false, perform_reconstruction = false, reconstruct_partial_packet = false, reconstruct_align_discontinuities = true, use_word_model = true}
            total_args = 2
            status = 0
    
     

    Last edit: ggrcrsh 2024-08-03
