#30 p7zip fails to decompress valid bzip2 file

Unstable (example)
open
nobody
None
5
2020-05-08
2016-02-14
No

Hi,

I am forwarding Debian bug report https://bugs.debian.org/639409 below. I've just done a quick test, and it looks like the patch from the bug report fixes the issue. Could you please evaluate it and possibly include it in the next release of p7zip?

From: Mikolaj Izdebski zurgunt@gmail.com
To: Debian Bug Tracking System submit@bugs.debian.org
Subject: p7zip-full: fails to decompress valid bzip2 files
Date: Fri, 26 Aug 2011 23:34:16 +0200

Package: p7zip-full
Version: 9.20.1~dfsg.1-3
Severity: normal
Tags: patch

The embedded bzip2 decoder doesn't support bzip2 blocks with more than 18002
selectors (bzip2 allows up to 32767 selectors).

How to reproduce:

$ (echo QlpoOTFBWSZTWT75kEwAAAJGAAAQAgAMAC//4AAAAAAAAAAAAAAAAAAAAA \
   && echo xxxxxxxxxxxxxxxxxxxxZpoM00Zl4u5IpwoSB98yCY) \
   | sed s/x/xxxxxxxxxxxxxxxx/g \
   | sed s/x/AAAAAAAAAAAAAAAAA/g \
   | base64 -d >payload.bz2
$ bzcat payload.bz2
TEST
$ bzip2 -tvvv payload.bz2
payload.bz2:
[1: huff+mtf rt+rld {0x3ef9904c, 0x3ef9904c}]
combined CRCs: stored = 0x3ef9904c, computed = 0x3ef9904c
ok
$ 7za x payload.bz2

7-Zip (A) [64] 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale=pl_PL.utf8,Utf16=on,HugeFiles=on,4 CPUs)

Processing archive: payload.bz2

Extracting payload Data Error

Sub items Errors: 1
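
To check how many selectors a given file actually declares, the block header can be read directly. The following is a minimal sketch, not code from p7zip or bzip2: `bz2_first_block_selectors` and `read_bits` are hypothetical names, and the header layout it relies on (stream header, 48-bit block magic, CRC, randomised bit, 24-bit origPtr, symbol map, 3-bit group count, 15-bit selector count) is my reading of the bzip2 format. It only inspects the first block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader over a memory buffer. */
typedef struct {
    const uint8_t *buf;
    size_t len;     /* buffer length in bytes */
    size_t bitpos;  /* index of the next bit, counted from the start of buf */
} bitreader;

/* Read n (1..32) bits into *out; returns 0 on success, -1 at end of buffer. */
static int read_bits(bitreader *br, unsigned n, uint32_t *out)
{
    uint32_t v = 0;
    assert(n >= 1 && n <= 32);
    for (unsigned i = 0; i < n; i++) {
        size_t byte = br->bitpos >> 3;
        if (byte >= br->len)
            return -1;
        unsigned shift = 7u - (unsigned)(br->bitpos & 7u);
        v = (v << 1) | ((br->buf[byte] >> shift) & 1u);
        br->bitpos++;
    }
    *out = v;
    return 0;
}

/* Hypothetical helper: report nGroups and nSelectors of the first block.
   Returns 0 on success, -1 if the header is truncated or not recognised. */
int bz2_first_block_selectors(const uint8_t *buf, size_t len,
                              unsigned *n_groups, unsigned *n_selectors)
{
    static const uint8_t magic[6] = {0x31, 0x41, 0x59, 0x26, 0x53, 0x59};
    bitreader br = {buf, len, 0};
    uint32_t v, used, groups, selectors;

    /* Stream header: "BZh" plus a level digit '1'..'9'. */
    if (len < 4 || buf[0] != 'B' || buf[1] != 'Z' || buf[2] != 'h' ||
        buf[3] < '1' || buf[3] > '9')
        return -1;
    br.bitpos = 32;

    /* Block magic; an empty stream carries the end-of-stream magic instead. */
    for (int i = 0; i < 6; i++)
        if (read_bits(&br, 8, &v) != 0 || v != magic[i])
            return -1;

    /* Skip block CRC (32 bits), randomised flag (1 bit), origPtr (24 bits). */
    if (read_bits(&br, 32, &v) || read_bits(&br, 1, &v) ||
        read_bits(&br, 24, &v))
        return -1;

    /* Symbol map: 16 range bits, then one 16-bit map per range in use. */
    if (read_bits(&br, 16, &used))
        return -1;
    for (int i = 0; i < 16; i++)
        if ((used & (0x8000u >> i)) && read_bits(&br, 16, &v))
            return -1;

    /* Number of Huffman groups (3 bits), then number of selectors (15 bits). */
    if (read_bits(&br, 3, &groups) || read_bits(&br, 15, &selectors))
        return -1;
    *n_groups = (unsigned)groups;
    *n_selectors = (unsigned)selectors;
    return 0;
}
```

Fed the first 28 bytes of the payload from the reproduction steps, this should report 2 groups and 32767 selectors, well above the 18002 the embedded decoder accepts.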

Discussion

  • Igor Pavlov

    Igor Pavlov - 2016-02-15

    Please attach "bad" bz2 file that is not supported by 7-zip.

     
  • Robert Luberda

    Robert Luberda - 2016-02-16

    Attaching payload.bz2, which gives "ERROR: Data Error: payload" when decompressed with 7za version 15.09.

     
  • Igor Pavlov

    Igor Pavlov - 2016-02-16

    I can't download payload.bz2 file.
    Try to upload again.

     
  • Philippe Ombredanne

    @igor do you still have an issue with downloading the attached bz2? I could get it alright.

     
  • Igor Pavlov

    Igor Pavlov - 2016-04-17

    Yes, please upload it again.

     
  • Sam Tansy

    Sam Tansy - 2020-04-28

    I know it was a long time ago, but as this is still an issue I will try to upload it. The original script is given in a form that is not exactly easy to decode.
    Here is base64 encoded payload.bz2.gz:

    PAK="H4sIAAAAAAACA3OKyrA0dIxUC460+znBh4GByY2BQYCJgYdB//8DhuEORsEoGAWjYBSMglEw\
    CkbBKBgFo2AUjIJRwJa54KxJWpzek66ChQq1nxVmAABZl5ZRLBAAAA=="
    
    echo $PAK | base64 -d | gzip -dc > payload.bz2
    

    To decode it, one needs to run the above script, or download and gunzip the attachment.

    @Robert Luberda: I'm not sure about simply allowing 32767 selectors, as they are not going to be used in the format anyway. bzip2 1.0.8 allows them but discards the excess (1.0.8 release), but this is up to the devs.

     
  • Igor Pavlov

    Igor Pavlov - 2020-05-01

    So is it an artificial file?
    Actually, I don't like that code in the bzip2 1.0.8 decoder, but if it allows such archives, maybe 7-Zip can also allow them.

     
  • Sam Tansy

    Sam Tansy - 2020-05-08

    It is an artificial file, but the rationale behind it was that lbzip2 uses 18008 MTF selectors:

    // @ LBZIP2_SRC/encode.c
      union {
        struct {
          uint8_t selector[18000 + 1 + 1];
          uint8_t selectorMTF[18000 + 1 + 7];
          uint32_t num_selectors;
          uint32_t num_trees;
          (...)
    

    whereas the bzip2 format allows 18002:

    // @ $BZIP2_SRC/bzlib_private.h
    #define BZ_G_SIZE   50
    #define BZ_MAX_SELECTORS (2 + (900000 / BZ_G_SIZE))
    (...)
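
The numbers in this thread can be worked out from the two definitions above. This is a small sketch for illustration only: `selectors_for` and `round_up` are hypothetical helpers, the 15-bit width of the selector count field is my understanding of the format (it matches the 32767 limit quoted in the report), and the idea that an encoder might round the count up to a multiple of 8 is an assumption, not something confirmed from the lbzip2 source.

```c
#include <assert.h>

enum {
    BZ_G_SIZE        = 50,       /* symbols covered by one selector */
    BZ_MAX_BLOCK     = 900000,   /* block size at compression level 9 */
    BZ_MAX_SELECTORS = 2 + BZ_MAX_BLOCK / BZ_G_SIZE,  /* 18002 */
    NSEL_FIELD_BITS  = 15,       /* assumed width of the count field */
    NSEL_FIELD_MAX   = (1 << NSEL_FIELD_BITS) - 1     /* 32767 */
};

/* Selectors needed to cover n_symbols, one per group of BZ_G_SIZE symbols. */
unsigned selectors_for(unsigned n_symbols)
{
    return (n_symbols + BZ_G_SIZE - 1) / BZ_G_SIZE;
}

/* Round n up to the next multiple of 'multiple' (assumed encoder padding). */
unsigned round_up(unsigned n, unsigned multiple)
{
    assert(multiple > 0);
    return (n + multiple - 1) / multiple * multiple;
}
```

Under these assumptions, a full 900000-symbol block needs 18000 selectors, and 18001 rounded up to a multiple of 8 is 18008, which is the size of the `selectorMTF` buffer quoted above. Both values fit in the header field, but 18008 exceeds `BZ_MAX_SELECTORS`.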
    

    As said before:

    The nSelectors relaxation is probably something we want to get out asap, so
    people can unbzip2 all files again they could before (even if they were
    technically "broken").

    The bzip2 team, or at least Julian Seward, don't like it either. It's done so that such archives can be decompressed. And it's not like they use them, as they basically discard the excess.

    Why they violated the format is a question for the lbzip2 devs rather than for the bzip2 team.
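
The difference between rejecting and discarding excess selectors can be sketched as follows. This is an illustration of the two policies discussed here, not the actual bzip2 or 7-Zip code: both function names are hypothetical, and the input is assumed to be selector MTF values that have already been decoded from the bit stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BZ_G_SIZE 50
#define BZ_MAX_SELECTORS (2 + (900000 / BZ_G_SIZE))

/* Strict policy: a declared count outside 1..BZ_MAX_SELECTORS is a data
   error. Returns the number of selectors stored, or -1 on error. */
int store_selectors_strict(const uint8_t *decoded, unsigned declared,
                           uint8_t *selector_mtf)
{
    assert(decoded != NULL && selector_mtf != NULL);
    if (declared < 1 || declared > BZ_MAX_SELECTORS)
        return -1;
    for (unsigned i = 0; i < declared; i++)
        selector_mtf[i] = decoded[i];
    return (int)declared;
}

/* Tolerant policy: every declared selector is still consumed from the
   input, but only the first BZ_MAX_SELECTORS are kept and the count is
   clamped afterwards. selector_mtf must hold BZ_MAX_SELECTORS entries. */
int store_selectors_tolerant(const uint8_t *decoded, unsigned declared,
                             uint8_t *selector_mtf)
{
    assert(decoded != NULL && selector_mtf != NULL);
    if (declared < 1)
        return -1;
    for (unsigned i = 0; i < declared; i++) {
        uint8_t value = decoded[i];      /* consumed in either case */
        if (i < BZ_MAX_SELECTORS)
            selector_mtf[i] = value;     /* excess entries are dropped */
    }
    return declared > BZ_MAX_SELECTORS ? BZ_MAX_SELECTORS : (int)declared;
}
```

With the tolerant policy, a block declaring 32767 selectors decodes using only the first 18002, which is safe because a block of at most 900000 symbols never refers to more than that.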

     

    Last edit: Sam Tansy 2020-05-08
  • Igor Pavlov

    Igor Pavlov - 2020-05-08

    Things are more complicated.
    There was an old bug in the bzip2 code.
    lbzip2 used that bug for some extra features.
    Then the bug was fixed in bzip2, but they extended the decoder to support the extended archives created by lbzip2.
    But they extended the bzip2 decoder even more than lbzip2 requires. So now the bzip2 decoder is super extended, and the part that goes beyond lbzip2 is not supported by the current 7-Zip.

     
  • Sam Tansy

    Sam Tansy - 2020-05-08

    Out of curiosity, where did you find that? I mean, bzip2 1.0.6 is like 10 years old, and lbzip2 was in the cradle back then...

     
