Header corruption on updating tar + pax

Anonymous
2011-09-20
2013-05-28
  • Anonymous - 2011-09-20

    7-Zip doesn't properly handle tar archives that contain pax extended headers. If 7z is used to modify an archive that contains a pax header, it will most likely produce an archive that most modern tar libraries (including libarchive/bsdtar and Python's tarfile module) consider damaged.

    The main problem is that 7-Zip does not preserve the original physical ordering of entries in the tar archive. POSIX.1-2001 compliant tar archivers can add a pax extended header before a set of files, and the header's attributes apply to that set of files. To non-compliant archivers, this header appears as a regular file. Unfortunately, when 7-Zip updates an archive, it often relocates the header to the end of the archive. Since most modern tar archivers expect the header to appear before the files it applies to, they consider the resulting tar damaged.
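The reordering described above can be checked without a full tar library by reading the raw 512-byte header blocks in on-disk order. The following is a minimal sketch, not part of the original report: the function names `physical_typeflags` and `ends_with_pax_header` are made up for illustration, and it assumes plain octal size fields (no base-256 sizes for very large files) and does not verify checksums.

```python
import io

BLOCK = 512  # tar headers and payload padding are both 512-byte aligned


def physical_typeflags(fileobj):
    """Return the typeflag of each entry header, in on-disk order.

    fileobj must be a seekable binary file positioned at the start of an
    uncompressed tar archive.
    """
    flags = []
    while True:
        header = fileobj.read(BLOCK)
        # A short read or an all-zero block marks the end of the archive.
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            break
        # Size is an octal string at offset 124 (assumption: no base-256 sizes).
        size_text = header[124:136].strip(b"\0 ").decode("ascii")
        size = int(size_text, 8) if size_text else 0
        # The typeflag is the single byte at offset 156 ("x"/"g" mean pax).
        flags.append(header[156:157].decode("ascii"))
        # Skip the payload, rounded up to a whole number of blocks.
        fileobj.seek((size + BLOCK - 1) // BLOCK * BLOCK, io.SEEK_CUR)
    return flags


def ends_with_pax_header(flags):
    """True if the last physical entry is a pax header ("x" or "g")."""
    return bool(flags) and flags[-1] in ("x", "g")
```

For example, `with open("test.tar", "rb") as f: print(physical_typeflags(f))` lists the entry types in physical order; per the report, a pax header that has been moved to the end would show up as a trailing "g".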

    Patch
    One fix is to sort the update items: Patch for p7zip v9.20

    To reproduce this issue (using p7zip v9.20):

    # git-archive stores the commit-id in a global extended pax header
    # The pax header is the first entry at this point
    wget https://github.com/mxcl/homebrew/tarball/1da2043d4db6a26fe68539ff7f3ac8fe812617e1 -O - | gunzip > test.tar
    # Delete a file inside the archive using p7zip
    # The pax header will be the last entry after the update
    7z d test.tar mxcl-homebrew-1da2043/README.md
    # Try extracting it with bsdtar : damaged tar archive
    bsdtar -xf test.tar
    # Try listing contents with Python (v2.7) : throws tarfile.ReadError: missing or bad subsequent header
    python -c "import tarfile; tarfile.TarFile('test.tar').list()"
    

    http://pubs.opengroup.org/onlinepubs/009695399/utilities/pax.html#tag_04_100_13_03

     
  • Igor Pavlov - 2011-09-20

    I can fix that problem.
    Is it OK if 7-Zip sorts files for TAR back into their original order only when the archive contains an item with a typeflag value of "x" or "g"?

     
  • Anonymous - 2011-09-20

    Yes, that should be a valid optimization. Otherwise, as far as I know, the ordering isn't critical.
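The rule agreed on in this exchange can be sketched as follows. This is only an illustration in Python, not 7-Zip's actual code: `UpdateItem`, its fields, and `order_for_tar_update` are hypothetical names, and the sketch assumes that newly added items carry no original index and should go after the existing entries.

```python
from collections import namedtuple

# Hypothetical record for one entry in an update operation.
# original_index is the entry's position in the source archive,
# or None for an item that is being newly added (an assumption).
UpdateItem = namedtuple("UpdateItem", "name typeflag original_index")


def order_for_tar_update(items):
    """Restore original order only if a pax header ("x" or "g") is present."""
    if any(item.typeflag in ("x", "g") for item in items):
        # Existing entries go back to their original physical order;
        # new entries (original_index is None) keep their relative order
        # and are placed after them, because sorted() is stable.
        return sorted(
            items,
            key=lambda item: (item.original_index is None,
                              item.original_index or 0),
        )
    # No pax headers: ordering is not critical, so leave it untouched.
    return list(items)
```

With this rule, an update list such as `[UpdateItem("a.txt", "0", 1), UpdateItem("pax_global_header", "g", 0)]` comes back with the "g" entry first, while a list without any "x" or "g" item is returned in the order it was given.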

     
