7zip doesn't properly handle tar archives that contain pax extended headers. If 7z is used to modify an archive that contains a pax header, it will most likely produce an archive that most modern tar libraries (incl. libarchive/bsdtar and Python's tarfile module) consider damaged.
The main problem here is that 7zip does not preserve the original physical ordering of entries in the tar archive. POSIX.1-2001-compliant tar archivers can add pax extended headers before a set of files, and the header's attributes apply to that set of files. To non-compliant archivers, this header appears as a regular file. Unfortunately, when 7zip updates an archive, it often relocates the header to the end of the archive. Since most modern tar archivers expect the header to appear before the files it describes, they will consider the resulting tar damaged.
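To check whether an archive has ended up in this state, the raw 512-byte headers can be walked in physical order and their typeflag bytes collected. The following is only a minimal sketch: the helper names scan_typeflags and ends_with_pax_header are made up for illustration, and it assumes plain octal size fields (no GNU base-256 sizes) and an uncompressed tar.

```python
BLOCK = 512  # tar headers and data are laid out in 512-byte blocks


def scan_typeflags(fileobj):
    """Return [(name, typeflag)] for every header, in physical order.

    Assumption: size fields are plain octal (no GNU base-256 encoding).
    """
    entries = []
    while True:
        header = fileobj.read(BLOCK)
        # A short read or an all-zero block marks the end of the archive.
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            break
        name = header[0:100].rstrip(b"\0").decode("utf-8", "replace")
        size_field = header[124:136].strip(b"\0 ").decode("ascii")
        size = int(size_field, 8) if size_field else 0
        typeflag = header[156:157].decode("ascii", "replace")
        entries.append((name, typeflag))
        # Skip the entry's data, which is padded to a whole block.
        fileobj.seek(((size + BLOCK - 1) // BLOCK) * BLOCK, 1)
    return entries


def ends_with_pax_header(entries):
    """True if the last physical entry is a pax header ('x' or 'g').

    A pax header in last position has nothing after it to apply to,
    which is the layout described above.
    """
    return bool(entries) and entries[-1][1] in ("x", "g")
```

For example, opening test.tar in binary mode and calling ends_with_pax_header(scan_typeflags(f)) should return False for the archive as produced by git-archive and True once the header has been moved to the end.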
One fix is to sort the update items: Patch for p7zip v9.20
To reproduce this issue (using p7zip v9.20):
# git-archive stores the commit-id in a global extended pax header
# The pax header is the first entry at this point
wget https://github.com/mxcl/homebrew/tarball/1da2043d4db6a26fe68539ff7f3ac8fe812617e1 -O - | gunzip > test.tar
# Delete a file inside the archive using p7zip
# The pax header will be the last entry after the update
7z d test.tar mxcl-homebrew-1da2043/README.md
# Try extracting it with bsdtar : damaged tar archive
bsdtar -xf test.tar
# Try listing contents with Python (v2.7) : throws tarfile.ReadError: missing or bad subsequent header
python -c "import tarfile; tarfile.TarFile('test.tar').list()"
I can fix that problem.
Would it be OK if 7-Zip sorted files in a TAR archive back to their original order only when the archive contains an item with a typeflag value of "x" or "g"?
Yes, that should be a valid optimization. Otherwise, as far as I know, the ordering isn't critical.
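The rule agreed on here can be sketched roughly as follows. This is not 7-Zip's actual code: the Item type, its fields, and order_for_update are hypothetical names used only to illustrate the idea of restoring the original order when, and only when, a pax header is present.

```python
from typing import List, NamedTuple


class Item(NamedTuple):
    """Hypothetical update item; not a real 7-Zip structure."""
    original_index: int  # position of the entry in the source archive
    typeflag: str        # tar typeflag, e.g. "0", "5", "x", "g"
    name: str


PAX_TYPEFLAGS = {"x", "g"}  # per-file and global pax extended headers


def order_for_update(items: List[Item]) -> List[Item]:
    """Restore the original order only if a pax header is present."""
    if any(item.typeflag in PAX_TYPEFLAGS for item in items):
        return sorted(items, key=lambda item: item.original_index)
    # Without pax headers the ordering isn't critical, so keep it as is.
    return list(items)
```

With this rule, archives that contain no "x" or "g" entries are written exactly as before, so the extra sort is only paid for where the ordering matters.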