Debian ar archives are not extracted correctly
See http://ftp.osuosl.org/pub/ubuntu/pool/main/a/adduser/adduser_3.112ubuntu1_all.deb
Using 7-Zip 9.36 beta
There should be 3 files in it:
9242 Jan 27 2010 control.tar.gz
110198 Jan 27 2010 data.tar.gz
4 Jan 27 2010 debian-binary
7-Zip only shows one file: data.tar
Note that, FWIW, the size of the extracted file seems to be exactly 256000 bytes, which is a conspicuously round number.
Last edit: Philippe Ombredanne 2015-01-22
It's a feature that saves one step in most cases.
You can open top level via context menu:
or from command line
BTW the 256000 size is correct for data.tar, so the data.tar processing may be OK,
but these two files were entirely missed by the extraction:
control.tar.gz
debian-binary
And for reference https://sourceforge.net/p/p7zip/bugs/141/ is the corresponding bug on Linux
Igor: my bad. With -t* we do get the control file,
but we are still missing the debian-binary file:
On Linux:
~/tmp/deb$ 7z l adduser_3.112ubuntu1_all.deb -t*
On Windows:
"c:\Program Files\7-Zip\7z.exe" l -t* adduser_3.112ubuntu1_all.deb
debian-binary doesn't contain any important information.
So 7-Zip ignores it.
Igor: I think it is rather dangerous for 7z to decide whether the data contained in a file is important or not, and this is a bug in my book.
The debian-binary file contains the version of the Debian package format and is a regular member of the ar archive.
with ar I get
2.0
Last edit: Philippe Ombredanne 2015-01-22
There is a lot of data in different formats that is not shown in the listing.
I just treated that small file (4 bytes) as the header of the deb format. Note that almost all deb files (maybe all?) contain "2.0" in debian-binary. So 7-Zip doesn't show that file.
How would you use these 4 bytes from the debian-binary file?
Last edit: Igor Pavlov 2015-01-22
This file is essential as it identifies an ar archive as being a deb file.
Per Wikipedia: http://en.wikipedia.org/wiki/Deb_%28file_format%29#Implementation
Per the Debian man page for deb: http://manpages.debian.org/cgi-bin/man.cgi?query=deb&manpath=unstable
7-Zip extracts files from a Debian package. Headers and service files are not important in most cases.
Can you explain why you need that file when extracting with 7-Zip?
Igor you wrote:
Headers and service file are not important for most cases.
Well, I want to extract Debian files reliably. That means I expect every file in the original archive to be extracted, and nothing to be ignored by 7zip based on some criterion, regardless of how small or mundane the file is.
It should not be up to 7zip to decide for me whether a file is important.
Otherwise I cannot really trust 7zip for extracting Debian ar files, and maybe other files too.
Again the debian-binary file is what makes an ar file a deb file and is important even if small as it separates a plain ar from a deb.
If I extract some arbitrary ar file, the presence of debian-binary tells me that it was a Debian archive, even if it does not have a .deb extension.
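The point about debian-binary marking an ar file as a deb can be illustrated with a minimal sketch. This is not 7-Zip code: the names ArMember, MakeArMember, ParseAr and IsDebArchive are hypothetical, the parser assumes the common ar layout ("!<arch>\n" magic, then 60-byte member headers with data padded to an even length), and it does not handle GNU extended file names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// One member of an ar archive: its name and raw contents.
struct ArMember {
  std::string name;
  std::string data;
};

// Hypothetical demo helper: serialize one member behind a 60-byte ar header.
// Header fields: name[16] mtime[12] uid[6] gid[6] mode[8] size[10] magic[2].
// Assumes the name fits in 16 characters.
std::string MakeArMember(const std::string& name, const std::string& data) {
  char header[61];
  std::snprintf(header, sizeof(header), "%-16s%-12s%-6s%-6s%-8s%-10zu`\n",
                name.c_str(), "0", "0", "0", "100644", data.size());
  std::string out(header, 60);
  out += data;
  if (data.size() % 2 != 0) out += '\n';  // pad member data to an even length
  return out;
}

// Parse an in-memory ar archive. Returns false if the data is malformed.
bool ParseAr(const std::string& ar, std::vector<ArMember>& members) {
  static const std::string kMagic = "!<arch>\n";
  if (ar.compare(0, kMagic.size(), kMagic) != 0) return false;
  std::size_t pos = kMagic.size();
  while (pos < ar.size()) {
    if (ar.size() - pos < 60) return false;                 // truncated header
    if (ar.compare(pos + 58, 2, "`\n") != 0) return false;  // bad header magic
    std::string name = ar.substr(pos, 16);
    name.erase(name.find_last_not_of(' ') + 1);  // strip space padding
    if (!name.empty() && name.back() == '/') name.pop_back();  // GNU-style suffix
    std::size_t size = 0;
    try {
      size = std::stoul(ar.substr(pos + 48, 10));  // decimal size field
    } catch (...) {
      return false;
    }
    pos += 60;
    if (ar.size() - pos < size) return false;  // truncated member data
    members.push_back({name, ar.substr(pos, size)});
    pos += size + (size % 2);  // skip the padding byte, if any
  }
  return true;
}

// A deb is an ar archive whose first member is named "debian-binary".
bool IsDebArchive(const std::vector<ArMember>& members) {
  return !members.empty() && members[0].name == "debian-binary";
}
```

Under these assumptions, a caller that parses an arbitrary ar file sees every member, including the 4-byte debian-binary, and can use IsDebArchive to tell a deb from a plain ar archive without relying on the file extension.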
Last edit: Philippe Ombredanne 2015-01-23
Note, that there are two different things:
1) extracting AR archive
2) extracting DEB archive
When we extract a DEB archive, some of the files in the AR archive are just headers for the DEB archive data. We don't need these headers in most cases. We need only the important data.
Actually, that problem doesn't look too important to me now.
So I don't want to change that code.
I cannot understand why you find this bug not important.
But this is your code so I do not have to understand, only accept ... that said would you accept a patch at least?
The fix seems to need nothing more than deleting some code to avoid any special handling of deb files, something like this (untested, though. Do you have a test suite? I could not find one in the source drops):
Cordially
Philippe
diff --git a/CPP/7zip/Archive/ArHandler.cpp b/CPP/7zip/Archive/ArHandler.cpp
index b7dcda8..ac8e49b 100644
--- a/CPP/7zip/Archive/ArHandler.cpp
+++ b/CPP/7zip/Archive/ArHandler.cpp
@@ -614,16 +614,6 @@
   if (!_items.IsEmpty() && _items[0].Name == "debian-binary")
   {
     _type = kType_Deb;
-    _items.DeleteFrontal(1);
-    for (unsigned i = 0; i < _items.Size(); i++)
-      if (_items[i].Name.IsPrefixedBy("data.tar."))
-        if (_mainSubfile < 0)
-          _mainSubfile = i;
-        else
-        {
-          _mainSubfile = -1;
-          break;
-        }
   }
   else
   {
again:
It's not a bug.
It's a feature that hides the file that is treated as a header.
If you can give any argument for why that service file is useful, I can think about changing the code.
I suppose that users just need to extract files from the DEB data.tar in 99% of cases. In 1% of cases maybe they want to look at control.tar. And maybe in 0.01% of cases they want to look at debian-binary.
Last edit: Igor Pavlov 2015-01-23
My educated guess would be that if you are extracting files inside a package individually (rather than, you know, normally letting everything be handled by the OS's package manager), then chances are that 99% is an overestimate for "normal users".
I, for one, was quite interested in checking the dependencies used in different repositories, for example (and was pretty confused until I noticed that file-roller was showing quite a bit more).
We could discuss all day whether a file header is actually worth showing, but I don't see how on earth a totally normal file can just be dropped.
In general it must work like this:
When 7-Zip sees the service file debian-binary, it doesn't treat the archive as type AR. It treats the archive as AR:DEB.
That type is then fixed for future operations. For example, if the update operation is supported for such an archive, you are not allowed to change or delete that service file, and so on. You can think of that file as part of the DEB format header.
Maybe sometimes we want to open DEB archives in raw AR format, which would show that file. But that feature is not implemented now. 7-Zip opens archives as the deepest subtype, if a subtype is detected.
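The behavior described here, together with the lines the patch above removes, can be modelled in a small standalone sketch. It is not the actual 7-Zip code: ArView, BuildView and the rawAr flag are hypothetical names, and rawAr = true stands in for the "show everything" behavior that is requested in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// What the user would see after opening the archive (hypothetical model).
struct ArView {
  bool isDeb = false;              // reported as AR:DEB rather than plain AR
  std::vector<std::string> items;  // member names shown in the listing
  int mainSubfile = -1;            // index into items opened directly, or -1
};

// Build the listing from the raw ar member names.
// rawAr = false mirrors the current hiding logic shown in the diff;
// rawAr = true keeps every member visible and picks no main subfile.
ArView BuildView(std::vector<std::string> names, bool rawAr) {
  ArView view;
  if (!names.empty() && names[0] == "debian-binary") {
    view.isDeb = true;
    if (!rawAr) {
      names.erase(names.begin());  // hide debian-binary as a format header
      for (std::size_t i = 0; i < names.size(); i++) {
        if (names[i].compare(0, 9, "data.tar.") != 0) continue;
        if (view.mainSubfile >= 0) {
          view.mainSubfile = -1;  // more than one candidate: pick none
          break;
        }
        view.mainSubfile = static_cast<int>(i);
      }
    }
  }
  view.items = std::move(names);
  return view;
}
```

For the member list from the original report (debian-binary, control.tar.gz, data.tar.gz), the default mode of this model hides debian-binary and selects data.tar.gz as the main subfile, which matches the observation that only data.tar appears unless the top level is opened explicitly.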
Last edit: Igor Pavlov 2015-01-23