What
Is it possible to check if an archive is solid without reading all of it?
Why
We are using the 7z_C sources to load extensions for our software, which are packaged with LZMA. Each extension has a special file with meta info plus some MB of data and code. We often have to build an index of all extensions, so we have to scan all available extension archives, looking for the meta-info file. There might be hundreds of extensions present. To keep the scan fast, we want to enforce non-solid archives by showing an error if an extension is packed as a solid archive.
You can check the number of "folders", which is the number of solid blocks.
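A minimal sketch of how the folder information could be turned into a yes/no answer, assuming you can get a per-file folder index from the archive database (the C sources keep such a file-to-folder map, but the field name differs between SDK versions, so it is passed in here as a plain array). The function name `archive_is_solid` and the `NO_FOLDER` marker are made up for this example. The idea: an archive is solid as soon as one folder holds more than one file. Directories and empty files have no stream and are skipped, which would also explain why a non-solid archive can have fewer folders than files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical marker for entries without a data stream
   (directories, empty files). */
#define NO_FOLDER ((uint32_t)0xFFFFFFFF)

/* Returns 1 if at least one folder (solid block) holds more than one file,
   0 if every folder holds at most one file, -1 if memory runs out.
   file_to_folder[i] is the folder index of file i, or NO_FOLDER. */
int archive_is_solid(const uint32_t *file_to_folder, size_t num_files,
                     uint32_t num_folders)
{
    uint32_t *files_in_folder;
    size_t i;
    int solid = 0;

    assert(file_to_folder != NULL || num_files == 0);
    if (num_folders == 0)
        return 0; /* no packed data at all, so nothing can be solid */

    files_in_folder = calloc(num_folders, sizeof *files_in_folder);
    if (files_in_folder == NULL)
        return -1;

    for (i = 0; i < num_files; i++) {
        uint32_t folder = file_to_folder[i];
        if (folder == NO_FOLDER || folder >= num_folders)
            continue; /* no stream, or an index this sketch cannot use */
        if (++files_in_folder[folder] > 1) {
            solid = 1; /* two files share one folder */
            break;
        }
    }

    free(files_in_folder);
    return solid;
}
```

Counting files per folder, rather than comparing NumFolders with NumFiles directly, avoids a false "solid" result for archives that contain directories or empty files.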
I did some testing, but I do not understand how I could find out whether an archive is solid or not, using the properties you suggested (hope I got it right).
Using this struct from 7z.h:
I get these results for two small test content packs:
archiveA_solid.sd7 ("-ms=on")
NumPackStreams: 1
NumFolders: 1
NumFiles: 9
archiveA_non_solid.sd7 ("-ms=off")
NumPackStreams: 3
NumFolders: 3
NumFiles: 9
archiveB_solid.sd7 ("-ms=on")
NumPackStreams: 2
NumFolders: 2
NumFiles: 15
archiveB_non_solid.sd7 ("-ms=off")
NumPackStreams: 8
NumFolders: 8
NumFiles: 15
I see that the solid archives have fewer NumPackStreams/NumFolders than the non-solid ones. As in practice I will only have one of the two, I cannot see how this would help me to evaluate whether an archive is solid or not.
Did I do it wrong? :D
Side-Note:
It seems like solid is the default for "7z -t7z", which is quite unintuitive, as the documentation only tells you how to enable solid ("-ms=on"), which suggests that off is the default. I may be wrong though; I just saw that "-ms=on" and "" (no switch) produce the same values for NumPackStreams/NumFolders/NumFiles, while "-ms=off" gave different results. Might be because I had quite small test archives.
… as an alternative solution, I saw that partly solid archives are possible (e.g., blocks of 512 KB of solid data).
When archiving, is it possible to have a filter for files to be excluded from solid, or to be placed in a separate solid block, and have all the rest in one big solid block?
That would actually be an even better solution than disallowing solid archives.
1) At decompression time:
You find the meta-info file, detect which folder it's placed in, check the size of that folder, and if it's small (< 500 KB), you decompress it.
2) At compression time you can use two commands:
7z a a.7z meta-info
7z a a.7z other-files
So the meta-info file will be in a separate solid block.
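The decompression-time check from 1) could be sketched roughly like this. It is only an illustration under assumptions: the file-to-folder map and the per-folder unpacked sizes are passed in as plain arrays (how you obtain them from the 7z_C database depends on the SDK version), the names `meta_extraction_cost`, `MetaCost`, `NO_FOLDER` and `MAX_META_FOLDER_SIZE` are made up here, and "500 KB" is read as 500 * 1024 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for entries without a data stream. */
#define NO_FOLDER ((uint32_t)0xFFFFFFFF)

/* Size limit for the folder holding the meta-info file.
   Assumption: "500 KB" taken as 500 * 1024 bytes. */
#define MAX_META_FOLDER_SIZE (500u * 1024u)

typedef enum { META_CHEAP, META_EXPENSIVE, META_INVALID } MetaCost;

/* Classifies how costly it is to extract the file at meta_index.
   The cost is judged by the unpacked size of the whole folder (solid
   block) the file lives in, not by the size of the file itself. */
MetaCost meta_extraction_cost(const uint32_t *file_to_folder, size_t num_files,
                              const uint64_t *folder_unpack_size,
                              uint32_t num_folders, size_t meta_index)
{
    uint32_t folder;

    assert(file_to_folder != NULL && folder_unpack_size != NULL);
    if (meta_index >= num_files)
        return META_INVALID;

    folder = file_to_folder[meta_index];
    if (folder == NO_FOLDER)
        return META_CHEAP; /* empty file: nothing to decompress */
    if (folder >= num_folders)
        return META_INVALID;

    return folder_unpack_size[folder] < MAX_META_FOLDER_SIZE
               ? META_CHEAP
               : META_EXPENSIVE;
}
```

Judging by folder size rather than by a solid/non-solid flag accepts all the well-packed cases at once: a non-solid archive, a tiny meta-info file in a small solid block, and a meta-info file in its own solid block.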
There is a property named kpidSolid (PropID.h); you can query it with IInArchive::GetArchiveProperty.
Thank you two! :-)
We will use the two things explained by ipavlov, as they are more flexible and suit us better, and because kpidSolid (which uses CArchiveDatabase::IsSolid() internally) is in the C++ lib only, and we use the C one.
The current (beta) rule we use to check whether an extension is well packed is to check the cost of extracting the meta-file:
This should work well for non-solid archives, tiny meta-files in small solid blocks, and big meta-files in a separate solid block.
Thanks again… I quite like this solution :-)