Today we noticed that our batch job could not unpack a few Arj archive files, unlike on all previous days. We use 7-Zip 18.05 x64 on Windows 10 x64. We quickly checked one of those files against every major version, going backward, until 9.20 x64 succeeded.
All the Arj files were created with ARJ 3.08a (ARJ32) Copyright (c) 1990-2000 ARJ Software, Inc. Oct 11 2000.
v18.05 wrongly reports:
7-Zip 18.05 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2018-04-30
Scanning the drive for archives:
1 file, 7500 bytes (8 KiB)
Extracting archive: BN307021806010001.arj
Can't open as archive: 1
Files: 0
Size: 0
Compressed: 0
Well, I like these tools too. But for BATCH jobs this list includes also:
4) old ARJ
5) old 7z 9.20
Why did the NEW 7z stop reading ASN.1-format files?! It is the accepted future of Russian signed data, aka PKCS#7! And all the tools in this list operate on the data inside. We liked 7z because it could read everything...
By default, 7-Zip doesn't like any data before the archive.
The reason:
you can have a 1 GB arj file that contains another 1 KB arj file stored without compression.
If the start bytes of the big 1 GB arj file are corrupted, 7-Zip would open the 1 KB arj file instead, although you actually wanted to open the big 1 GB file. So in that case it's better to show an error than to wrongly open the small 1 KB archive.
But you can extract your archive like this:
7z x BN307021806010001.arj -tarj
Thanks for the good explanation of the reason.
But we rely on 7z's ability to unpack everything (all the unknown files our batch job receives), from Cab to Arj, Zip, etc. (so the -t option is inapplicable), without the PKCS#7 data around them. Just as FAR does. So we will go back to 7z 9.20 as before.
It might be right for 7-Zip to recognize the strict ASN.1 file format (please see the RFC for the Cryptographic Message Syntax, CMS) inside/around archives before processing them. This really is not just "any data before archive", where you remove the first 65 bytes or seek known archive signatures.
And it would be to the glory of 7-Zip itself, above the others, to support worldwide enterprise standards!
1) if it's not pure ARJ, then you can probably use an additional extension, like file.arj.asn1
2) if it's a popular format, then provide a link that describes its signatures and headers. Also write about the software that uses that format.
1) FAR opens everything, without looking at extensions. It is a really nice feature. Nowadays the Bank of Russia and some Federal Services of Russia create more and more file formats with wrong extensions. So the main data store of its systems keeps various files whose extension is the time (.hhmmss) the file was received. 7z 9.20 worked nicely with that.
2) An introduction to ASN.1 (in Russian) is at https://habr.com/post/150757/. I use the viewer at http://lapo.it/asn1js/ as a reference for my development of http://dievdo.ru/PTK-PSD-Browser-hta/. Please load my attached file into this viewer to see the internal tree structure of this enveloping format. All data exchange with the Bank of Russia and the Federal Services is moving to this standardized PKCS#7 format of signed (and optionally encrypted) enclosed data. Please look at http://www.cbr.ru/collection/collection/file/4413/inf_mci_48(2015).pdf (in Russian) to see how this standard is being implemented everywhere. The Russian state structures still require old Arj because it traditionally can make multivolume archives, or proprietary WinRAR, or conventional Zip for smaller files (and Cab as the transport envelope for all of them). We love that 7z eats them all in our batch jobs!
If you cannot read Russian, I will provide international links. The Bank of Russia refers to RFC 3369 in its regulations. It is not the latest CMS standard (https://en.wikipedia.org/wiki/Cryptographic_Message_Syntax), but the Bank of Russia is now the main regulator for all banks and insurance companies in Russia.
As before in 2018, we still use 7z 9.20 for ASN.1-encoded files, which are widely used in cryptographic and signature services. For example, every TLS certificate is ASN.1 encoded. More and more files are becoming digitally signed. We need a fast preview of archives nested deeply and recursively, some of them signed/encrypted, without any PKCS #7 software installed.
The good old 7z 9.20 could easily extract a solid data part from a PKCS #7 signedData message, but now it fails to parse the same data when it is split by the separators of the ITU-T X.690 indefinite form (intended for streamed data), as specified at https://en.wikipedia.org/wiki/X.690
Here is how I do it in my software. If a file starts with "0", it is very probably a strictly structured PKCS #7 file. It would be much better to parse the entire ASN.1 tree, but I just do a simple seek for OID 1.2.840.113549.1.7.1 (hex sequence 06 09 2A 86 48 86 F7 0D 01 07 01), then either parse the length of the block and read the data (definite form of X.690) or parse the lengths of the chunks, each introduced by 04, until 00 00 ends the stream (indefinite form). I attach both of these samples here, parsed with the excellent ASN.1 JavaScript decoder by Lapo Luchini, http://lapo.it/asn1js/
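The seek-and-read approach described above could be sketched roughly as follows. This is not the poster's JavaScript; it is an illustrative Python sketch, and the function names `read_length` and `extract_pkcs7_data` are made up for this example. It assumes the first occurrence of the OID is the one in the content info, that the content is wrapped in an explicit [0] tag (A0), and that the payload is either a primitive OCTET STRING (04) or a constructed one (24) made of 04 chunks.

```python
DATA_OID = bytes.fromhex("06092A864886F70D010701")  # 1.2.840.113549.1.7.1


def read_length(buf, pos):
    """Parse a BER length at pos; return (length, next_pos).

    length is None for the indefinite form (single byte 0x80).
    """
    first = buf[pos]
    pos += 1
    if first < 0x80:            # short definite form
        return first, pos
    if first == 0x80:           # indefinite form
        return None, pos
    n = first & 0x7F            # long definite form: n length bytes follow
    return int.from_bytes(buf[pos:pos + n], "big"), pos + n


def extract_pkcs7_data(buf):
    """Return the payload of the PKCS #7 'data' field (hypothetical helper)."""
    if not buf or buf[0] != 0x30:
        raise ValueError("does not start with a SEQUENCE (0x30)")
    pos = buf.find(DATA_OID)
    if pos < 0:
        raise ValueError("data OID not found")
    pos += len(DATA_OID)
    if buf[pos] != 0xA0:        # assumed [0] EXPLICIT wrapper around the content
        raise ValueError("no enclosed content after the OID")
    _, pos = read_length(buf, pos + 1)
    tag = buf[pos]
    length, pos = read_length(buf, pos + 1)
    if tag == 0x04 and length is not None:
        return bytes(buf[pos:pos + length])          # definite form, one block
    if tag == 0x24:                                  # constructed OCTET STRING
        end = None if length is None else pos + length
        out = bytearray()
        while True:
            if end is not None and pos >= end:
                break
            if end is None and buf[pos:pos + 2] == b"\x00\x00":
                break                                # end-of-contents marker
            if buf[pos] != 0x04:
                raise ValueError("unexpected tag inside chunked data")
            n, pos = read_length(buf, pos + 1)
            if n is None:
                raise ValueError("nested indefinite chunk not supported")
            out += buf[pos:pos + n]
            pos += n
        return bytes(out)
    raise ValueError("content is not an OCTET STRING")
```

A signature file without enclosed content (a detached signature) ends in one of the `ValueError` branches, since the bytes after the OID are then not an OCTET STRING.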
So, those source files start with "0" (0x30) and enclose the PKCS #7 "signedData" structure. The CAB archive inside starts with its signature "MSCF" (4D 53 43 46 ...) in the PKCS #7 "data" field. This is the field whose data needs to be extracted as the destination content (a clean CAB file with an ARJ file enclosed inside, and so on deeper...)
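To illustrate the signature check mentioned above, here is a small Python sketch that guesses the container type from the first bytes. The table and the name `sniff` are made up for this example. The "MSCF" and 0x30 values come from the post; the ARJ, Zip, RAR and 7z magic bytes are the commonly documented ones. A leading 0x30 is only a hint that the file may be ASN.1, not proof.

```python
# (magic bytes, label) pairs; the single-byte ASN.1 hint is checked last
SIGNATURES = [
    (b"MSCF", "cab"),
    (b"\x60\xEA", "arj"),
    (b"PK\x03\x04", "zip"),
    (b"Rar!\x1A\x07", "rar"),
    (b"7z\xBC\xAF\x27\x1C", "7z"),
    (b"\x30", "asn1"),          # "0": a BER/DER SEQUENCE, possibly PKCS #7
]


def sniff(data):
    """Return a label for the leading signature, or None if unknown."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None
```

Used together with an extraction step, this lets a batch job decide by content, not by extension, what to do with each nested layer.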
Sorry, I do not attach those source files (I attach just images of some parsed data) due to security reasons for a public forum.
Finally, just a few notes on the usage of this format, from Wikipedia (link above):
BER is a popular format for transmitting data, particularly in systems with different native data encodings.
The SNMP and LDAP protocols specify ASN.1 with BER as their required encoding scheme.
The EMV standard for credit and debit cards uses BER to encode data onto the card
The digital signature standard PKCS #7 also specifies ASN.1 with BER to encode encrypted messages and their digital signature or digital envelope.
Many telecommunication systems, such as ISDN, toll-free call routing, and most cellular phone services use ASN.1 with BER to some degree for transmitting control messages over the network.
GSM TAP (Transferred Account Procedures), NRTRDE (Near Real Time Roaming Data Exchange) files are encoded using BER.
DER encoding is widely used to transfer digital certificates such as X.509.
I develop new code only if I have example files to test it with. You must provide:
1) test example files
2) some proof that your problem is important for many users.
Then I consider how difficult the new code is to implement. If it's not difficult, I can try it.
1) Well, I attach those two files. Later I will ask you to remove them (or I will do it myself).
2) These files are sent/received by every bank in the country, 100-300 files per day each. I maintain a popular repo for quickly browsing and checking them - https://dievdo.ru/PTK-PSD-Browser-hta/ Yesterday users from a few different cities asked me to quickly adapt to the sender's changes when it moved to the new type of encoding. The official client handles these changes, but it is so unfriendly, and requires PKCS #7 software to be installed, that my browser replaces it entirely. There are many users, but they are not developers who would give me stars on GitHub :)
My JavaScript code to extract PKCS #7 data is here. There is a JS hack to read byte values from UTF strings, but the comments explain it. Also, I do not parse the ASN.1 tree; I just seek the byte signature of the known OID and then parse the data from the found position.
My code to run 7z (not so interesting for this topic, but it looks up different versions of 7z, c'est la vie) is here.
After x -t# you have 2.12345_1.cab, which is the required cab file.
But then 7-Zip can't extract the arj from this cab.
So it's not the original cab format, but some "modified cab" format.
Is it encrypted?
Who modified the cab format?
Why did they do it?
Who uses it?
Where is this modification described?
Please write more simply, without long messages.
Last edit: Igor Pavlov 2023-02-10
For each new feature (or format) there are many factors.
It's like a scoring system where each factor can add or remove points:
- how difficult it is to study, develop, debug, and support, and how many hours it requires.
- how useful it will be, and how many users will use it.
- what the code size of the new code is.
For example, if the new code increases the program size by 1%, but only 0.001% of users will use it, I don't want to support it.
For now I don't understand the complexity of your problem, I don't understand the required changes, and I don't understand the level of usefulness.
Last edit: Igor Pavlov 2023-02-17
Well, I am just repeating the 2018 topic "7z.exe cannot unpack some Arj archives".
Well, now more finely tuned: 7z cannot unpack something enclosed in a PKCS #7 / CMS / ASN.1 / X.690 cryptographic industry-standard container, and its author does not want to simply add unpacking of .p7s files to his glory.
It is still not a problem for me to unpack that with a few lines of pure JavaScript.
v9.20, by contrast, extracts the same file successfully.
I attach a sample Arj file here. The internal XML files are encrypted, so do not try to read the XML inside.
Last edit: Dmitrii Evdokimov 2018-06-25
1) You can use 7zFM File -> Open Inside #
2) Remove first 65 bytes from .arj file
3) Use farmanager arclite
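Option 2 (removing the leading bytes) could be automated for a batch job along these lines. This is only a sketch in Python: the names `find_arj_start` and `strip_prefix` are made up here, and the plausibility check relies on an assumption taken from public ARJ format notes, namely that the 60 EA marker is followed by a 2-byte little-endian basic header size of at most 2600.

```python
import struct

ARJ_MAGIC = b"\x60\xEA"
MAX_BASIC_HEADER = 2600  # assumption from public ARJ format notes


def find_arj_start(data):
    """Return the offset of the first plausible ARJ header, or -1."""
    pos = data.find(ARJ_MAGIC)
    while pos != -1:
        if pos + 4 <= len(data):
            (size,) = struct.unpack_from("<H", data, pos + 2)
            if 0 < size <= MAX_BASIC_HEADER:
                return pos
        pos = data.find(ARJ_MAGIC, pos + 1)   # keep scanning past false hits
    return -1


def strip_prefix(data):
    """Drop everything before the first plausible ARJ header."""
    start = find_arj_start(data)
    if start < 0:
        raise ValueError("no ARJ header found")
    return data[start:]
```

Any bytes after the archive (such as signature data) are left in place, and a blind scan like this can latch onto a nested or unrelated archive, so it is best limited to files already known to be wrapped this way.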
There are two ways to unpack that file:
1) Parser mode:
2) Rename file from cab extension to any other non-archive extension:
1) 7z x a.cab -t#
Result: 3 files (1, 2.12345_1.cab, 3) instead of the one AFN_MIFNS00_4030702_20230206_00005.arj
2) 7z.exe x a.rrr
Result: AFN_MIFNS00_4030702_20230208_00001.arj of 0 bytes
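For a batch job, the attempts discussed in this thread (a plain open, a forced type as with -tarj, and parser mode -t#) could be chained. The Python sketch below is illustrative only: `build_7z_attempts` and `try_extract` are hypothetical helper names, and it assumes a `7z` executable on the PATH. It uses only the switches x, -o, -y and -t.

```python
import subprocess


def build_7z_attempts(archive, out_dir, forced_type=None, exe="7z"):
    """Return the command lines to try, in order."""
    base = [exe, "x", str(archive), f"-o{out_dir}", "-y"]
    attempts = [base]                                # 1) let 7-Zip detect the type
    if forced_type:
        attempts.append(base + [f"-t{forced_type}"])  # 2) e.g. -tarj
    attempts.append(base + ["-t#"])                  # 3) parser mode
    return attempts


def try_extract(archive, out_dir, forced_type=None):
    """Run the attempts until one exits with code 0; return that command or None."""
    for cmd in build_7z_attempts(archive, out_dir, forced_type):
        if subprocess.run(cmd, capture_output=True).returncode == 0:
            return cmd
    return None
```

As the results above show, a zero exit code is not enough on its own: the job should also check that the extracted files are the expected ones and are not 0 bytes long.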
https://cbr.ru/development/feddc/fns/
Last edit: Igor Pavlov 2023-02-17