When you select a Zip file as the format, and then you set a password for it and tell it to use AES encryption, the built-in Windows Zip file extractor functionality is incapable of extracting it (if it's not encrypted, then Windows has no problem extracting it). It doesn't even ask for a password. That's when I realized that something was probably wrong with 7Zip's output. So I went to https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT and looked at the official Zip file specs, and noticed several things were wrong.
1) The general bit flags were wrong in both the local file header and the entry for the given file in the central file directory part of the Zip file. Flag bits 0 and 6 should both have been set. If bit0 = 0 then it's not encrypted. If bit0 = 1 then encryption is used, and bit6 tells the nature of the encryption. If bit6 = 0 then it is standard Zip encryption. If bit6 = 1 then strong encryption is used (the strong encryption algorithm, whether it's AES or something else, is specified in another part of the Zip file).
2) The compression method field in both the local file header and the entry for the file in the central directory section was wrong. The compression method your program set was method 99 (0x63), which is not listed as a valid method in the Zip file specification. I had selected store (no compression) as the method to use, so your program should have set the compression method field to 0 (zero).
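A minimal sketch of the flag check described in point 1, assuming the reading of APPNOTE given in this post. The function and constant names are made up for illustration, and the sketch only covers the two flag bits mentioned above; it says nothing about other ways an archive might signal its encryption.

```python
# Hypothetical names; the bit positions are the ones quoted in the post.
FLAG_ENCRYPTED = 1 << 0          # general purpose bit 0
FLAG_STRONG_ENCRYPTION = 1 << 6  # general purpose bit 6


def describe_encryption(flags: int) -> str:
    """Interpret bits 0 and 6 of the general purpose bit flag field."""
    if not flags & FLAG_ENCRYPTED:
        return "not encrypted"
    if flags & FLAG_STRONG_ENCRYPTION:
        return "strong encryption"
    return "standard Zip encryption"


if __name__ == "__main__":
    for value in (0x0000, 0x0001, 0x0041):
        print(f"flags=0x{value:04X}: {describe_encryption(value)}")
```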
I fixed these in a hex editor, and got Windows to at least recognize that the file was stored and that it needed to ask for a password when extracting. It never did that before: it just refused to even try to extract anything, and said that the file used an unrecognized compression method. However, it still doesn't accept the password I type in, and always says the password is incorrect. I don't know how to fix that, as I haven't looked thoroughly at the rest of 7Zip's Zip file output. But it appears that something else must still be wrong (possibly in the encryption header section, which I have yet to look at).
Please fix this, so that I can create AES-encrypted Zip files in 7Zip that will be properly extracted by Windows' built-in Zip file extraction functionality.
7-Zip uses WinZip's AES:
http://www.winzip.com/aes_info.htm
The built-in unzip in Windows doesn't support that method.
That may be so, but as you can see from the other things I mentioned in the bug report, based on analyzing the Zip file in a hex editor and comparing it to the official Zip file specification document, there are a number of things in the Zip file output of 7Zip that are incorrect according to the official specification. Those should still be fixed in order to make sure that the Zip files 7Zip produces are correct Zip files. Otherwise there's no guarantee that any program other than 7Zip itself will be able to read them.
You don't understand the situation.
There are two AES methods for zip format:
1) WinZip AES.
2) pkware AES. It was more difficult to implement pkware AES.
Note that the appnote specification still doesn't describe some important details required to implement the pkware AES method.
So WinZip's AES was much simpler to implement, and a large number of ZIP programs use WinZip's AES. 7-Zip also uses WinZip AES. But the Appnote from pkware doesn't mention WinZip's AES method. So don't look to the Appnote for zip archives created by 7-Zip with AES.
Zip in Windows probably doesn't support any AES method.
You can check it.
About bit 6: Strong encryption.
I suppose "Strong encryption" means pkware's "Strong Encryption", not WinZip's AES encryption.
So I don't set bit 6.
Can someone create a ZIP with AES in WinZip and check bytes 6 and 7 in a hex editor?
50 4B 03 04 33 00 01 00 63 00
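For reference, a small Python sketch that decodes those ten bytes, assuming they are the start of a local file header laid out as in APPNOTE (signature, version needed to extract, general purpose bit flag, compression method, all little-endian). The variable names are made up for illustration.

```python
import struct

# The ten bytes quoted above, taken as the start of a local file header.
header = bytes.fromhex("50 4B 03 04 33 00 01 00 63 00")

# 4-byte signature followed by three little-endian 16-bit fields.
signature, version_needed, flags, method = struct.unpack("<4sHHH", header)

bit0_encrypted = bool(flags & 0x0001)  # bytes 6,7 hold the flags
bit6_strong = bool(flags & 0x0040)

if __name__ == "__main__":
    print("signature      :", signature)
    print("version needed :", version_needed)
    print("flags          :", f"0x{flags:04X}")
    print("bit 0 set      :", bit0_encrypted)
    print("bit 6 set      :", bit6_strong)
    print("method         :", method)
```

Read this way, the flags field is 0x0001 (bit 0 set, bit 6 clear) and the compression method field is 99 (0x63).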
Why would they not provide enough info to create PKWare's AES? Is that some proprietary patented method that only they can use?
I don't know why. They describe maybe 90% of the required information in the appnote. But without the other 10% it was difficult to support it.
Note also that WinZip AES was implemented before pkzip AES.
So there were reasons to use WinZip's AES at that time.
That's strange. I remember the old dos software pkunzip which I assume is made by the same company that makes pkzip, including the current Windows version. I remember back in the old days of downloading DOS games from Apogee's (now 3D Realms) bulletin board using a telephone-line connected modem (even before dial-up connection to the internet, because there was no internet at the time). I was about 5 or 6 at the time, and I remember when downloading these games, the download was of a zip file that contained the game's installer executable and other associated files. I remember needing a copy of pkunzip to unzip the contents of these downloaded zip files so that I could install the games. And since the PK company dates from so long ago, I'm pretty sure that PK is the company that INVENTED Zip files. So I just assumed that any official standards that exist for the Zip file format, were written up by PK. So I assumed that the proper (and oldest) specification for AES encryption in Zip files is the spec created by PK, not the spec created by WinZip.
Last edit: Animedude5555 2015-01-08
As I remember, WinZip-AES came before PKWARE-AES.
Note that WinZip was more popular than pkzip at some point in time. So there were strong reasons for other developers to support WinZip's changes to the zip format.
Some programs support both AES methods.
Now 7-Zip also supports decryption of pk-AES.
But it was difficult to implement some things in pk-AES without a full specification.
WinZip AES is simpler and has a full specification.
Not exactly. PKWARE's strong encryption pre-dates WinZip's AES by quite a while. In fact, it pre-dates AES as such (the competition to decide the algorithm for AES was well underway, and many people were already assuming Rijndael was going to win, and it did). PKWARE didn't add AES to the algorithm choices for that encryption scheme until after WinZip released their AES implementation, because the cryptographic library PKWARE used on Windows didn't yet have support for AES. The library PKWARE used for AIX, HP-UX, Linux, and Solaris did have AES support much sooner, but supporting different algorithms on different platforms was undesirable. Using an alternative library on Windows for AES support was considered, but deemed undesirable for some good, but complex reasons.
Igor: What do you see as missing from the specification? The only field that wasn't documented is related to certificate-based encryption, not password-based encryption. And that one field accounts for far less than 10%. As the guy who wrote what's in PKWARE's appnote, I'm really curious to know. I'm no longer in a position to fix it (I don't work there anymore), but I can certainly answer any questions (except that one field) without violating anybody's IP or NDAs. You may recall we had conversed previously on LZMA & PPMd in ZIP files.
APPNOTE:
But it must be something like this:
PAD16 - 16 bytes of some padding data.
That "PAD16" data is not mentioned in the specification.
How can a developer guess about that thing?
And why did they use that PAD16?
Last edit: Igor Pavlov 2017-02-11
Looks like the document was modified since I last saw it. Part of it is probably clearer, but somewhere along the way, a general statement on padding was lost. It may even have been lost prior to first publication, in which case, that's almost certainly my fault, and I offer my apologies. The piece you're missing there is that all block cipher operations are done with PKCS#7 padding to ensure the input is always a multiple of blocksize. With most of the common C-based libraries, the PKCS#7 padding is automatically detected and stripped from the output during decryption. Most of those libraries also default to using PKCS#7 padding, and will automatically add it to the input during encryption, when you indicate you're done passing data to encrypt. I suspect the comment was unintentionally dropped when those sections were re-organized and nobody noticed because there's probably still no explicit code to indicate PKCS#7 padding.
As for why there's padding even when the input is a multiple of blocksize, please see PKCS#7, or the equivalent S/MIME or CMS RFC. To save some time, however, the padding scheme requires there always be at least one byte of padding. So if the input is a multiple of blocksize, to get at least one byte of padding, you must add blocksize of padding.
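A minimal sketch of PKCS#7 padding as described above, written in plain Python with no crypto library. The function names `pkcs7_pad` and `pkcs7_unpad` are made up for illustration. It shows why a 16-byte input with a 16-byte block size picks up 16 extra bytes (the "PAD16" seen earlier in the thread), and the block size is a parameter rather than a hard-coded 16.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append PKCS#7 padding; there is always at least one padding byte."""
    if not 1 <= block_size <= 255:
        raise ValueError("block size must be between 1 and 255")
    # If data is already a multiple of block_size, this yields a full block.
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Check and strip PKCS#7 padding; raise ValueError if it is malformed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input is not a whole number of blocks")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding length byte")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("padding bytes do not all match")
    return padded[:-pad_len]


if __name__ == "__main__":
    exact = b"A" * 16
    print(len(pkcs7_pad(exact)))        # 16 data bytes + 16 padding bytes
    print(pkcs7_pad(b"abc", 8).hex())   # 8-byte block size, e.g. for 3DES
```

Checking the padding bytes during decryption, rather than just dropping the last block, also gives a cheap consistency check on the decrypted data.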
Finally, I would note that AES and twofish are the only block ciphers (currently allowed in the spec) with a block of 16 bytes. All other (block cipher) algorithms allowed would use a block size of 8 bytes. Please do not code with the assumption of a block size of 16 bytes. 3DES was still very popular as recently as 2 years ago, and it has a block size of 8 bytes. I also want to point out, specifically, that due to limitations with older versions of Windows, some users may have had a compatibility setting enabled, which causes the RD to be encrypted with 3DES, rather than AES, even if the file is encrypted with AES. IIRC, that option was removed from the programs 8 or 9 years ago, but you may still encounter such archives for decryption. It'd be unfortunate if your code assumed an incorrect block size for that, but then again, maybe not. :) This difference in encryption algorithms is noted in section 7.2.4.6 of version 6.3.4 of APPNOTE.TXT (and should be noted in all prior versions, too). There's no guidance on this, but when encrypting, you should NEVER set that flag.
Oh, I see the field I referenced as undocumented earlier is now documented.
Thanks for that description!
I'll update the 7-Zip code to check the data from that PKCS#7 padding.
There is no DES code in 7-Zip.
7-Zip now supports only AES in Strong Encryption.
So 7-Zip now works only with 16-byte blocks.
Is there some source where we can download test archives with different encryption methods?