
Verify archive hash before files are extracted

2023-07-25
2023-08-09
  • Thomas Schilling

    Hello,

    The command-line version can already calculate the hash value of given files (command h).

    It would be great if a command line switch could be provided to verify the archive file hash before the archive file is extracted (command e or x). If the hash value is not as expected, the program should terminate with an error code "invalid hash" and not start the extraction.

    Something like this:

    7z x -scrc=SHA256:3DCE36D583BA1C741E95DF1A265E47F0DE581BEF77AB48165DD67266BE7A42EF -- somearchive.7z
    7z x -scrc=BE89B488B258555A8CF971E4D29C40CE92BF881D -scrc=3DCE36D583BA1C741E95DF1A265E47F0DE581BEF77AB48165DD67266BE7A42EF -- somearchive.7z
    7z e -scrc=SHA1:BE89B488B258555A8CF971E4D29C40CE92BF881D,SHA256:3DCE36D583BA1C741E95DF1A265E47F0DE581BEF77AB48165DD67266BE7A42EF -- somearchive.7z
    7z e -scrc=@hashfile -- somearchive.7z

    If not explicitly given, the hash algorithm could be inferred from the length of the given hash value. If several available algorithms produce hashes of that length, the given value has to match the result of at least one of them.
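    The length-based lookup described above can be sketched in Python. This is only an illustration of the idea, not how 7-Zip works: the CANDIDATES table and the helper names are hypothetical, and the algorithms are whatever Python's hashlib offers rather than the set behind the h command. Note that 64 hex digits are ambiguous here, so the value is accepted if any candidate matches.

```python
import hashlib

# Hypothetical table: hex digest length -> candidate hashlib algorithm names.
# This is NOT 7-Zip's algorithm list, just an illustration.
CANDIDATES = {
    40: ["sha1"],
    64: ["sha256", "sha3_256"],
    128: ["sha512", "sha3_512"],
}


def parse_expected(spec):
    """Split 'SHA256:ABC...' or a bare 'ABC...' into (algorithm names, lowercase hex)."""
    name, sep, value = spec.partition(":")
    if sep:
        # Algorithm given explicitly, e.g. "SHA1:BE89...".
        return [name.lower()], value.lower()
    value = spec.lower()
    if len(value) not in CANDIDATES:
        raise ValueError(f"cannot infer algorithm from {len(value)} hex digits")
    return CANDIDATES[len(value)], value


def matches(data, spec):
    """Return True if any candidate algorithm yields the expected hash for data."""
    algorithms, expected = parse_expected(spec)
    return any(hashlib.new(a, data).hexdigest() == expected for a in algorithms)
```

    With this sketch, matches(data, "SHA256:<hex>") checks only SHA-256, while a bare 64-digit value is tried against every 64-digit candidate.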

    This would be especially helpful in batch scripts that download software from the internet (e.g. with wget), extract it from the downloaded archive, and start a (daemon) process, because it lets the script verify the downloaded content. The AWS cloud load balancer, for example, starts additional server instances when needed, using a batch script to set up the new virtual (HTTP) server instance.

    What do you think about this?

    Best regards,
    Thomas Schilling

    Update Aug 9, 2023
    Worded differently for better understanding

     

    Last edit: Thomas Schilling 2023-08-09
  • therube

    therube - 2023-08-02

    Would testing the archive itself suffice - as opposed to testing the hash of the downloaded file?

    I'd think that would be easy to automate?
    Something like (pseudo-code):

    7z -bse0 t archive.7z > nul
    if %ERRORLEVEL% EQU 0 (7z x archive.7z)

     
  • therube

    therube - 2023-08-02

    If your intention is to avoid extracting the archive contents when the archive hash does not match the expected hash (rather than checking the hashes of the file contents), so that you won't extract a "rogue" download, that is different.

    Though you should still be able to automate that?

     

    Last edit: therube 2023-08-02
  • Thomas Schilling

    The second scenario is correct: prevent an archive from being extracted if the archive file hash does not match the expected one.
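    Until such a switch exists, the same behaviour can be approximated with a small wrapper. The following Python sketch is an assumption-laden workaround, not a 7-Zip feature: the names file_sha256, verify_then_extract and HashMismatch are made up for illustration, only SHA-256 is handled, and a 7z executable is assumed to be on the PATH.

```python
import hashlib
import subprocess


class HashMismatch(Exception):
    """Raised when the archive hash differs from the expected one (hypothetical name)."""


def file_sha256(path, chunk_size=1 << 20):
    """Hash the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_then_extract(archive, expected_sha256, out_dir, seven_zip="7z"):
    """Extract with '7z x' only if the archive's SHA-256 matches; return 7z's exit code."""
    actual = file_sha256(archive)
    if actual != expected_sha256.strip().lower():
        # Extraction is never started on a mismatch.
        raise HashMismatch(f"expected {expected_sha256}, got {actual}")
    # '--' stops switch parsing, as in the command lines proposed in this thread.
    result = subprocess.run([seven_zip, "x", f"-o{out_dir}", "--", archive])
    return result.returncode
```

    A download script would call verify_then_extract("somearchive.7z", "<expected hex>", "out") and abort on HashMismatch or a non-zero return code.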

    This feature could be useful and is probably easy to implement since the cryptographic hash algorithms (like SHA256) are already implemented for the hash command.

    I currently don't have a task to automate anything myself, but I had this idea and wanted to put it up for discussion.

    7z is very likely used in many workflows to extract archives downloaded from the internet. So if it is easy to verify that the intended archive was downloaded and not a malicious archive, this will hopefully be used often.

    (If possible I will update the initial post to make it more clear)

     
