Validate contents in compressed file

AliBy
2013-06-19
2013-06-29
  • AliBy

    AliBy - 2013-06-19

    I wanted to know if there is a way to check that the contents of a compressed file are identical to the files it was created from, much like verifying files after writing them to a CD-ROM.

    This would provide 2 useful tests:

    1. You can check to see that the 7z file contains the latest files without actually rerunning the compression
    2. You can see whether any of the files extracted from a 7z file have been changed from the original source, and then rename those files if you don't want them overwritten when you extract the originals from the compressed file again.
     
  • Shell

    Shell - 2013-06-29

    There is only one way to know for sure that the files are identical: decompress the archive and then compare the files byte by byte. There are, however, three empirical checks that are much faster: comparing the sizes, the modification dates and the checksums. All three are stored in the archive and are shown in 7-Zip File Manager.
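    As a rough illustration of the byte-by-byte comparison, here is a minimal Python sketch. It assumes the archive has already been extracted somewhere; the function name files_identical is made up for this example and is not part of 7-Zip.

```python
import filecmp


def files_identical(path_a, path_b):
    """Return True only if both files have exactly the same bytes."""
    # shallow=False makes filecmp read and compare the file contents
    # instead of trusting matching size and modification time.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

    You would call it once per file, passing the original and the extracted copy.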

    If the checksums differ, then the files definitely differ, too. If they match, the files are probably identical (especially if their sizes are equal and their dates are the same or differ by 1-2 seconds). You can calculate the checksum of an uncompressed file via 7-Zip File Manager (File->Calculate checksum); the procedure is as fast as copying the file. It will actually calculate several checksums, but you will probably need CRC32. This method works for 7z and most other archives, but it will not work for some old archives (which use CRC16) and for some RAR 5.0 archives (which may use BLAKE2; however, 7-Zip 9.30 and below won't open them at all).
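    If you would rather script the checksum step, the following Python sketch computes the CRC32 of a file on disk so you can compare it with the CRC shown for the same file in the archive. It uses zlib.crc32 from the standard library; the helper names file_crc32 and crc32_hex are only illustrative, and formatting the result as eight uppercase hex digits is an assumption about how you will read the value in the archive listing.

```python
import zlib


def file_crc32(path, chunk_size=1 << 20):
    """Compute the CRC32 of a file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        # Feed the running CRC back in so large files need little memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def crc32_hex(path):
    """Format the CRC32 as eight uppercase hex digits."""
    return f"{file_crc32(path):08X}"
```

    A mismatch proves the file has changed; a match only makes it very likely that it has not.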

    If the sizes differ, then the files definitely differ, too. If they are equal, the files may be identical, but may not be. If the dates differ, then the files probably differ, too. These checks are not as reliable as comparing the checksums, but they are instant since the size and the modification date are always stored on the disk (unlike any checksum). 7-Zip checks the sizes and the dates before compressing in Update mode, so it will not actually compress your files unless they are newer than those in the archive. This is close to what you suggest in test #1. However, 7-Zip does not yet perform this check when extracting files, so you should compare the checksums manually. Some other archivers allow you to skip identical files upon extraction.
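    The size and date checks could be sketched in Python as follows. This is only an illustration: quick_check is a hypothetical helper, and archived_size and archived_mtime (seconds since the epoch) are values you are assumed to have read from the archive listing yourself. The 2-second tolerance reflects the 1-2 second date difference mentioned above.

```python
import os


def quick_check(path, archived_size, archived_mtime, tolerance_seconds=2.0):
    """Compare a file on disk with the size and date recorded in an archive."""
    st = os.stat(path)
    if st.st_size != archived_size:
        # Different sizes mean the contents cannot be the same.
        return "different"
    if abs(st.st_mtime - archived_mtime) <= tolerance_seconds:
        # Same size and (nearly) the same date: likely unchanged.
        return "probably identical"
    # Same size but a different date: treat as changed until a checksum says otherwise.
    return "probably different"
```

    Anything other than "different" is still a guess, so follow up with a checksum comparison when it matters.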

     
    Last edit: Shell 2013-07-02
