Help w/ file archive format comparison chart

2005-03-03
2012-12-08
  • Wesley Leggette

    Wesley Leggette - 2005-03-03

    I'm making a file archive format comparison chart, and I would like any comments regarding the accuracy of that chart. I've had some trouble filling in the information on 7z, so here goes:

    archive limitations
    -- file entries                        ???
    -- entry size                          16 EiB
    -- entry name length                   ???
    compression                            yes
    -- formats                             LZMA, deflate, bzip2, others
    encryption                             yes
    -- formats                             AES-256
    -- modes                               ???  (probably ECB?)
    authentication                         ???  (something like rar's thing?)
    disk spanning                          yes
    implicit archive modification          ???  (I'm assuming yes?)
    differential backup support                  <-- if can be tracked in-archive
    -- added, changed files                ???
    -- deleted file marker                 ???
    archive file access                          <-- what modes are possible?
    -- true consecutive                    ???
    -- non-consecutive                     yes
    special files
    -- hard links                          no
    -- posix file types                    no
    metadata
    -- time attrs                          ??? (assuming msdos' attrs?)
    -- posix user data                     ???
    -- posix permissions                   ???
    -- dos attributes                      ???
    -- extended attributes                 ???
    -- archive comments                    ??? (I think no?)
    -- comment per entry                   ??? (I think no?)
    -- multi-fork data                     no  (macintosh stuff)
    time storage format                    ???  (win32 I'm assuming?)
    archive corruption
    -- detection                           no
    -- recovery                            no
    catalog isolation                      no
    internal catalog position              ??? (end or beginning of file?)
    solid file archives                    yes
    archive "lock"                         no

    The chart I'm working on can be found at <http://darbinding.sourceforge.net/about_dar.php>.

    If anyone can help with these values, or any other values in the full chart I'm working on, I would greatly appreciate it.

    Also, from what I can tell there is no written 7z archive specification, but if there is where could I find it?

    Thanks!

     
    • Igor Pavlov

      Igor Pavlov - 2005-03-03

      archive limitations
      -- file entries                        2^64/2^31
      -- entry name length                   2^64/2^31
      The current 32-bit implementation supports 2^31 entries, but the format allows 2^64 items.
      encryption                             yes
      -- modes                               CBC
      authentication                         No
      disk spanning                          yes
      implicit archive modification          what is it?
      differential backup support          
      -- added, changed files                Yes
      -- deleted file marker                 Yes
      archive file access                          <-- what modes are possible?
      -- true consecutive                    no
      -- non-consecutive                     yes
      special files
      -- hard links                          no
      -- posix file types                    no
      metadata
      -- time attrs                          NTFS Last Modified (8 bytes) UTC
      -- posix user data                     no
      -- posix permissions                   no
      -- dos attributes                      Yes
      -- extended attributes                 ?
      -- archive comments                    No
      -- comment per entry                   No
      -- multi-fork data                     no
      time storage format                    NTFS Last Modified (8 bytes) UTC
      archive corruption
      -- detection                           probably Yes
      -- recovery                            no
      catalog isolation                      the catalog is in one place only, at the end of the archive?
      internal catalog position              end of file
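
The 8-byte NTFS timestamp in the chart above can be decoded with a few lines of code. This is a minimal sketch, assuming the value is a Windows FILETIME (a little-endian 64-bit count of 100-nanosecond ticks since 1601-01-01 UTC). The function names are illustrative and are not taken from the 7-Zip sources.

```python
from datetime import datetime, timedelta, timezone

# FILETIME epoch: 1601-01-01 00:00:00 UTC, counted in 100-nanosecond ticks.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(ticks: int) -> datetime:
    """Convert a 64-bit FILETIME tick count to an aware UTC datetime."""
    # 10 ticks per microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def datetime_to_filetime(dt: datetime) -> int:
    """Convert an aware datetime to a FILETIME tick count."""
    delta = dt.astimezone(timezone.utc) - FILETIME_EPOCH
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 10_000_000 + delta.microseconds * 10


def decode_raw_filetime(raw: bytes) -> datetime:
    """Decode 8 raw bytes (assumed little-endian) into a UTC datetime."""
    if len(raw) != 8:
        raise ValueError("a FILETIME value is exactly 8 bytes")
    return filetime_to_datetime(int.from_bytes(raw, "little"))
```

Because the value is stored in UTC, no time zone information is needed in the archive itself; any conversion to local time happens when the entry is displayed.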

      > Also, from what I can tell there is no written 7z archive specification, but if there is where could I find it?

      7-Zip source code package has DOCs.

       
      • Wesley Leggette

        Wesley Leggette - 2005-03-04

        Thanks!

        2^31 -- this is a signed int?

        implicit archive modification -- this is subjective really, it just means that the archive file format is meant to allow adding/deleting/changing files after it is written. For example, "dar" archives cannot be modified after they are written (unless the library goes to extreme lengths to implement it). I consider this a little bit different than differential tracking (see next).

        differential backup support -- just to make sure, how does this work with 7z? There's a method of tracking which files were added/changed/deleted over time?

        archive file access -- the true consecutive (to allow pushing over a pipe) and non-consecutive are the modes.

        extended attributes -- NTFS and many unix FS's support "extended attributes" (I think they go by the same name in NTFS), which are just a list of name/value pairs. From the 7z specs, it looks like they're supported, but I can't tell for sure. I'll put down whatever you tell me.

        -----

        Ha! When I looked myself I thought I couldn't find them, but I found out that I'd unzipped it improperly. Thanks!

        -----

        ALSO: I've read the file specs and have more questions:

        'AdditionalStreams' -- 7z can store additional NTFS streams? (awesome if so). To your knowledge, do you know any other file formats that do that?

        'CodersInfo[NumCoders]' -- what is this?
        'BindPairsInfo' -- this?

        In the structure 'FilesInfo', this contains an array of files--does this treat the directories as files as well and store their properties? -- From what I read, yes, right? They are marked with the 'kFolder' propID?

        Although, I am a bit confused as to how properties are stored. Is it a property array for each file?

        FilesInfo
        {
        ....Num Files
        ....Properties[]
        ....{...............<--- each array element contains a struct like this?
        .......ID..............<--- this is the NID?
        .......Size.......<--- size of actual info data?
        .......Data
        ....}
        }

        For the above, I see a lot of 'BYTE External;' mentioned. What is this?

         
        • Igor Pavlov

          Igor Pavlov - 2005-03-04

          > 2^31 -- this is a signed int?

          Yes.
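
To make the two limits concrete: a signed 32-bit integer keeps one bit for the sign, so its largest value is 2^31 - 1, while an unsigned 64-bit field reaches 2^64 - 1. A small sketch of that arithmetic follows; the helper name is made up for illustration.

```python
def max_count(width_bits: int, signed: bool) -> int:
    """Largest count representable by an integer of the given width."""
    # A signed integer spends one bit on the sign.
    value_bits = width_bits - 1 if signed else width_bits
    return 2**value_bits - 1


# Limit of a signed 32-bit counter (the current implementation, per the reply).
IMPLEMENTATION_LIMIT = max_count(32, signed=True)
# Limit of an unsigned 64-bit field (what the format allows, per the reply).
FORMAT_LIMIT = max_count(64, signed=False)
```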

          > There's a method of tracking which files were added/changed/deleted over time?

          No. 7-Zip can only create incremental archives.

          > 'AdditionalStreams' -- 7z can store additional NTFS streams? (awesome if so).

          AdditionalStreams are for compressing headers.

          > To your knowledge, do you know any other file formats that do that?

          WinRAR, I suppose.

          > 'CodersInfo[NumCoders]' -- what is this?

          For example, AES is coder#1, and LZMA is coder#2

          > 'BindPairsInfo' -- this?

          BindPairsInfo is for binding the streams of coders.
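
The idea of coders joined by bind pairs can be sketched as a small decoding pipeline. This is an illustration only and not the 7z implementation: every coder is assumed to have exactly one input and one output stream, a bind pair is modelled as a hypothetical `(consumer, producer)` index tuple, indices are 0-based, and a toy XOR function stands in for AES because the Python standard library has no AES.

```python
import lzma
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Coder:
    name: str
    decode: Callable[[bytes], bytes]  # one input stream -> one output stream


def xor_coder(key: int) -> Callable[[bytes], bytes]:
    """Toy stand-in for a cipher (NOT AES): XOR every byte with key."""
    return lambda data: bytes(b ^ key for b in data)


def run_chain(coders: List[Coder],
              bind_pairs: List[Tuple[int, int]],
              packed: bytes) -> bytes:
    """Decode `packed` through coders linked by (consumer, producer) pairs."""
    # Map each producing coder to the coder that consumes its output.
    consumer_of: Dict[int, int] = {prod: cons for cons, prod in bind_pairs}
    bound_inputs = {cons for cons, _ in bind_pairs}
    # The coder whose input is not bound reads the packed stream.
    current = next(i for i in range(len(coders)) if i not in bound_inputs)
    data = packed
    while True:
        data = coders[current].decode(data)
        if current not in consumer_of:
            return data  # unbound output: this is the unpacked result
        current = consumer_of[current]


# Coder 0 "decrypts", coder 1 decompresses; the pair (1, 0) feeds
# coder 0's output into coder 1's input.
EXAMPLE_CODERS = [Coder("toy-cipher", xor_coder(0x5A)),
                  Coder("LZMA", lzma.decompress)]
EXAMPLE_BIND_PAIRS = [(1, 0)]
```

To build a matching packed stream for the example, compress with `lzma.compress` and then apply `xor_coder(0x5A)` to the result.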

          > does this treat the directories as files as well and store their properties?

          Yes.

          > They are marked with the 'kFolder' propID?

          No. In the 7z format, "Folder" means a "solid" block (the term comes from the CAB format).

          > Although, I am a bit confused as to how properties are stored. Is it a property array for each file?

          First, 7z stores all "name" properties for all files, then all "time" properties. This ordering increases the compression ratio of the headers.
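
The difference between the two layouts can be tried out with a short experiment. This is a rough sketch under several assumptions: `zlib` stands in for the real header compressor, names are encoded as NUL-terminated UTF-8 purely for simplicity, and the sample names and times are invented. How much the grouped layout helps, if at all, depends on the data and on the compressor.

```python
import zlib
from typing import List, Tuple


def build_layouts(names: List[str], times: List[int]) -> Tuple[bytes, bytes]:
    """Return (interleaved, grouped) byte layouts for the same records."""
    encoded = [(name.encode("utf-8") + b"\x00", time.to_bytes(8, "little"))
               for name, time in zip(names, times)]
    # Per-file layout: name1, time1, name2, time2, ...
    interleaved = b"".join(n + t for n, t in encoded)
    # Per-property layout: every name first, then every time.
    grouped = (b"".join(n for n, _ in encoded)
               + b"".join(t for _, t in encoded))
    return interleaved, grouped


def compressed_sizes(names: List[str], times: List[int]) -> Tuple[int, int]:
    """Compressed size in bytes of the interleaved and grouped layouts."""
    interleaved, grouped = build_layouts(names, times)
    return len(zlib.compress(interleaved, 9)), len(zlib.compress(grouped, 9))


if __name__ == "__main__":
    sample_names = [f"docs/report_{i:04d}.txt" for i in range(1000)]
    sample_times = [127_000_000_000_000_000 + i * 10_000_000
                    for i in range(1000)]
    per_file, per_property = compressed_sizes(sample_names, sample_times)
    print("interleaved:", per_file, "bytes; grouped:", per_property, "bytes")
```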

          > BYTE External

          Each property array can be compressed separately. For example, 7z can use one compression method for FileNames and another method for FileTimes, or it can compress them as one block with one method. In the latest versions, 7-Zip compresses archive headers as one block.

           
