
Additional information for repairing/restoring archive

2023-10-29
2023-11-12
  • Mikola Akbal

    Mikola Akbal - 2023-10-29

    Hi! 7z has an excellent compression ratio!

    WinRAR has one advantage: the ability to add additional information for repairing/restoring an archive. I can set a percentage, and that much additional information is added to the archive.

    Could you please add the same feature to 7z?

    Second wish: please make the GUI more ergonomic.

     
    • Igor Pavlov

      Igor Pavlov - 2023-10-29

      Did you have many corrupted and recovered rar archives?
      And why were these archives corrupted?

       
      • Mikola Akbal

        Mikola Akbal - 2023-10-30

        Sometimes I download corrupted archives from third parties. My own archives have never been corrupted. Maybe that's because I always add additional information? Do you mean corruption can never happen in practice?

        I love reliability. When I add additional information to important archives, I feel more protected. That's important for a good user experience. Disk arrays with redundancy are used to guarantee that information is kept. If a sector on the disk holding my archive goes bad, I will not lose my archive.

        Why are you so categorical? It sounds like you want to assure me that my files will never be damaged. But if I accept your point of view, I lose my feeling of safety.

        I choose the RAR format only because it lets me add redundant data. I get an archive that is 50% bigger than the 7z archive, but redundancy is more important to me.

        Is the client not always right? :) Do you decide what the client needs? Henry Ford said: "A customer can choose any color and model of car, as long as it is a black Ford Model T". The result: he lost the market.

        Another important reason I choose WinRAR is the convenience and ergonomics of its window interface. Users love convenience.

         
    • h11p5g

      h11p5g - 2023-10-30

      I use this https://github.com/Yutaka-Sawada/MultiPar to create recovery records. It uses the Reed-Solomon algorithm ( https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction )
      This software has already repaired several 7z archives for me.

      For Igor: Some protection is better than none.

       
      • Igor Pavlov

        Igor Pavlov - 2023-10-30

        What type of corruption did you get, and what was the cause of that corruption?

         
        • h11p5g

          h11p5g - 2023-10-30
          • random data (corrupted RAM)
          • NULL bytes (corrupted SSD / HDD)
          • moved data (low wifi signal)
          • missing data, for example, the archive is not completely copied / downloaded (process termination or interruption of the power source)

          A 7z archive has only one file table, at the end. That is dangerous: when the file table is corrupted or missing, the archive becomes difficult to read.
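
To illustrate the point about the single file table, here is a minimal sketch. It assumes the 32-byte signature header layout described in the 7-Zip format notes (signature, version, a CRC, then the offset, size and CRC of the header stored at the end). The helper `make_fake_archive` and the status strings are hypothetical, and the fake "file table" is just opaque bytes, not a real 7z header.

```python
import struct
import zlib

SIGNATURE = b"7z\xbc\xaf\x27\x1c"
START_HEADER_SIZE = 32  # signature (6) + version (2) + CRC (4) + 20-byte start header


def check_start_header(data: bytes) -> str:
    """Classify an archive image using its first 32 bytes and its total length."""
    if len(data) < START_HEADER_SIZE or data[:6] != SIGNATURE:
        return "not a 7z archive or start header missing"
    (start_crc,) = struct.unpack_from("<I", data, 8)
    if zlib.crc32(data[12:32]) != start_crc:
        return "start header corrupted"
    # Offset (relative to byte 32), size and CRC of the header at the end.
    next_offset, next_size, next_crc = struct.unpack_from("<QQI", data, 12)
    end = START_HEADER_SIZE + next_offset + next_size
    if len(data) < end:
        return "truncated: file table missing"
    table = data[START_HEADER_SIZE + next_offset:end]
    if zlib.crc32(table) != next_crc:
        return "file table corrupted"
    return "ok"


def make_fake_archive(body: bytes, table: bytes) -> bytes:
    """Hypothetical helper: build a synthetic image with the assumed layout."""
    start = struct.pack("<QQI", len(body), len(table), zlib.crc32(table))
    head = SIGNATURE + b"\x00\x04" + struct.pack("<I", zlib.crc32(start))
    return head + start + body + table


image = make_fake_archive(b"A" * 100, b"TABLE")
print(check_start_header(image))       # intact image
print(check_start_header(image[:-3]))  # interrupted copy: tail is missing
```

Under these assumptions, an interrupted copy or download is detectable from the first 32 bytes alone, but the check only detects the loss; it cannot bring the file table back.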

           
  • Igor Pavlov

    Igor Pavlov - 2023-10-30

    It can be difficult to implement.
    Also, there are probably many different types of corruption.
    I suppose there is no universal solution that handles every type of corruption.
    So I don't want to spend time implementing that feature if it will not solve many corruption cases.

    We don't want a situation where users mistakenly believe that they have good protection from corruption when in reality they still don't.
    There is no good criterion that shows how strong the protection is, because we don't know the distribution and types of all possible corruptions.

     
    • Mikola Akbal

      Mikola Akbal - 2023-10-30

      > It can be difficult to implement.
      > Also, there are probably many different types of corruption.

      You could look at how WinRAR does it. I always see the message "adding additional information" at the end of compression. It seems WinRAR puts this redundant information at the end of the file. At least that much. It's better than nothing.

      Are you thinking of a more difficult task? Distributing redundant information throughout the whole archive body? How to protect the archive against typical cases of information loss? h11p5g gave examples. My cases are usually corrupted downloaded archives. It seems the end of the archive is lost in these cases. So it looks like WinRAR chose a good solution.

      The idea of keeping the file table at the beginning of the archive looks sane.

      > So I don't want to spend time implementing that feature if it will not solve many corruption cases.

      It seems most cases are badly downloaded archives that have lost the end of the archive. If what is lost is WinRAR's redundant information, you still get the whole archive.

      It seems this redundant information even allows restoring data inside the main body. I remember RAID5 disk arrays that have this kind of restoring algorithm.
      Maybe it will give you ideas on how to implement it.

      > We don't want a situation where users mistakenly believe that they have good protection from corruption when in reality they still don't.

      This reminds me of a scene from the film "Bullet Train": "A bulletproof vest gives a false sense of security, because they can shoot you in the head." (So there's no point in wearing it...) "But it does stop you from getting hit in the chest. Though apparently you missed that episode of 'Thomas'."

      > There is no good criterion that shows how strong the protection is, because we don't know the distribution and types of all possible corruptions.

      It seems you are a scientist :) . 7z does compression very well. It seems to be the best in compression ratio. I use your library in one of my projects.

      My experience: it's usually badly downloaded archives that are missing the end of the archive. So it looks like WinRAR chose an empirically good solution.

      Second good idea: look into the redundancy algorithms of RAID5, because it protects any part of the disk. Maybe it will give you ideas.
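
For readers unfamiliar with the RAID5 idea mentioned here, a minimal sketch of single XOR parity follows. It is an illustration only, not how WinRAR or any archiver stores recovery data; the function names and the block size are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def split_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Split data into fixed-size blocks, zero-padding the last one."""
    padded = data + b"\x00" * (-len(data) % block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(blocks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Restore at most one lost block (marked None) using the parity block."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can restore only one lost block")
    if missing:
        survivors = [block for block in blocks if block is not None]
        blocks[missing[0]] = xor_blocks(survivors + [parity])
    return blocks


blocks = split_blocks(b"archive payload to protect", 8)
parity = xor_blocks(blocks)          # the redundant data that would be stored
damaged = list(blocks)
damaged[1] = None                    # one block is known to be lost
print(rebuild(damaged, parity) == blocks)
```

Note the limitation: single parity restores one block, and only when it is known which block was lost (as with a failed disk). It does not locate silently changed bytes, which is why tools such as MultiPar use Reed-Solomon codes together with checksums.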

       

      Last edit: Mikola Akbal 2023-10-30
      • Igor Pavlov

        Igor Pavlov - 2023-10-31

        > It seems most cases are badly downloaded archives that have lost the end of the archive. If what is lost is WinRAR's redundant information, you still get the whole archive.

        It's useless. A download error is a rare case.
        And on average it's more effective to redownload a smaller archive in case of an error than to always download a larger archive.
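
The trade-off in this argument can be put into numbers with a small sketch. It assumes a simple model that is not from the thread: every download attempt fails independently with the same probability, a failed download is repeated in full, and recovery data always repairs the damage. The sizes and probabilities below are illustrative, not measurements.

```python
def expected_bytes_with_retry(size: float, p_fail: float) -> float:
    """Expected bytes transferred when a failed download is repeated until it succeeds."""
    if not 0.0 <= p_fail < 1.0:
        raise ValueError("p_fail must be in [0, 1)")
    return size / (1.0 - p_fail)


def bytes_with_recovery(size: float, redundancy: float) -> float:
    """Bytes transferred when every download carries `redundancy` extra data."""
    return size * (1.0 + redundancy)


def break_even_failure_rate(redundancy: float) -> float:
    """Failure probability above which always shipping recovery data is cheaper."""
    return redundancy / (1.0 + redundancy)


size_mb = 1000.0  # illustrative archive size
print(expected_bytes_with_retry(size_mb, p_fail=0.01))
print(bytes_with_recovery(size_mb, redundancy=0.05))
print(break_even_failure_rate(0.05))
```

In this model, 5% recovery data only pays off for downloads once more than roughly 1 in 21 attempts fails. The model says nothing about archives that cannot be downloaded again, which is the case the other posters care about.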

         

        Last edit: Igor Pavlov 2023-10-31
    • Mikola Akbal

      Mikola Akbal - 2023-10-30

      If you want 7z to be widely used, you have to make the window interface more convenient, more ergonomic. Let WinRAR inspire you.

      Ergonomics is one of my talents. If you publish 7z with an improved GUI, I will give my feedback. At the moment it looks like a laboratory product for scientists :) . Users love simplicity.

      If you need ergonomics feedback, you may contact me by e-mail: mikola.akbal@gmail.com.

       

      Last edit: Mikola Akbal 2023-10-30
  • isidroco

    isidroco - 2023-10-31

    When I need reliability I use MultiPar with the desired redundancy. That workaround makes this a non-essential feature. What's infuriating is an essential feature that is missing: "extract only newer files" https://sourceforge.net/p/sevenzip/discussion/45797/thread/3d4d1bae/?limit=25#2edc

     
  • mdadm

    mdadm - 2023-11-12

    Some time ago I did some tests with WinRAR and rar4/rar5 archives containing this additional information for repairing/restoring archives. In a hex editor I modified parts of the archive, small (one or a few bits) or larger, in different places. Only one change at a time.

    The tests showed that even a WinRAR rar5 archive with a recovery record is not as reliable as one might think. It all depends on which part of the file is changed and how many bits. A rar4 archive tolerated changes of a little over a dozen kB, and a rar5 archive several dozen kB, but change a few bytes in sensitive parts of the file and you have problems. And with a bigger archive, how would you know which bits/bytes to change and to what values?
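
The "one change at a time" procedure can be scripted instead of done by hand in a hex editor. The sketch below is an assumption about how such a test could be organized, not the method actually used above. `flip_bits` and `survey` are hypothetical names, and the `survives` callback stands in for whatever repair or test command is being evaluated.

```python
import zlib
from typing import Callable, Dict, Iterable


def flip_bits(data: bytes, offset: int, count: int = 1) -> bytes:
    """Return a copy of data with `count` consecutive bits inverted, starting at byte `offset`."""
    damaged = bytearray(data)
    for bit in range(count):
        damaged[offset + bit // 8] ^= 1 << (bit % 8)
    return bytes(damaged)


def survey(data: bytes, offsets: Iterable[int],
           survives: Callable[[bytes], bool]) -> Dict[int, bool]:
    """Apply one single-bit change per offset and record whether the check passes."""
    return {offset: survives(flip_bits(data, offset)) for offset in offsets}


original = bytes(range(64))          # stand-in for an archive image
reference_crc = zlib.crc32(original)
# Placeholder check: a bare CRC comparison, which detects damage but repairs nothing.
results = survey(original, range(0, 64, 16),
                 lambda image: zlib.crc32(image) == reference_crc)
print(results)
```

With a real archive, the callback would write the damaged image to a temporary file and run the archiver's own test or repair on it, so the map shows which regions are sensitive.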

    Other compressed archive formats with recovery abilities and tools (also not ideal):
    .lz - lzip, lziprecover
    .bz2 - bzip2, bzip2recover (recovery works better on bigger archives with lots of small blocks, i.e. lower compression levels).

     

    Last edit: mdadm 2023-11-12
