Few technical questions

Help
kretos
2018-10-17
2018-10-18
  • kretos

    kretos - 2018-10-17

    Hi everyone.
    I'm about to start using snapraid, but I have a few doubts that weren't cleared up by reading the manual and FAQ or by searching the forum (honestly, maybe this was answered before, but there are 60 pages to look through...).

    So here are my questions:

    1. How exactly are files from different data drives matched to compute parity? Since snapraid supports split parity drives, is it not based on cluster (sector) number? Or maybe it is, and snapraid adds an offset for split parity (the position on the second parity drive is the number shifted by the size of the first drive)?
      Example:
      D1 and D2 are data drives; P is two smaller parity drives, where 1 represents the first parity drive and 2 the second. The dots are only there to align the picture.
      D1......:...... xxxxxxxxxxxxxxxxxxxxxa
      D2......:...... yyyyyyyyyyyyyyyyyyyyyb
      P1&2: .....11111111112222222222
      Now both parity drives together cover the whole data drives, but how does snapraid find where to put the parity value for a & b (the last files on the data drives)?
    2. Connected question:
      If a data drive is physically larger (not divided into partitions) but only a portion of its data (smaller than the sum of the parity) is protected (n below is not protected), will it work?
      D1.....:...nnnnxxxxxxxxxxxxxxxxx
      D2.....:...............yyyyyyyyyyyyyyyyy
      P1&2:..............1111112222222222
    3. What happens when I move a file around on a data disk? The path changes but not the file name or content. Will parity be recalculated?
    4. What portion of the data is recalculated if a file is changed? Is there any overhead here? What will be recalculated if I change a small file on the first disk while the second disk holds a huge file, and both share parity information?
    5. Data disk: would it be possible to add a few protected folders on a single data drive (on one line, as in parity definitions), like:
      data d1 F:\movies;F:\music
      I don't want to use pooling, and I don't want to use an anonymous folder like array. I know about "include", but this would be easier to read.
    6. Parity file size: I think it would be useful to be able to set a maximum parity file size, in case someone changes the parity disk. Copying a dozen 100 GB files can be easier than copying a single multi-TB file. It would also be more flexible: one could replace a single parity drive with a few smaller ones.
    7. Parity: the manual says "These files are read and written by the "sync" and "fix" commands". Why does "fix" write parity? Is that an error in the manual?

    Thanks in advance for your answers.

     

    Last edit: kretos 2018-10-17
  • Leifi Plomeros

    Leifi Plomeros - 2018-10-17
    1. I think your illustration is close enough. In the config you can define blocksize (default 256 KiB). This determines the size of each "block" in the parity file. All protected files on each data disk are virtually chopped up into one or more chunks of that size, and each chunk is related to a parity block. The content file keeps track of these relationships, along with file names, last-modified date, inode and so on.

    2. Yes.

    3. If you move and/or rename a file on the same data disk, snapraid will recognize it by inode, size and last-modified time and just update the content file with the new location. If you move the file to another disk, it will be treated as removed from the original disk and added to the other disk (twice as much parity needs to be updated). If snapraid recognizes the file from the original disk, it will show up as copied and removed instead of added and removed.

    4. All parity blocks protecting the data of that file. Even if you modify only a single bit the file will get a new last modified time and snapraid will have to recalculate parity for all blocks related to the entire file. There is however no chain effect. Files from other disks will be read partially as needed to update the parity blocks.

    5. Nope. You need to use include or exclude options in that case.

    6. I agree. It would have been good if parity was automatically divided into smaller parts. But there are pitfalls related to that. Not cool if it is divided into 100 GB chunks and you end up with 1-99 GB free on each parity disk. So it would need fixed-size chunks that also become dynamic when the disk is almost full. I suspect Andrea simply prefers a less complicated design. As a workaround, if you anticipate a future need to use smaller disks for parity, you could set up multiple partitions on the parity disks and use split parity. The files can later be moved to a single partition/disk or spread out over multiple smaller disks.

    7. How else is fix going to be able to fix corrupted data in the parity disk? :-)
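
    To make answer 1 a bit more concrete, here is a minimal Python sketch of the idea: files on a data disk are chopped into fixed-size blocks, each block gets a position, and a position in a split parity is found by walking through the parity files in order. This is a simplified model under stated assumptions, not SnapRAID's actual allocator; the function names `assign_blocks` and `locate_in_split_parity`, the file names, and the parity capacities are all made up for illustration.

```python
BLOCK_SIZE = 256 * 1024  # default blocksize from the thread (256 KiB)


def assign_blocks(files, block_size=BLOCK_SIZE):
    """Give each file on ONE data disk a run of consecutive block positions.

    files: list of (name, size_in_bytes) tuples.
    Returns {name: range_of_positions}. Assumption: positions are handed out
    in order with no gaps, which a real allocator need not do.
    """
    layout = {}
    next_pos = 0
    for name, size in files:
        count = -(-size // block_size)  # ceiling division
        layout[name] = range(next_pos, next_pos + count)
        next_pos += count
    return layout


def locate_in_split_parity(position, parity_file_blocks):
    """Map a parity block position to (parity file index, offset in that file).

    parity_file_blocks: capacity in blocks of each split parity file, in order.
    """
    for index, capacity in enumerate(parity_file_blocks):
        if position < capacity:
            return index, position
        position -= capacity  # skip past this parity file
    raise ValueError("position beyond total parity capacity")


# Hypothetical disk: one big file of 21 blocks followed by a small file "a".
layout = assign_blocks([("x.mkv", 21 * BLOCK_SIZE), ("a.txt", 1000)])
print(layout["a.txt"])                         # range(21, 22)
# Two split parity files holding 10 and 12 blocks respectively.
print(locate_in_split_parity(21, [10, 12]))    # (1, 11)
```

    In this model the small file "a" from the question ends up in the second parity file simply because its block position is past the capacity of the first one. Nothing depends on where the file physically sits on the data disk.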

     

    Last edit: Leifi Plomeros 2018-10-17
  • kretos

    kretos - 2018-10-17

    Thank you very much for your explanation!

    About 2&3 - great design! I thought that snapraid used the physical structure of the HDD and would require parity bigger than the data disk even if inclusion and exclusion are used and only part of the data should be protected. Good to know that it uses its own logical mapping.

    About 5 - more a suggestion or wish than a question. Is there any thread where improvements can be suggested?

    About 6 - I was thinking of a maximum size, so there would be no small gaps - something like WinRAR parts, where the last one is usually smaller. Good hint about partitions.

    About 7 - I thought that sync reads data and writes parity, and fix does exactly the opposite - reads parity and data and writes the missing data.

    Once again - thanks for quick answer.

     
  • Leifi Plomeros

    Leifi Plomeros - 2018-10-17

    Well... You still need a larger parity file than the amount of data. A 1 KiB file will need a whole 256 KiB parity block and a 350 KiB file will need two blocks. Typically just leave a few GB of unused space on each data drive if it is the same size as the parity disk. If it is smaller you don't have to worry about it.
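
    The block arithmetic above can be checked with a few lines of Python. This is only a sketch of the rounding rule described in this post (a partly filled block still costs a full block), assuming the default 256 KiB blocksize; `parity_blocks` and `parity_footprint_kib` are illustrative names, not part of snapraid.

```python
import math

BLOCK_SIZE_KIB = 256  # default blocksize mentioned earlier in the thread


def parity_blocks(file_size_kib, block_size_kib=BLOCK_SIZE_KIB):
    """Whole blocks needed to cover one file; a partial block counts as one."""
    return math.ceil(file_size_kib / block_size_kib)


def parity_footprint_kib(file_sizes_kib, block_size_kib=BLOCK_SIZE_KIB):
    """Parity space (KiB) taken by one disk's files under this simple model."""
    total_blocks = sum(parity_blocks(s, block_size_kib) for s in file_sizes_kib)
    return total_blocks * block_size_kib


print(parity_blocks(1))                 # 1
print(parity_blocks(350))               # 2
print(parity_footprint_kib([1, 350]))   # 768
```

    So 351 KiB of data takes 768 KiB of parity in this example, which is why many tiny files inflate the parity much more than a few large ones.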

    Yes, sync only reads from data disks and writes to parity disks, but fix can restore/repair data on both data disks and parity disks.
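
    As an illustration of why fix can go in both directions, here is a small single-parity sketch using plain XOR. It is a toy model, not snapraid's code: the three byte strings stand in for one block from each of three hypothetical data disks, and `xor_blocks` is a helper made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One block from each of three hypothetical data disks (all the same length).
d1 = b"movie-data"
d2 = b"music-data"
d3 = b"photo-data"

# Roughly what a sync does: read the data, write the parity.
parity = xor_blocks([d1, d2, d3])

# Case 1: the block on d2 is lost. Rebuild it from the other data plus parity.
rebuilt_d2 = xor_blocks([d1, d3, parity])
print(rebuilt_d2 == d2)          # True

# Case 2: the parity block is damaged. Rebuild it from the intact data.
rebuilt_parity = xor_blocks([d1, d2, d3])
print(rebuilt_parity == parity)  # True
```

    Either side can be recomputed from the other, so a repair command has a reason to write to parity as well as to data.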

     
  • kretos

    kretos - 2018-10-18

    Thanks once more. I'm aware of the parity demands. I always leave free space on the HDD to allow proper defrag, and I will only put big files there, so it shouldn't be a problem. I tried to hash my image files, but I've got almost 2 million of them - the content file was over 1.2 GB and crawling through the directories took almost 20 minutes. So I've decided to just make a simple copy on an external HDD. I can afford 3TB for images but not 15TB for all my files:)

     
  • Leifi Plomeros

    Leifi Plomeros - 2018-10-18

    An alternative to having an extra backup disk for images could be to replace the image folders with zip archives. I assume 99.9% of the images are static, and browsing zip files is pretty much no different from browsing folders in Windows. But it all depends on the use case and whether or not your indexing software is compatible, etc. So not necessarily the best option for everyone.

     
