Adding a 3rd parity destroyed the other two parities

Help
2016-09-04
2016-10-13
  • Valentin Hilbig

    Valentin Hilbig - 2016-09-04

    Please note:

    Thank you for SnapRAID! I am posting this report in the hope that it helps improve SnapRAID. I do not need help.

    This is just what I observed in my setup, so perhaps treat it as a bug report. But I think the FAQ definitely needs to explain better how to deal with the following (probably not very uncommon) situation:

    • You have 2 parity drives
    • The array is fully synced
    • The array is not fully scrubbed
    • You add a 3rd parity
    • You run: snapraid -F sync
    • While this runs one drive fails.
    • Also there is a silent corruption on a second drive (in the non-scrubbed part)

    The manual states:

    Note that in the process, you will be always protected, because the existing parity is not modified.

    I don't think this is correct, because due to the sync the 2 existing parities become partly unusable. Hence you are NOT fully protected until the parity is fully re-computed, as you must not interrupt the sync.

    In my case:

    • All drives are 100% perfectly ok at my side.
    • All data was (and is) 100% perfectly ok at my side.
    • No data added, no data removed, no data altered, so the array has the exact same state as before the parity was added.

    So all I need to do is to run the sync until the array is fully synced again. Which takes ages, but this is due to my setup here. And hopefully, nothing bad happens while this runs.

    Here is my observation:

    I decided to add the 3rd parity with the following config:

    block_size      1024
    autosave        100
    
    content         /keep/.conf/content0
    
    1-parity        /keep/.P1/parity1
    content         /keep/.P1/content1
    
    2-parity        /keep/.P2/parity2
    content         /keep/.P2/content2
    
    3-parity        /keep/.P3/parity3
    content         /keep/.P3/content3
    
    disk    yt1     /keep/yt1/yt1
    disk    yt2     /keep/yt2/yt2
    disk    yt3     /keep/yt3/yt3
    disk    yt4     /keep/yt4/yt4
    disk    yt5     /keep/yt5/yt5
    disk    yt6     /keep/yt6/yt6
    disk    yt7     /keep/yt7/yt7
    disk    yt8     /keep/yt8/yt8
    disk    yt9     /keep/yt9/yt9
    disk    yt10    /keep/yt10/yt10
    

    Then I did: snapraid sync

    It told me that I need -F, so I ran: snapraid -F sync

    Progress showed this takes several days. After running it a while I needed to interrupt it. (Unfortunately I have no log of this run, as I did it manually, because my scripts around snapraid do not support -F.)

    Then I ran a scrub to check some old data, which I also had to interrupt. I do this regularly and have never observed any harm from it.

    However this time the scrub reported errors. This is the log of the scrub:

    version:9.2-101-g7992ca1
    unixtime:1472994965
    time:2016-09-04 15:16:05
    command:scrub
    argv:0:snapraid
    argv:1:-v
    argv:2:-v
    argv:3:-l
    argv:4:/keep/.conf/log.scrub.20560
    argv:5:-p
    argv:6:10
    argv:7:scrub
    selftest:
    msg:progress: Self test...
    conf:file:/etc/snapraid.conf
    uuid:by-uuid:253:74:4371c46a-0064-4c98-bef8-044c7dee267e: found ../../dm-74
    uuid:by-uuid:253:72:d9c3f848-d2b7-415d-ad4b-764a56c617d8: found ../../dm-72
    uuid:by-uuid:253:79:c32186ce-0b82-4837-8361-c85fcc994030: found ../../dm-79
    uuid:by-uuid:253:76:1044a4ce-edc6-4de0-8fe3-e27e9c85dde2: found ../../dm-76
    uuid:by-uuid:253:69:ac4571df-05ca-42d3-93fd-edf1fb6d0825: found ../../dm-69
    uuid:by-uuid:253:77:56b4fe49-b00e-423f-aff4-aa3a88dc61cb: found ../../dm-77
    uuid:by-uuid:253:73:e41300f6-cc44-4c0c-83ec-f66530a16dea: found ../../dm-73
    uuid:by-uuid:253:67:0db3de5b-c079-4fc6-9c0d-03d113b0d1a5: found ../../dm-67
    uuid:by-uuid:253:70:b206ba47-adf2-437e-a6b8-ca724e2991b8: found ../../dm-70
    uuid:by-uuid:253:75:7f7ca776-c8b5-44ce-95b4-b37ad09fd1fb: found ../../dm-75
    blocksize:1048576
    data:yt1:/keep/yt1/yt1/
    data:yt2:/keep/yt2/yt2/
    data:yt3:/keep/yt3/yt3/
    data:yt4:/keep/yt4/yt4/
    data:yt5:/keep/yt5/yt5/
    data:yt6:/keep/yt6/yt6/
    data:yt7:/keep/yt7/yt7/
    data:yt8:/keep/yt8/yt8/
    data:yt9:/keep/yt9/yt9/
    data:yt10:/keep/yt10/yt10/
    mode:par3
    parity:/keep/.P1/parity1
    2-parity:/keep/.P2/parity2
    3-parity:/keep/.P3/parity3
    autosave:100000000000
    content:/keep/.conf/content0
    msg:progress: Loading state from /keep/.conf/content0...
    msg:verbose:   670504 files
    msg:verbose:        0 hardlinks
    msg:verbose:        5 symlinks
    msg:verbose:       51 empty dirs
    uuid:by-uuid:253:71:71dd89bc-809e-471a-a238-6c15acffbc47: found ../../dm-71
    uuid:by-uuid:253:68:c81715b0-2c74-4a3e-b0d3-6063f398ad2f: found ../../dm-68
    uuid:by-uuid:253:104:377fa249-1bbd-48f7-b129-1bd116a9b089: found ../../dm-104
    memory:used:1038497812
    memory:block:17
    memory:chunk:88
    memory:file:192
    memory:link:88
    memory:dir:80
    msg:progress: Using 990 MiB of memory for the FileSystem.
    msg:progress: Initializing...
    info_count:5187295
    info_time:1472073464:32168
    info_time:1472079944:807911
    info_time:1472158080:138905
    info_time:1472190752:121733
    info_time:1472244048:304619
    info_time:1472259280:261387
    info_time:1472512360:501141
    info_time:1472540512:706594
    info_time:1472623768:307965
    info_time:1472641800:86372
    info_time:1472673704:186635
    info_time:1472686512:67884
    info_time:1472754008:105950
    info_time:1472766008:174224
    info_time:1472840936:15338
    info_time:1472842920:121313
    info_time:1472848344:1447
    info_time:1472884520:141594
    info_time:1472903072:48806
    info_time:1472931592:1015660
    info_time:1472982296:3
    info_time:1472983176:3
    info_time:1472983456:10146
    info_time:1472993280:4682
    info_time:1472993528:1820
    info_time:1472994160:22995
    count_limit:518730
    time_limit:1472079944
    last_limit:486562
    msg:progress: Scrubbing...
    msg:progress: Using 128 MiB of memory for 8 blocks of IO cache.
    parity_error:1403579:parity: Data error, diff bits 4195609
    parity_error:1403579:2-parity: Data error, diff bits 4194090
    parity_error:1403579:3-parity: Data error, diff bits 4193621
    parity_error:1403580:parity: Data error, diff bits 4193228
    parity_error:1403580:2-parity: Data error, diff bits 4194518
    parity_error:1403580:3-parity: Data error, diff bits 4193371
    parity_error:1403581:parity: Data error, diff bits 4194959
    parity_error:1403581:2-parity: Data error, diff bits 4195543
    parity_error:1403581:3-parity: Data error, diff bits 4196127
    parity_error:1403582:parity: Data error, diff bits 4195335
    parity_error:1403582:2-parity: Data error, diff bits 4195089
    parity_error:1403582:3-parity: Data error, diff bits 4194808
    parity_error:1403583:parity: Data error, diff bits 4192132
    parity_error:1403583:2-parity: Data error, diff bits 4194947
    parity_error:1403583:3-parity: Data error, diff bits 4195469
    

    [Thousands of entries here skipped.]

    msg:fatal: 
    msg:fatal: Stopping for interruption at block 1406954
    sigint:1406954: SIGINT received
    msg:status: 
    msg:status:    10128 file errors
    msg:status:        0 io errors
    msg:status:        0 data errors
    msg:fatal: WARNING! Unexpected file errors!
    summary:error_file:10128
    summary:error_io:0
    summary:error_data:0
    summary:exit:error
    msg:progress: Saving state to /keep/.conf/content0...
    msg:progress: Saving state to /keep/.P1/content1...
    msg:progress: Saving state to /keep/.P2/content2...
    msg:progress: Saving state to /keep/.P3/content3...
    msg:verbose:   670504 files
    msg:verbose:        0 hardlinks
    msg:verbose:        5 symlinks
    msg:verbose:       51 empty dirs
    msg:progress: Verifying /keep/.conf/content0...
    msg:progress: Verifying /keep/.P1/content1...
    msg:progress: Verifying /keep/.P2/content2...
    msg:progress: Verifying /keep/.P3/content3...
    20160904-151802 result 0
    

    This means for me:

    I can no longer scrub until the array is fully synced again, as the parity is in a state which makes it (partly) unusable for scrub.
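To see which parity levels a scrub log like the one above complains about, the `parity_error` lines can be tallied per level. The following is a minimal sketch, assuming the line format shown in the log; the function name `summarize_parity_errors` is hypothetical and not part of SnapRAID. It also relates the "diff bits" figure to the block size (`blocksize:1048576` in the log, so 8388608 bits per block).

```python
import re
from collections import Counter

# Matches lines such as:
#   parity_error:1403579:2-parity: Data error, diff bits 4194090
PARITY_ERROR = re.compile(
    r"^\s*parity_error:(\d+):((?:\d+-)?parity): Data error, diff bits (\d+)\s*$"
)


def summarize_parity_errors(log_lines, block_size_bytes):
    """Return {level: (error_count, average share of differing bits)}."""
    bits_per_block = block_size_bytes * 8
    counts = Counter()
    diff_bits = Counter()
    for line in log_lines:
        match = PARITY_ERROR.match(line)
        if match is None:
            continue  # not a parity_error line
        _block, level, bits = match.groups()
        counts[level] += 1
        diff_bits[level] += int(bits)
    return {
        level: (counts[level], diff_bits[level] / (counts[level] * bits_per_block))
        for level in counts
    }


# Three lines copied from the log above, plus one unrelated line.
sample = [
    "parity_error:1403579:parity: Data error, diff bits 4195609",
    "parity_error:1403579:2-parity: Data error, diff bits 4194090",
    "parity_error:1403579:3-parity: Data error, diff bits 4193621",
    "msg:status:    10128 file errors",
]
summary = summarize_parity_errors(sample, block_size_bytes=1048576)
for level, (count, share) in summary.items():
    print(f"{level}: {count} error(s), {share:.1%} of bits differ")
```

For the sampled lines, all three parity levels appear, and about half of the bits in each block differ. That is roughly what two unrelated bit patterns would show, so it hints (without proving it) that the stored parity no longer corresponds to the data at all, rather than being slightly damaged.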

    This, however, means that I am not protected until the array is fully synced again, because of another observation:

    My data drives are connected over the network. However networks are unreliable. Hence from time to time some spurious read errors show up.

    Nothing to worry about if just a network switch was down for a while. So the solution to this is to do a retry later, when the network connectivity is restored.

    However, SnapRAID records all problems and marks the blocks bad, which is a good thing.
    In that case it is easy to recheck these false errors (at least I think this is the way to go) with snapraid -p bad scrub

    This always worked for me, the scrub never showed any errors.

    But in the current situation, where the parity is not fully rebuilt and the scrub is therefore unusable, I cannot do this!

    FYI snapraid status:

    Using 990 MiB of memory for the FileSystem.
    SnapRAID status report:
    

    [snip]

    The oldest block was scrubbed 10 days ago, the median 5, the newest 0.
    
    WARNING! The array is NOT fully synced.
    You have a sync in progress at 2%.
    The 89% of the array is not scrubbed.
    No file has a zero sub-second timestamp.
    No rehash is in progress or needed.
    No error detected.
    

    PS: My version is 9.2-101-g7992ca1 which is just 1 commit less than 10.0. Built locally from git, running on Debian Jessie 64bit. Devices are nonlocal with ext4 on top of it.

     
    Last edit: Valentin Hilbig 2016-09-04
  • Leifi Plomeros

    Leifi Plomeros - 2016-09-04

    I think the relevant question is whether or not you can run fix to restore a lost data disk when sync -F has been interrupted?

     
    • rubylaser

      rubylaser - 2016-09-04

      I would agree. This seems to be the issue to me here. I have upgraded arrays from two to three parities a couple of times with no issue.

       
    • Leifi Plomeros

      Leifi Plomeros - 2016-09-04

      I just did a little experiment to answer my own question.

      I created a test array with 2x10 GB data disks and 1 parity.
      I synced it completely.
      I added a second parity
      I ran snapraid sync -F and interrupted it (CTRL+C) at 52%
      I changed d1 to an empty folder on another disk in config.
      I ran snapraid fix -d d1

      Result: All files successfully restored on the new disk.

      I then tested what happened if I ran scrub -p 100 -o 0 to make a complete scrub in the 52% synced state.
      Snapraid now reported that I have thousands of file errors (no info about where they are located, since I did not use -v)

      I then ran sync and allowed it to complete.

      And finally I ran scrub -p 100 -o 0 again and did not find any errors.

      It's a bit unfortunate that I did not use -v when I tested scrub, but in any case it seems that snapraid can fix lost disks even if sync -F has been interrupted and that the data is indeed protected during the process.

      Edit: Forgot to mention. I used snapraid v11.0-beta-36-g56c129f for the test. But I don't think that there is any new functionality in v11, missing in v9.0, that is relevant to this test.

       
      Last edit: Leifi Plomeros 2016-09-04
      • Leifi Plomeros

        Leifi Plomeros - 2016-09-04

        I went back and redid the scrub test after an interrupted sync -F, this time with -v -l log.txt, to see where the scrub reported errors.
        All errors originated only from the new parity disk, starting at the block following the interruption.

        I guess that means that something bad has indeed happened to the original parity disks in the original post... But whatever it was, it does not seem to be the result of the interruption itself.

         
        Last edit: Leifi Plomeros 2016-09-04
  • Valentin Hilbig

    Valentin Hilbig - 2016-09-14

    Thanks and sorry that I cannot contribute more to this thing except finding issues ;)

    My scrub completed, too, I now have 3 parities and everything is back to normal.

    One observation now is that the fragmentation has changed dramatically. Before, the array had thousands of fragments; after adding the parity they dropped to 0. No fragments anymore on the old drives! Which makes me a bit nervous ;)

    This tells me (but I am not sure) that all the parities were completely recalculated. So perhaps the problem stems from a recalculation of the blocks (the mapping of files to blocks) due to sync -F. When the blocks are recalculated (which eliminates fragments), the parity blocks afterwards belong to a different piece of the files, hence the parity will be "corrupt".

    This, however, would mean that you cannot restore the original files. You will restore something else (with the correct checksums of the fragments in the files, but not the correct checksums of the complete file).

    So I am not convinced that only the scrub is problematic until sync -F has completed. I suspect that there is a hidden pitfall when files are fragmented and the fragments were "shifted".

    Can you perhaps check this?

    Create an array with 1 parity and 2 disks, several GB of data and several hundred files. Note that the files' content should come from /dev/urandom and definitely not from /dev/zero.
    Then, on one of the disks, alter the sizes of the files such that a high fragment count is observed once the array is fully synced.
    Add a 2nd parity, run sync -F, interrupt.

    Now drop the non-fragmented disk and try to repair it from parity and make sure (using md5sum etc.) that the data still is correct afterwards.

    Repeat the above, but this time drop the fragmented disk.

    Repeat the above, but now do fragments on both disks and drop one.

    If in all variants a failed disk can be recreated successfully and correctly, then we are probably safe.
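The "make sure the data is still correct" step of such a test could be automated along these lines. This is only a sketch under assumptions: the helper names (`make_random_files`, `checksum_manifest`, `differing_files`) are hypothetical, `os.urandom` stands in for /dev/urandom, the file sizes are tiny, and the actual drop and `snapraid fix -d d1` step is only marked by a comment.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def make_random_files(root, count, size_bytes):
    """Fill root with files of random (incompressible, non-zero) content."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    for index in range(count):
        (root / f"file{index:04d}.bin").write_bytes(os.urandom(size_bytes))


def checksum_manifest(root):
    """Map each file's path relative to root to its MD5 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.md5()
            with path.open("rb") as handle:
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def differing_files(before, after):
    """Names that are missing, extra, or have a different digest."""
    return sorted(
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )


with tempfile.TemporaryDirectory() as tmp:
    disk = Path(tmp) / "d1"
    make_random_files(disk, count=5, size_bytes=4096)
    before = checksum_manifest(disk)
    # A real test would drop the disk here and restore it with
    # "snapraid fix -d d1" before taking the second manifest.
    after_intact = checksum_manifest(disk)
    # Simulate a restore that produced wrong content for one file.
    (disk / "file0002.bin").write_bytes(os.urandom(4096))
    after_damaged = checksum_manifest(disk)

print(differing_files(before, after_intact))
print(differing_files(before, after_damaged))
```

Taking the manifest before the disk is dropped and comparing it after the restore checks whole-file checksums, which is exactly the property the suspicion above is about.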

    Another observation:

    The array claims to be 100% not scrubbed after adding the parity, because there is "nothing to do" as there are no old blocks (everything is new to it).

    This is not what I would expect when adding a parity (please do not take this as criticism, just as a personal suggestion). Here is what I would expect:

    The new parity is created and filled. For this the complete array must be re-read, of course. But nothing else is changed. So the fragments are not re-balanced.

    While the parity is being added, the blocks should be scrubbed using the existing parities, to find silent corruption before a block is added to the new parity. This means that after the parity add finishes, the array comes out 100% scrubbed (instead of 0% scrubbed).

    Lessons learned, at least for me:

    When adding a 4th parity, I will probably buy 4 drives and copy the content files to a new SnapRAID array with 4 parities over the same data disks. Once the new array is finished, so that both arrays are synced and scrubbed, I will drop the old array with just the 3 parities. Then I will add its 3 parity drives as 3 more data disks to the new SnapRAID array.

    Hopefully this is solved by the time I need 5 parities, because adding 4 more data disks this way is a bit too much for my taste ;)

    Thanks!
    I wish I could help, but currently I do not have the capacity to do so.

    -Tino
    Edits: typos

     
    Last edit: Valentin Hilbig 2016-09-14
  • Andrea Mazzoleni

    Hi Valentin,

    Yep. What you found is strange. The "sync -F" is not expected to reorganize files inside the parity, and the fragmentation is not expected to change.

    I'll check it, but not next week. I'll be on holiday :)

    Ciao,
    Andrea

     
  • Valentin Hilbig

    Valentin Hilbig - 2016-10-13

    For the benefit of the readers:

    Long story short:

    Be careful when using "snapraid -F sync" with SnapRAID 10 and earlier.
    However, the problem reported here is fixed in the next version of SnapRAID.

    Full story:

    Looking at GitHub I found a fix for what was reported here:

    https://github.com/amadvance/snapraid/commit/b1fc75adab7f76db1c3c7e37bc29dc9be798fd0a#diff-5f90db49575df97c22c4bc7e4b3921b8L2019

    What impresses me is how small this fix looks. It not only fixes -F, it also introduces a new option -R which deliberately does what I observed, because removing fragmentation is a useful option to have.

    Many thanks for this fix!

    For those not able to understand the code, here is how I understand what happened (please correct me if I'm wrong):

    In SnapRAID 10, option -F recalculates the parity in a way that also eliminates fragmentation. I am not sure whether it does this only when adding an additional parity (as in my case) or whenever -F is used. The problem is that while this recalculation runs, we do not have full parity protection. Please note that, AFAICS, silent data corruption can still be detected with -F, thanks to the hashes. However, it is likely that you cannot repair this corruption, even with enough parities, because most of the parity is in an unusable state (the mapping of blocks to parities has changed).

    Losing the ability to repair is obviously not what we want when adding an additional parity.

    However, having a way to eliminate fragmentation is a good thing, too. It is better than removing the parity files altogether and then re-creating them, because in that case we lose precious hash information which protects against silent data corruption (there is a second copy in the content files, but you need to "check" them, which takes very long).

    So now, with the next version of SnapRAID (11?), we will have both:

    Option -F, which just "fixes" parity (as it ought to do) but keeps the mapping of blocks the same. This is the correct way to add more parity drives. It will not change fragmentation.

    A new option -R, which does a complete parity recomputation. This also eliminates fragmentation while detecting silent data corruption. However, you lose the redundancy while doing this, so you cannot repair the silent data corruption other than by restoring the file from backup.

    Some words about "fragmentation".

    If you are on Microsoft Windows you are used to "defragging" your drives frequently. "Fragmentation" in SnapRAID is something completely different. It tells you that the parity data is not allocated linearly, following the linear order of the files. Fragmentation slows down some operations, but with all the caching behind the scenes you probably will not notice that at all. I am not sure, but there might be some additional storage needed due to fragmentation; even if so, this will be small compared to the huge size of everything else. There is no real need to "defragment" a SnapRAID array, even if you have 10k fragments. So you will probably never need option -R and never have to worry about your redundancy.

    Please note that you will only see fragmentation if you delete files, such that SnapRAID needs to fill "holes" in the mapping from files to parity. Correctly deleting files in SnapRAID currently needs quite some discipline and should not be taken lightly!

    Deletion should be done as follows:

    • Only do deletions on a single data drive at a time. If you need to delete on more than one drive, do this in multiple cycles, one drive at a time.
    • Move all files to be deleted to a special holding area on the same drive, an area which is not seen by SnapRAID.
    • snapraid sync
    • Make sure that "snapraid status" and "snapraid diff" do not find anything. If paranoid, use "snapraid -d DRIVE -a check", and if very paranoid use "snapraid -p full scrub" to be sure everything is OK.
    • Delete the files in the holding area but only if everything is OK.

    However: If something breaks somewhere, use option -i to specify this special holding area while doing repairs, unless the failed drive is the one with the special holding area of course. ;)

    If you do not follow this recipe, you might lose as many levels of redundancy as you have drives with deleted files. You might be lucky and the corruption can still be corrected, because it happens not to need the deleted files. But you cannot be sure, and as usual, Murphy's Law will apply.
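The "move to a holding area, delete later" part of the recipe above could look roughly like this. It is a sketch under assumptions: the directory name `.holding` and the helpers `move_to_holding` and `purge_holding` are hypothetical, and the holding directory would additionally have to be hidden from SnapRAID in its configuration, which is not shown here. `os.replace` is used because, within one filesystem, it renames the file without copying its data.

```python
import os
import shutil
import tempfile
from pathlib import Path


def move_to_holding(drive_root, relative_paths, holding_name=".holding"):
    """Move files into a holding directory on the same drive, keeping their layout."""
    drive_root = Path(drive_root)
    holding_root = drive_root / holding_name
    moved = []
    for relative in relative_paths:
        source = drive_root / relative
        target = holding_root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        os.replace(source, target)  # rename within the same filesystem
        moved.append(target)
    return moved


def purge_holding(drive_root, holding_name=".holding"):
    """Final step: really delete the held files, only once everything is OK."""
    holding_root = Path(drive_root) / holding_name
    if holding_root.exists():
        shutil.rmtree(holding_root)


with tempfile.TemporaryDirectory() as tmp:
    drive = Path(tmp) / "yt1"
    (drive / "videos").mkdir(parents=True)
    (drive / "videos" / "old.bin").write_bytes(b"obsolete")
    (drive / "videos" / "keep.bin").write_bytes(b"wanted")

    moved = move_to_holding(drive, ["videos/old.bin"])
    still_visible = sorted(p.name for p in (drive / "videos").iterdir())
    held_content = moved[0].read_bytes()
    # "snapraid sync" and the status/diff checks would run here.
    purge_holding(drive)
    holding_exists = (drive / ".holding").exists()

print(still_visible, held_content, holding_exists)
```

Until `purge_holding` runs, the old file content is still on the drive, which is what keeps it available for a repair started in between.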

    My way to handle deletes is to add extra redundancy (one more parity than I really need) and use this as the special automatic "holding area". This way I can delete files (on a single drive only) and then immediately do a "snapraid sync".

    This even can be an automated "delete" script:

    • Sort the list of files to delete by the drives the files are on.
    • As long as the list is not empty:
      • Choose a drive on the list
      • Delete all files on the list which are on this drive
        • remove the removed files from the list as well
      • snapraid sync

    Note that you can even extend the list while this deletion algorithm is running. And as long as you atomically add files to the devices you can even do this while "snapraid sync" runs. The only thing NOT to do, ever, is to alter EXISTING files while "snapraid sync" runs. You will see some very strange messages then.
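The delete loop described above can be sketched as follows. The function `delete_in_cycles` and its callback parameters are hypothetical; deleting and syncing are passed in as callables so that the sketch runs without touching real files or calling SnapRAID. The list is regrouped by drive at the start of every cycle, so entries appended in the meantime are picked up.

```python
from collections import defaultdict


def delete_in_cycles(pending, drive_of, delete_file, run_sync):
    """Delete files one drive at a time, syncing after each drive.

    pending     -- mutable list of paths still to delete (may grow while running)
    drive_of    -- callable mapping a path to the name of its data drive
    delete_file -- callable that deletes one path
    run_sync    -- callable that runs the sync and returns when it is done
    """
    cycles = []
    while pending:
        by_drive = defaultdict(list)
        for path in pending:
            by_drive[drive_of(path)].append(path)
        drive = sorted(by_drive)[0]  # choose one drive for this cycle
        for path in by_drive[drive]:
            delete_file(path)
            pending.remove(path)
        run_sync()  # never touch a second drive before the sync
        cycles.append((drive, by_drive[drive]))
    return cycles


# Dry run with stand-in callbacks instead of real deletes and syncs.
deleted, syncs = [], []
pending = ["/keep/yt2/yt2/b.bin", "/keep/yt1/yt1/a.bin", "/keep/yt2/yt2/c.bin"]
cycles = delete_in_cycles(
    pending,
    drive_of=lambda path: path.split("/")[2],
    delete_file=deleted.append,
    run_sync=lambda: syncs.append("sync"),
)
print(cycles)
```

In real use, `delete_file` could be `os.remove` and `run_sync` could be something like `lambda: subprocess.run(["snapraid", "sync"], check=True)`, so that a failed sync stops the loop before the next drive is touched.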

     
