
Understanding scrub finish and status screen

Help
2016-03-20
2016-03-27
  • 1Geekyp3rson

    1Geekyp3rson - 2016-03-20

    Understanding scrub finish and status screen. I really love this program and the abilities it has, whether using the limited (but helpful!) GUI or using commands. But here is what really eludes me. In my case:

    5x 4TB Data drives 18.2TB

    2x 4TB Parity drives 7.28TB

    How does all the data fit on those two parity drives?

    On to the next question, and the reason for this thread.

    I'd like to understand the information shown when a scrub finishes and the information from status. Thanks, everyone, for your input in advance! :-D

    http://postimg.org/image/cckcc51j3

    http://postimg.org/image/e8llrtobv

     

    Last edit: 1Geekyp3rson 2016-03-20
  • Leifi Plomeros

    Leifi Plomeros - 2016-03-20

    To understand the concept of parity, it is easiest to look at a 3-disk setup with 2 data disks and a single parity disk in RAID4.

    On each data disk all data is stored as a long sequence of 0 and 1.
    Like this:

    Disk1 Disk2
    0 0
    0 1
    1 0
    1 1

    Then you add a parity disk which uses an XOR ruleset to determine whether the parity should be 1 or 0, like this: 0+0=0, 0+1=1, 1+0=1, 1+1=0

    Disk1 Disk2 ParityDisk
    0 0 0
    0 1 1
    1 0 1
    1 1 0

    At this point if someone removes Disk1, Disk2 or Parity you can use the XOR ruleset to figure out what is missing.

    Disk1 Disk2 ParityDisk
    0 x 0
    0 x 1
    1 x 1
    1 x 0

    In the first row the missing number must be 0 since 0+0=0 would be correct and 0+1=0 would be incorrect.
    In the next row the missing number must be 1 since 0+1=1 would be correct and 0+0=1 would be incorrect.
    So all you need to do is repeat this logic about a million times (once per bit) and you have restored roughly 128 kilobytes of missing data.

    You can expand this example to have hundreds of data disks. As long as you only lose a single disk, you can always figure out the lost numbers with the help of the parity number.
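
    As a minimal sketch of the single-parity idea above (the function names xor_parity and rebuild_missing are made up for illustration and are not part of snapraid), the XOR ruleset and the rebuild of a missing disk could look like this in Python:

```python
from functools import reduce


def xor_parity(disks):
    """Return the parity column: XOR of the bits at each position across all disks."""
    return [reduce(lambda a, b: a ^ b, bits) for bits in zip(*disks)]


def rebuild_missing(surviving_disks, parity):
    """Recover a single lost disk by XOR-ing the parity with every surviving disk."""
    return xor_parity(surviving_disks + [parity])


# The same four rows as in the tables above.
disk1 = [0, 0, 1, 1]
disk2 = [0, 1, 0, 1]

parity = xor_parity([disk1, disk2])
print(parity)  # [0, 1, 1, 0]

# Pretend Disk2 is gone and rebuild it from Disk1 and the parity.
recovered = rebuild_missing([disk1], parity)
print(recovered == disk2)  # True
```

    The same two functions work unchanged for any number of data disks, as long as only one of them is missing.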

    For double parity it is not so simple. Instead of looking at all data as 0 or 1 you look at it as numbers from 0 - 255 and use a much more complicated ruleset than 0+0=0, 0+1=1, 1+0=1, 1+1=0 to figure out what is missing.

    As an end user you don't have to understand exactly how it is done (I don't). Instead, the important part is to understand that as long as you have at least as many parity disks as lost disks, it will be possible for snapraid to figure out what is missing, BUT if you lose more disks than you have parity disks, none of the lost data can be reconstructed.

    What scrub does is look at a portion of the data and parity and verify that nothing is wrong. If it finds something wrong, it will let you know and recommend that you use the fix function to fix it.

    So basically the important thing to look for in the scrub result is the [Everything OK] message. If you don't have that message near the end, then instead there would be error messages that you need to read.

    The line d1 58% lets you know that d1 was the slowest disk. For 58% of the scrub, the other disks had to wait for d1 to complete before they could continue.

    In the status message you find mostly statistical information such as how full your disks are, how much parity "waste" you have on each disk and how long ago it was since you scrubbed things, and a warning message if there are some errors from a previously aborted operation.

    The waste concept is not very intuitive, so I will not try to explain it in detail right now. Instead, you only need to understand that the parity file needs to be a tiny bit larger than the amount of data on any single data disk. This can be translated into: you need to leave a little bit of free space on any data disk which is the same size as the parity disks. The waste column in status gives you an indication of how much free space is needed on each disk.
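
    To make the sizing rule concrete, here is a rough sketch. The per-disk usage figures are invented for illustration, and the function ignores the small overhead that makes the real parity file "a tiny bit larger":

```python
def min_parity_size_tb(used_per_data_disk_tb):
    """Approximate lower bound on the parity file size, in TB.

    Parity has to cover the fullest single data disk, not the sum of all disks.
    """
    return max(used_per_data_disk_tb)


# Hypothetical usage on five 4TB data disks (made-up numbers).
used = [3.2, 3.0, 2.8, 3.4, 3.1]

print(min_parity_size_tb(used))  # 3.4
```

    This is also why all the data "fits" on the parity disks: each parity disk only has to be as large as the biggest data disk, no matter how many data disks there are.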

     

    Last edit: Leifi Plomeros 2016-03-20
  • 1Geekyp3rson

    1Geekyp3rson - 2016-03-21

    Sorry for not responding to your AWESOME post. But I'll need to get back to you, because it seems my waste column has just shot up to 6TB. And this is after balancing the drives a bit more and deleting unwanted videos and pictures from the pool. I'm going to research and see if I did something wrong.

     
  • Leifi Plomeros

    Leifi Plomeros - 2016-03-21

    6 TB sounds extreme.
    Typically you want to avoid doing exactly what you just described.

    Preferably you should delete files by first moving them to the trashcan (or a folder outside the array), then sync, and when the sync is complete permanently delete the files.

    When moving files from one disk to another, it is safest to first copy them to the target disk, sync, delete the originals, and then sync again, and to try to do it one disk at a time.

    If you just move the files and sync, and the target disk dies during sync you may end up in a situation where the files are removed from parity on the source disk and not yet synced to parity on the target disk (which is dead), resulting in not being able to recover the moved files even if you have several parity disks.

    Since you have already moved files around, the second-best option is to sync using the option -h, which will at least verify that the target files are OK before removing them from parity. But a sync takes about twice as long with that option.
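
    The copy, sync, delete, sync sequence described above could be scripted roughly like this. It is only a sketch: safe_move is a made-up helper name, it assumes snapraid is on the PATH, and the run parameter exists only so the commands can be swapped out when trying the script without touching a real array.

```python
import os
import shutil
import subprocess


def safe_move(src, dst, run=subprocess.run):
    """Move one file between data disks without leaving it unprotected.

    Assumes `snapraid` is on the PATH; `run` is injectable for dry runs.
    """
    shutil.copy2(src, dst)                 # 1. copy to the target disk
    run(["snapraid", "sync"], check=True)  # 2. parity now covers both copies
    os.remove(src)                         # 3. delete the original
    run(["snapraid", "sync"], check=True)  # 4. drop the deleted file from parity


# Example call (hypothetical paths):
# safe_move("/mnt/disk1/video.mkv", "/mnt/disk2/video.mkv")
```

    In practice you would copy a whole batch of files before each sync rather than syncing once per file.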

     
  • 1Geekyp3rson

    1Geekyp3rson - 2016-03-21

    Again, so thankful for this awesome community! So do you think I should just not sync or do anything with the DrivePool until everything is done, then redo the snapraid?

    By the way, when StableBit DrivePool combines all the drives, the main directory has only one SnapRAID.content. Will that affect anything?

     
  • Leifi Plomeros

    Leifi Plomeros - 2016-03-21

    You shouldn't have to redo the snapraid.
    Just make sure to disable the relevant balancing options in DrivePool so that it does not move files around on its own from disk to disk.
    And then run snapraid sync -h
    As long as you have defined the individual data disks in snapraid.conf file it makes no difference that drivepool shows the contents as one big disk.

     
  • WarmongerX

    WarmongerX - 2016-03-21

    I'm going to piggyback off of this a little and ask whether sync -h is a viable option to run all the time. The reason I ask is that I appear to have a similar setup to the OP, but I replace files all the time with better-quality versions of the same media. Since I'm not moving files to an off-array Recycle Bin before syncing/deleting, should I be running the -h option every night?

    What are the pluses and minuses of the -h option?

     
    • 1Geekyp3rson

      1Geekyp3rson - 2016-03-22

      I'm assuming that using sync -h would cause more wear and tear on the drives. But I'm also wondering the same thing, as an alternative to having to do this:

      "Preferably you should delete files by first moving them to trashcan (or a folder outside the array), then sync and when sync is complete permanently delete the files.
      When moving from one disk to another it is safest to first copy from one disk to the other, sync and delete and then sync again, and try to do it one disk at a time."

       
      • Leifi Plomeros

        Leifi Plomeros - 2016-03-22

        If you are not worried about something bad happening during the sync, then you can skip the option -h.
        Rebuilding the entire array from scratch would also cause significant wear on the drives.

         
  • 1Geekyp3rson

    1Geekyp3rson - 2016-03-22

    So just a few more things to ask and I think I might be fine.

    On the scrub finish (and most likely the sync finish too):
    After the d0-X entries there's parity %, 2-parity %, raid %, hash %, sched %, and misc %.

    Could anyone please explain these? I got the gist of what the d0-X % means from the first reply. So please, if nobody minds, I would like to be educated :-) Andrea, if this was in the manual, sorry, I thought I had read it all :-S

    -Aaron

     

    Last edit: 1Geekyp3rson 2016-03-22
  • Leifi Plomeros

    Leifi Plomeros - 2016-03-25

    They are just relative statistics that can help you identify whether you have a bottleneck caused by something other than the speed of the data disks or parity disks.
    As long as they are low, they are not very interesting at all.

     
  • 1Geekyp3rson

    1Geekyp3rson - 2016-03-27

    Thanks again for the information Leifi!

     
