Understanding scrub finish and status screen. I really love this program and its abilities, whether using the limited GUI ("but helpful!!!") or using commands. But here's what really eludes me. In my case:
5x 4TB Data drives 18.2TB
2x 4TB Parity drives 7.28TB
How does all the data fit on those two parity drives?
On to the next question and the reasoning behind the thread.
Understanding the information from finishing a scrub and the information from status. Thanks everyone for their input in advance! :-D
To understand the concept of parity it is easiest to look at a 3-disk setup with 2 data disks and a single parity disk, as in RAID4.
On each data disk all data is stored as a long sequence of 0s and 1s.
Like this:
Disk1  Disk2
  0      0
  0      1
  1      0
  1      1
Then you add a parity disk which uses an XOR ruleset to determine whether the parity should be 1 or 0, like this: 0+0=0, 0+1=1, 1+0=1, 1+1=0
Disk1  Disk2  ParityDisk
  0      0        0
  0      1        1
  1      0        1
  1      1        0
At this point if someone removes Disk1, Disk2 or Parity you can use the XOR ruleset to figure out what is missing.
Disk1  Disk2  ParityDisk
  0      x        0
  0      x        1
  1      x        1
  1      x        0
In the first row the missing number must be 0, since 0+0=0 holds while 0+1=0 would not.
In the next row the missing number must be 1, since 0+1=1 holds while 0+0=1 would not.
So all you need to do is repeat this logic about a million times and you have restored 128 kilobytes of missing data.
You can expand this example to have hundreds of data disks. As long as you lose only a single disk, you can always figure out the lost numbers with the help of the parity numbers.
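The repeated XOR logic above can be sketched in a few lines of Python. This is an illustration of the principle only, not SnapRAID's actual code:

```python
# Illustration of single-parity XOR recovery (not SnapRAID's actual code).

def compute_parity(disks):
    """XOR the corresponding bytes of every data disk to build the parity disk."""
    parity = bytearray(len(disks[0]))
    for disk in disks:
        for i, byte in enumerate(disk):
            parity[i] ^= byte
    return bytes(parity)

def recover_disk(surviving_disks, parity):
    """Rebuild a single lost data disk: XOR parity with all surviving disks."""
    lost = bytearray(parity)
    for disk in surviving_disks:
        for i, byte in enumerate(disk):
            lost[i] ^= byte
    return bytes(lost)

disk1 = bytes([0b0011, 0b1010])  # made-up contents
disk2 = bytes([0b0101, 0b1100])
parity = compute_parity([disk1, disk2])

# "Remove" disk1 and rebuild it from disk2 plus the parity disk.
assert recover_disk([disk2], parity) == disk1
```

The same recovery works with any number of data disks, as long as only one of them is lost.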
For double parity it is not so simple. Instead of looking at all data as 0 or 1, you look at it as numbers from 0 to 255 and use a much more complicated ruleset than 0+0=0, 0+1=1, 1+0=1, 1+1=0 to figure out what is missing.
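For the curious, that "more complicated ruleset" can be sketched like this: one parity byte (P) is the plain XOR, and a second (Q) is a weighted XOR over the finite field GF(256). Two independent equations let you solve for two unknown bytes. This is a bare-bones illustration of the RAID6-style math, not SnapRAID's real implementation:

```python
# Bare-bones illustration of RAID6-style double parity over GF(256).
# Shows the principle only; SnapRAID's actual implementation differs.

def gf_mul(a, b, poly=0x11D):
    """Multiply two bytes in GF(2^8) modulo the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a):
    """Multiplicative inverse: a^254 == a^-1, since a^255 == 1 in GF(2^8)."""
    return gf_pow(a, 254)

def double_parity(data):
    """P is the plain XOR; Q is an XOR weighted by powers of 2 in GF(2^8)."""
    p = q = 0
    for k, byte in enumerate(data):
        p ^= byte
        q ^= gf_mul(gf_pow(2, k), byte)
    return p, q

def recover_two(data, i, j, p, q):
    """Solve the two parity equations for the two lost bytes at i and j."""
    px, qx = p, q
    for k, byte in enumerate(data):
        if k in (i, j):
            continue  # skip the lost positions
        px ^= byte
        qx ^= gf_mul(gf_pow(2, k), byte)
    # Now px = d_i ^ d_j and qx = g^i*d_i ^ g^j*d_j; eliminate d_j:
    gi, gj = gf_pow(2, i), gf_pow(2, j)
    d_i = gf_mul(qx ^ gf_mul(gj, px), gf_inv(gi ^ gj))
    d_j = px ^ d_i
    return d_i, d_j

# Four data "disks", one byte each; lose disks 1 and 3 and rebuild them.
data = [5, 200, 33, 17]
p, q = double_parity(data)
assert recover_two([5, None, 33, None], 1, 3, p, q) == (200, 17)
```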
As an end user you don't have to understand exactly how it is done (I don't). Instead, the important part is to understand that as long as you have at least as many parity disks as lost disks it will be possible for snapraid to figure out what is missing, BUT if you lose more disks than you have parity disks, none of the lost data can be reconstructed.
What scrub does is look at a portion of the data and parity and verify that nothing is wrong. If it finds something wrong it will let you know and recommend that you use the fix function to repair it.
So basically the important thing to look for in the scrub result is the [Everything OK] message. If you don't have that message near the end, there will instead be error messages that you need to read.
The line d1 58% lets you know that d1 was the slowest disk: for 58% of the scrub, the other disks had to wait for d1 to complete before they could continue.
In the status message you mostly find statistical information, such as how full your disks are, how much parity "waste" you have on each disk, and how long ago you last scrubbed, plus a warning message if there are errors left over from a previously aborted operation.
The waste concept is not very intuitive, so I will not try to explain it in detail right now. You only need to understand that the parity file needs to be a tiny bit larger than the amount of data on any single data disk. Which translates into: you need to leave a little bit of free space on any data disk that is the same size as the parity disks. The waste column in status gives you an indication of how much free space is needed on each disk.
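In other words, the fullest single data disk sets the minimum parity size. With made-up example numbers:

```python
# Made-up numbers: the parity disk must be able to hold slightly more than
# the data on the fullest single data disk, whichever disk that is.
data_used_tb = [3.2, 2.8, 3.9, 1.5, 3.0]  # hypothetical usage per data disk
parity_disk_tb = 4.0

required_parity_tb = max(data_used_tb)  # the fullest data disk drives parity size
assert parity_disk_tb > required_parity_tb  # room to spare, so sync can succeed
```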
Last edit: Leifi Plomeros 2016-03-20
Sorry for not responding to your AWESOME post sooner. But I'll need to get back to you, because it seems my waste column has just shot up to 6TB. And this is after balancing the drives a bit more and deleting unwanted videos and pictures from the pool. I'm going to research and see if I did something wrong.
6 TB sounds extreme.
Typically you want to avoid doing exactly what you just described.
Preferably you should delete files by first moving them to the trashcan (or a folder outside the array), then sync, and when the sync is complete permanently delete the files.
When moving from one disk to another it is safest to first copy from one disk to the other, sync, delete, and then sync again, and try to do it one disk at a time.
If you just move the files and sync, and the target disk dies during the sync, you may end up in a situation where the files are removed from parity on the source disk but not yet synced to parity on the target disk (which is dead), resulting in not being able to recover the moved files even if you have several parity disks.
Since you have already moved things around, the second best option is to sync using the option -h, which will at least verify that the target files are OK before removing them from parity. But it takes twice as long to sync with that option.
Again, so thankful for this awesome community! So do you think I should just not sync or do anything with the Drivepool till everything is done, and then redo the snapraid?
By the way, when Stablebit Drivepool combines all the drives, the main directory has only one SnapRAID.content. Will that affect anything?
You shouldn't have to redo the snapraid.
Just make sure to disable the relevant balancing options in drivepool so that it does not move files around on its own from disk to disk.
And then run snapraid sync -h
As long as you have defined the individual data disks in the snapraid.conf file, it makes no difference that drivepool shows the contents as one big disk.
I'm going to piggyback off of this a little and ask if running sync -h all the time is a viable option. The reason I ask is that I appear to have a similar setup to the OP, however I delete files all the time, replacing them with better quality versions of the same media. Since I'm not moving files to an off-array Recycle Bin before syncing/deleting, should I be running the -h option every night then?
What are the pluses and minuses of the -h option?
I'm assuming that using sync -h would cause more wear and tear on the drives. But I'm also thinking that's better than having to do the
"Preferably you should delete files by first moving them to the trashcan (or a folder outside the array), then sync, and when the sync is complete permanently delete the files.
When moving from one disk to another it is safest to first copy from one disk to the other, sync, delete, and then sync again, and try to do it one disk at a time."
If you are not worried about something bad happening during the sync, then you can skip the option -h.
Rebuilding the entire array from scratch would also cause significant wear on the drives.
So just a few more things to ask and I think I might be fine.
On the scrub finish (and most likely the sync finish too):
After the d0-X entries there's parity %, 2-parity %, raid %, hash %, sched %, and misc %.
Could anyone please explain these? I got the gist of the d0-X % meaning from the first post. So please, if nobody minds, I would like to be educated :-) Andrea, if this was in the manual, sorry, I thought I read it all :-S
-Aaron
Last edit: 1Geekyp3rson 2016-03-22
They are just relative statistics that can help you identify whether you have a bottleneck caused by something other than the speed of the data disks or parity disks.
As long as they are low, they are not very interesting at all.
http://postimg.org/image/cckcc51j3
http://postimg.org/image/e8llrtobv
Last edit: 1Geekyp3rson 2016-03-20
Thanks again for the information Leifi!