Hi,
This was just the tool I was looking for :-). I didn't want to wipe and start again, most of the data is large video files, and my drives are different sizes, so this looks like just the option for me. I rsync-backup "important" files, and was resigned to losing my VIDEO rips and having to re-rip them, or wait for recorded shows to repeat. But this looks like the perfect compromise between no protection and full mirroring/RAID.
I have a reasonably large setup: 5 drives built up over the years, 1.5/2/2/3/4 TB.
I have 2.5 TB of VIDEO, which is pretty static, and 3.5 TB of recorded TV (MythTV), which will be added to and removed from at a rate of, say, 10-20 files a week.
Question to those that know:
Is there a best practice to reduce the sync workload? For the parity computation, does it make sense to load-balance the VIDEO across the drives and then sync, so that there is a static relationship, with all the VIDEO parity checked against itself and stable? That is, shift the existing VIDEO around to ~0.6 TB per disk and then sync, before adding the TV into the sync?
Similarly, is it best practice to keep the changing TV on one drive, or to spread it amongst the disks?
Or if I set the 3 or 4 TB drive as the parity drive, aim to fill the 1.5/2/2 TB drives, and then only change the "rest" of the 3 or 4 TB drive, is that the best strategy?
If I use, say, the 4 TB drive as a parity + unprotected data drive, do I partition it to stop the parity file being "fragmented"?
The flexibility is fantastic, and I love the fact that all of these would "work", but I am just looking for best-practice advice.
Thanks
There is very little you can do to force SnapRAID to share parity among certain files. Sure, you can do the initial sync the way you described, and that would make the video files share parity only amongst each other.
But when you add a single new movie two weeks down the road, there is no way for you to control which files it will share parity with. Or when you later add an episode of a TV series, you can't ensure that it will not share parity with some movie.
As time goes by you will surely end up with internal SnapRAID parity fragmentation, but I don't think I have ever seen anyone report fragmentation high enough to have a noticeable impact on anything.
As for the benefits of workload balancing: does it really make any difference if the daily/weekly/random sync takes 3, 6, 12 or 20 minutes? For reading purposes it is unlikely that you will ever benefit from files being spread out, as the files will be read in sequence.
I would recommend that you make things as easy as possible by doing almost the opposite of what you had in mind.
Put the parity file on the 4 TB disk
Put the video on the 3 TB disk
Put the movies on the 2+2 TB disks
Use the 1.5 TB disk (or a small part of it) as a temp-disk which is not protected by snapraid.
On the temp disk you put downloads, files that might be modified, and other things which you expect to have a short life span and a high probability of being deleted (and which don't really need any protection at all). On the other disks you put everything that you care about and expect to keep for a long time, and therefore want to be protected.
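For anyone who wants to see what that layout could look like in a config file, here is a rough Python sketch that prints a snapraid.conf for it. The mount points, disk names and the `render_snapraid_conf` helper are all made up for illustration. The `parity`, `content`, `data` and `exclude` directives are the ones from the SnapRAID manual, though as far as I know older releases spelled `data` as `disk`, so check the manual for your version.

```python
def render_snapraid_conf(parity_file, content_files, data_disks, excludes=()):
    """Build snapraid.conf text (hypothetical helper, not part of SnapRAID)."""
    lines = [f"parity {parity_file}"]
    lines += [f"content {path}" for path in content_files]
    # Assumption: recent releases use "data"; older ones may expect "disk".
    lines += [f"data {name} {mount}" for name, mount in data_disks.items()]
    lines += [f"exclude {pattern}" for pattern in excludes]
    return "\n".join(lines) + "\n"


# Hypothetical mount points. The 1.5 TB temp disk is simply not listed,
# so it is not part of the array and gets no protection.
conf = render_snapraid_conf(
    parity_file="/mnt/disk4tb/snapraid.parity",
    content_files=[
        "/var/snapraid/snapraid.content",
        "/mnt/disk3tb/snapraid.content",
    ],
    data_disks={
        "d1": "/mnt/disk3tb/",
        "d2": "/mnt/disk2tb_a/",
        "d3": "/mnt/disk2tb_b/",
    },
    excludes=["*.tmp", "/lost+found/"],
)
print(conf)
```

Leaving the temp disk out of the file is the whole trick: anything not listed as a data disk is ignored by sync.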
Last edit: Leifi Plomeros 2015-07-26
Hi, thanks for the advice. "Just let it do its job" works for me :-).
I wanted to avoid doing something that led to unnecessary disk thrashing. Rather than having 2+2+3 TB protected by 3 TB of the 4 TB drive (so 7 TB data, 3 TB parity and 2.5 TB unprotected), I might go for 1.5+2+2 TB plus 2 TB of the 3 TB drive, protected by 2 TB on the 4 TB drive. That gives me 7.5 TB protected + 3 TB unprotected, but with 4 data disks rather than 3. Is that just a matter of preference?
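The numbers for the two layouts can be tallied with a few lines of Python. This is only a sketch of the arithmetic in the post; the dictionary layout and the `summarize` helper are hypothetical names, and all sizes are in TB.

```python
def summarize(layout):
    """Return (protected, parity, unprotected) totals in TB."""
    return (sum(layout["data"]), layout["parity"], sum(layout["unprotected"]))


# Option 1: 2+2+3 as data, 3 TB of the 4 TB drive as parity.
# Unprotected: the spare 1 TB on the 4 TB drive plus the whole 1.5 TB drive.
option_1 = {"data": [2, 2, 3], "parity": 3, "unprotected": [1, 1.5]}

# Option 2: 1.5+2+2 plus 2 TB of the 3 TB drive as data, 2 TB of the
# 4 TB drive as parity. Unprotected: 1 TB (3 TB drive) + 2 TB (4 TB drive).
option_2 = {"data": [1.5, 2, 2, 2], "parity": 2, "unprotected": [1, 2]}

for name, layout in (("option 1", option_1), ("option 2", option_2)):
    protected, parity, unprotected = summarize(layout)
    print(
        f"{name}: {protected} TB protected, {parity} TB parity, "
        f"{unprotected} TB unprotected, {len(layout['data'])} data disks"
    )
```

Both options keep the parity space at least as large as the biggest protected area (3 TB and 2 TB respectively).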
Yes, you can do it any way you prefer.
Just put the parity file on a separate partition so that it does not become fragmented at the file system level.
Also make sure that the partition where the parity file resides is at least the same size as the largest data disk (preferably make the parity partition a tiny bit larger).
Personally I would have done it like this:
Merge/span/stripe/RAID 0 the 1.5 TB + 3 TB disks into one big volume.
Do the same for the 2x2 TB disks to get one 4 TB volume.
Result:
Data1: 4 TB Partition on the 4.5 TB volume
Data2: 4 TB from the 2x2 TB disks
Parity: 4 TB single disk
Unprotected: 0.5 TB Partition on the 4.5 TB volume
8 TB protected data, 0.5 TB unprotected and 4 TB parity.
Pretty much anything works as long as you remember to have parity at least the same size as the largest data disk.
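That rule of thumb is easy to check before committing to a layout. A minimal Python sketch, using the merged-volume numbers from this post (the function and variable names are hypothetical, sizes are in TB):

```python
def parity_is_large_enough(parity_tb, data_tb):
    """True when parity is at least as large as the largest data disk."""
    return parity_tb >= max(data_tb)


# Merged layout: two 4 TB data volumes protected by one 4 TB parity disk.
data_volumes = {"Data1": 4.0, "Data2": 4.0}
parity_tb = 4.0

print(parity_is_large_enough(parity_tb, data_volumes.values()))  # True
print(sum(data_volumes.values()))  # total protected data in TB

# A 3 TB parity partition would be too small for the same data volumes.
print(parity_is_large_enough(3.0, data_volumes.values()))  # False
```

Note that the check only looks at the largest single data disk, not at the total, which is why 4 TB of parity can cover 8 TB of data here.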
Thanks again, I'll go and read about options to Merge/span/stripe/raid0 :-)
For the moment I will go with your original suggestion, but thanks for the clarification on a separate partition for the parity file; that's the sort of thing I was worried about.
Last edit: John Reid 2015-07-26