OK, I've searched high and low for information on how to do this, but after two whole days I admit defeat and need to ask for help. I am new to both SnapRAID and Ubuntu, so forgive me if this post seems to be written by someone who just got a computer, but I need to be assured that my thinking is correct before I do something that will corrupt everything.
The situation:
I've just finished building a new computer that will serve as an all-round media and backup server for my whole family. I have 6x2TB + 3x1TB drives (+ 1x120GB SSD as OS drive).
I want to make these two "collections" into two separate arrays like this:
Array 1:
2TB - parity
2TB - parity2
2TB - content
2TB - content... and so on
(this will be expanded in the future, hence two parities)
Array 2:
1TB - parity
1TB - content
1TB - content
The 2TB array will be loaded with movies and other heavy files that will not change over time and will be on a slow sync schedule.
The 1TB array will contain a lot of smaller files, like music and documents, which are prone to change more often than movie files, and will have a more frequent sync schedule.
If I understand the documentation correctly, I have to create a configuration file called snapraid.conf and place it in the "/etc/" folder. This file holds information about the array: which drives are part of it and where they are mounted.
When I run the sync command for the first time, SnapRAID will place a ".parity" file on the parity drives. This file is just a bunch of data it needs to be able to rebuild the other drives in case of failure.
So far so good, but this is where I become uncertain of what is going on:
The program will also scatter a ".content"-file to the places of your choosing (you need a couple of these so you don't lose the only copy in case the drive it was placed on dies). This ".content"-file is some sort of index of all the files stored on each drive. This is here for snapraid to be able to find the original location of a file during rebuilds and for snapraid to easily find changes in the content of the disks and update the parity accordingly.
This sync command works because snapraid is programmed to look inside /etc/ for a file named "snapraid.conf" which holds the information of the array (which I just made according to my desires).
Now, my problem is to figure out how to make snapraid handle two separate arrays because I can't place two "snapraid.conf"-files in the same location and I don't think the program would do anything with a differently named file. Furthermore, I lack the programming skills to be able to realize how to make a single ".conf"-file that can handle two arrays.
However, the ".conf"-file for my first array would be:
parity /media/disk1-1/snapraid.parity
2-parity /media/disk1-2/snapraid.parity
content /media/disk1-1/snapraid.content
content /media/disk1-2/snapraid.content
content /media/disk1-3/snapraid.content
content /media/disk1-4/snapraid.content
content /media/disk1-5/snapraid.content
content /media/disk1-6/snapraid.content
disk Disk1-3 /media/disk1-3/
disk Disk1-4 /media/disk1-4/
disk Disk1-5 /media/disk1-5/
disk Disk1-6 /media/disk1-6/
nohidden
- and then just add a lot of extensions that you don't want to sync
Now I come to the internet for help because it seems two separate arrays is not something people do, they just bunch everything together to make one big array, but I feel that is not suitable for my setup.
This is the only thing I found about multiple setups/arrays but that didn't really help me as a quite inexperienced user:
https://sourceforge.net/p/snapraid/discussion/1677233/thread/7660517f/
I appreciate any help I can get, even if it is just saying "double arrays can't be done but you did a good job of understanding how to set up a single one"
It's doable. I have even operated both banks simultaneously - the parity drives are ATA over Ethernet.
Your plan looks very similar to my current setup. I just shrugged and used USB flash drives for our content disks. Before that I was using a lot of network shares scattered all over the house.
The flash drives gave me a way to go really crazy with content files should I choose. 24 sockets on this board. Lotsa copies keep stuff safe....
Just make two separate config files. Name them whatever you like. Then, whenever you run snapraid, pass it the -c config-filename option with the name of whichever of the two config files you want it to use.
By the way, you may want to rethink your second array if it has documents that change frequently. That is generally not a good thing with snapshot RAID.
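The two-config approach described above can be sketched roughly as follows. The file names /etc/snapraid-media.conf and /etc/snapraid-docs.conf are assumptions made for illustration, not names from this thread, and the run_array helper is hypothetical. The script is a dry run by default and only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch: one config file per array, selected with snapraid's -c option.
# The two file names below are hypothetical; any path works.
MEDIA_CONF=/etc/snapraid-media.conf
DOCS_CONF=/etc/snapraid-docs.conf

# Dry run by default: the commands are only printed.
# Run with SNAPRAID=snapraid in the environment to execute them for real.
SNAPRAID=${SNAPRAID:-echo snapraid}

# Run one snapraid command (sync, status, ...) against one array.
# $1 = config file, $2 = snapraid command
run_array() {
    $SNAPRAID -c "$1" "$2"
}

run_array "$MEDIA_CONF" sync
run_array "$DOCS_CONF" sync
```

Each config file lists only the parity, content and data disks of its own array, so the two arrays stay fully independent of each other.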
Oh, thank you so much! Your short answer suddenly made all of the pieces come together, finally! To think the solution was that simple makes me ashamed I didn't figure out that little option earlier... It goes to show that you should really read the instructions one more time before going scouting on the internet. Really fast response as well, I am impressed!
Yeah, I know it isn't ideal to keep a lot of frequently changed files on SnapRAID. But if by "frequent" I mean placing the documents you have been working on during the day on the array for easy access and backup, is it still a very bad idea? How would you define "frequently" changed: twice a day, or once a week?
It depends. "Frequently changed" usually means frequent in comparison to how often you run sync. If the files get changed every hour, but you only run sync once a day, then some of your data will be at risk during the day.
The main thing to keep in mind is that if any files that were part of the last sync are modified, then not only are those files at risk, but an equivalent amount of data on the other data drives is also at risk until you run sync. Adding new data does not put data on other data drives at risk. But modifying files (or replacing files with changed versions of the same name) does put other data drives at risk.
Last edit: jwill42 2014-10-30
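The point about modified files putting other drives at risk can be illustrated with a toy model. This is only a sketch under a strong simplification: each "disk" holds a single small number and parity is a plain XOR of the two, whereas real SnapRAID works on blocks of files. The variable names are made up for the example.

```shell
#!/bin/sh
# Toy model: each "disk" holds one small number, parity is their XOR.
d1=5
d2=9
parity=$(( d1 ^ d2 ))            # computed at sync time

# Case 1: d2 fails right after the sync; d1 is unchanged.
recovered_ok=$(( d1 ^ parity ))  # equals the original d2

# Case 2: d1 is modified after the sync, then d2 fails before the next sync.
d1=7
recovered_bad=$(( d1 ^ parity )) # no longer equals the original d2

echo "original d2=$d2 recovered_ok=$recovered_ok recovered_bad=$recovered_bad"
```

In the second case the stale parity no longer matches the changed d1, so the data rebuilt for d2 is wrong even though d2 itself was never touched. A newly added file does not cause this, because it occupies space that the old parity did not cover.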
Solid advice, I probably have to rethink that second array then. Maybe the easiest thing to do is make them into a RAID 1 with the motherboard controller and have the last drive as a spare for when one breaks (2TB for the second array was a bit overkill either way).
Or set your sync script to run more frequently, say every hour?
Maybe. I suppose the syncs are fast enough for those small files that I could run them every hour?
An additional alternative would be to go triple parity.
Then you would be able to lose 2 disks and still restore, even if you have made unlimited changes to a single data disk. The "trick" is to keep all the changes (modify/delete) limited to a single data disk.
If the "frequently changed data disk" fails, you would however lose all changes made to that disk since the last sync.
Syncing every hour is most likely not a problem from SnapRAID's perspective, as you can sync a lot of gigabytes in an hour... But it might be a performance bottleneck while it is ongoing, which might make it more attractive to do at night when the family is supposed to be sleeping anyway.
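For the scheduling discussed above, here is a minimal sketch of a sync wrapper that could be called from cron once per array. The script path, config file names, lock location and the sync_once helper are all assumptions for illustration. It skips a run if the previous sync of the same array has not finished yet, and it is a dry run unless SNAPRAID=snapraid is set.

```shell
#!/bin/sh
# Sketch: sync one array unless a previous sync of it is still running.
# Hypothetical crontab entries (hourly for documents, weekly at night for media):
#   0 * * * *  SNAPRAID=snapraid /usr/local/bin/snapraid-sync.sh /etc/snapraid-docs.conf
#   0 3 * * 0  SNAPRAID=snapraid /usr/local/bin/snapraid-sync.sh /etc/snapraid-media.conf

# Dry run by default: the command is only printed.
SNAPRAID=${SNAPRAID:-echo snapraid}
LOCK_ROOT=${LOCK_ROOT:-/tmp}

# $1 = config file of the array to sync
sync_once() {
    conf=$1
    lock="$LOCK_ROOT/$(basename "$conf").lock"
    # mkdir is atomic: it fails if the lock directory already exists.
    if ! mkdir "$lock" 2>/dev/null; then
        echo "sync for $conf already running, skipping"
        return 0
    fi
    $SNAPRAID -c "$conf" sync
    status=$?
    rmdir "$lock"
    return $status
}

# Only act when a config file was given on the command line.
if [ $# -gt 0 ]; then
    sync_once "$1"
fi
```

Using one lock per config file lets the two arrays sync at the same time while still preventing overlapping runs on the same array.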