I am trying to perform my very first snapraid sync and I am having trouble:
Failed to grow parity file '/mnt/P01/snapraid.parity' to size 4347130806272 using fallocate due lack of space.
I am confused as to why that is, since I have 24 4TB data disks and 4 4TB parity disks, so they are the same size. I read the FAQ and followed the advice to format all four of my parity drives using "mkfs.ext4 -m 0 -T largefile4 DEVICE" so that the parity drives have a bit more capacity than my data drives, so I would expect the parity to fit no problem.

Interestingly enough, the first files it seems to have trouble with are on D05 -- that's the disk the first "outofparity" messages complain about. All in all, it complains about files on D05, D08, D09, D11, D12, and D14 with "outofparity" messages. It complains about a ton of files -- enough to generate 45 MB of logs. For some reason it doesn't see any problems with files on any of the other disks, which seems odd to me.

My config file is as follows:
parity /mnt/P01/snapraid.parity
2-parity /mnt/P02/snapraid.parity
3-parity /mnt/P03/snapraid.parity
4-parity /mnt/P04/snapraid.parity
content /mnt/D21/snapraid.content
content /mnt/D22/snapraid.content
content /mnt/D23/snapraid.content
content /mnt/D24/snapraid.content
content /srv/snapraid/snapraid.content
data d01 /mnt/D01/
data d02 /mnt/D02/
data d03 /mnt/D03/
data d04 /mnt/D04/
data d05 /mnt/D05/
data d06 /mnt/D06/
data d07 /mnt/D07/
data d08 /mnt/D08/
data d09 /mnt/D09/
data d10 /mnt/D10/
data d11 /mnt/D11/
data d12 /mnt/D12/
data d13 /mnt/D13/
data d14 /mnt/D14/
data d15 /mnt/D15/
data d16 /mnt/D16/
data d17 /mnt/D17/
data d18 /mnt/D18/
data d19 /mnt/D19/
data d20 /mnt/D20/
data d21 /mnt/D21/
data d22 /mnt/D22/
data d23 /mnt/D23/
data d24 /mnt/D24/
exclude /lost+found/
From the manual:

For each file, even one of only a few bytes, a whole block of parity is allocated, and with many files this may result in a lot of unused parity space. And when you completely fill the parity disk, you are not allowed to add more files to the data disks. The wasted parity doesn't add up across data disks, though: wasted space resulting from a high number of files on a data disk limits only the amount of data on that disk, not on others.
As an approximation, you can assume that half of the block size is wasted for each file. For example, with 100,000 files and a 256 KiB block size, you are going to waste about 13 GB of parity, which may translate to 13 GB less space available on the data disk.
Specific examples in your setup:

Example d01: 11,325 files × 0.25 MiB / 2 = ~1,415 MiB of free parity space needed.
Example d09: 2,273,172 files × 0.25 MiB / 2 = ~284,146 MiB of free parity space needed.
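The per-disk arithmetic above can be sketched as a small calculation. This is a rough estimate only: the 256 KiB figure is SnapRAID's default block size, and the file counts are the ones quoted in this thread.

```python
# Rough estimate of parity space lost to per-file block rounding:
# on average, each file wastes half a parity block.
BLOCK_SIZE_KIB = 256  # SnapRAID's default block size

def wasted_parity_mib(file_count, block_size_kib=BLOCK_SIZE_KIB):
    """Average parity waste in MiB for a given number of files."""
    return file_count * (block_size_kib / 2) / 1024

print(int(wasted_parity_mib(11_325)))      # d01: ~1415 MiB
print(int(wasted_parity_mib(2_273_172)))   # d09: ~284146 MiB
```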
Ah! Thanks a bunch for your help. This is a very critical piece of information -- the fact that it's tucked away in the "BlockSize" setting documentation seems unfortunate to me. The "Getting Started" section neglects to mention anything about wasted parity space, and since other RAID mechanisms operate on the expectation that parity drives don't need to be larger than data drives, I think it would be very valuable to include this information in the "Getting Started" section. I also see value in adding it to the FAQ under "Why in 'sync' do I get the error 'Failed to grow parity file 'xxx' to size xxx due lack of space.'?", since the info there right now does not mention this either. It seems to imply that you don't need any additional space beyond the size of the data drive.

Unfortunately this means that four of these 4TB drives I bought for my server are essentially "useless" -- now I need to buy four 5TB drives. I just wish I had known this before. Crossing my fingers that the old clunker I have these in even supports >4TB drives (I had to fiddle with the BIOS, HBA card BIOS, drivers, etc. just to get it to support >2TB).
Thanks again.
Last edit: Codicus Maximus 2016-08-02
Wouldn't it make more sense to leave free space on the data disks?
Sure, ~300 GiB of free space on the 2-million-plus-files disk is a lot, but it is also much less than the unused space you will end up with on the parity disks...
I am using mhddfs to pool the drives together, so leaving free space on a drive is not an option -- any free space will get used. I could set its "mlimit" to 300 GB to keep that much space free; however, that would keep that much free space on ALL 24 drives, equaling a 7,200 GB loss.
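The trade-off in plain numbers (assuming, as described above, that mhddfs's mlimit reserve effectively applies to every pooled drive):

```python
# Pool-wide cost of reserving headroom on every branch, sized for
# the worst disk (the ~2.3M-file one needs roughly 300 GB).
drives = 24
mlimit_gb = 300
pool_loss_gb = drives * mlimit_gb
print(pool_loss_gb)  # 7200 GB of pooled capacity kept free
```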
One solution, granted it's fiddlier than one would like, would be to exclude some files/folders from snapraid on the disks that are close to full and have a large number of files (i.e. you don't need to keep the space free; you can use it, just without snapraid protection).

Apart from things you maybe just don't care about (for example, I have a lot of Linux ISOs and a few Kiwix versions of Wikipedia for offline use -- they've saved the day many times when my internet was down; no point in protecting those too much, since they get obsolete anyway, and if I lose them I just download the next current version when I need it), you might have things that SHOULD NOT be included in snapraid anyway, because they change: virtual machines, temporary download folders, or any other files that are expected to change any time soon.
Yeah, that's what I'm doing for now -- a combination of removing backed-up folders from SnapRAID and rebalancing some files between drives in the mhddfs pool. Ideally I would like it to be "set it and forget it" and not have to worry about the fiddliness or how many files are on each drive, especially since I'm using mhddfs to pool all these volumes together; however, it saves me from having to buy more hard drives for now, heh.
Regarding files which change -- correct me if I'm wrong, but from a data integrity standpoint, files which are excluded from SnapRAID but which still reside on any of the array's drives may still prevent other files from being restored if the excluded files change. In other words, files which change (such as your mentioned temporary downloads folder) should not reside on any of the drives configured as "data" in the config at all, rather than simply being covered by an "exclude" statement.
Nope -- files outside of, or excluded from, the snapraid array have no effect on the files inside it.
To make sure you are not overfilling disks, you could put X GB of junk files in an excluded folder on each disk. You'd still need to figure out roughly how many files you plan to put on each disk in order to put an appropriate amount of junk there, though.
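That junk-file reservation could be sketched like this. The helper and the path are hypothetical, not from the thread; note that os.posix_fallocate is Unix-only and actually allocates blocks, unlike a sparse truncate, which would not keep the space free.

```python
import os

def reserve_space(path, size_bytes):
    """Pre-allocate a placeholder file so the pooling layer can't fill
    the disk past the headroom that parity block rounding needs.
    Put it in a folder that snapraid is configured to exclude."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size_bytes)  # allocates real blocks (Unix)
    finally:
        os.close(fd)

# e.g. reserve ~284 GiB on the 2.3M-file disk (path is illustrative):
# reserve_space("/mnt/D09/excluded/reserve.bin", 284 * 1024**3)
```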