Hey all,

Currently have a 7-disk, parity-protected SnapRAID array with all disks formatted ext4. I'm curious whether there are any benefits to using a CoW filesystem such as ZFS or Btrfs instead of ext4. To my knowledge, ZFS at least (I'm unsure about Btrfs) would provide built-in checksumming of the data, which adds an extra layer of protection. Of course, I wouldn't get the other benefits of ZFS, but I'd be interested to find out whether a more modern filesystem brings any advantages.
Good question; I'll be interested to hear what others say. Just my thoughts: SnapRAID provides checksumming and scrubbing as well. Without any redundancy, ZFS will be able to identify checksum errors, but won't have another copy to correct them with. ZFS will work, but I just don't see the point. I'd either use ZFS as it was intended (in a pool) or use SnapRAID with ext4-backed disks. Just my two cents.
You're right. I hadn't considered that without copies=2 (or more), a single-disk pool can detect checksum errors but can't self-heal them. I believe you could still leverage a few other ZFS features (snapshots, dedup, compression).
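For example, on a hypothetical single-disk pool named disk1, those are just property changes (the pool and snapshot names here are made up):

# Enable cheap lz4 compression for all new writes
zfs set compression=lz4 disk1
# Keep two copies of each block so ZFS can self-heal corruption even on a
# single disk (at the cost of usable space)
zfs set copies=2 disk1
# Take a read-only, point-in-time snapshot (e.g. before a SnapRAID sync)
zfs snapshot disk1@pre-sync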
Here's what I'm thinking (maybe I'm over-complicating things), assuming you've installed the excellent zfswatcher service for Linux; a rough command sketch follows the list below.
Each disk is part of its own single-disk zpool (let's call each zpool disk1..diskn, or pdisk1..pdisk6 for parity)
Each zpool is mounted at its own mount point
The SnapRAID conf file points to each disk's zpool mount point.
ZFS can share "hot spares" across different pools, so you could assign one disk as a hot spare for all your pools (and if you're really into scripting, you may be able to automate the fix/repair in SnapRAID)
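Here's a minimal sketch of that layout, assuming hypothetical device ids and mount points (in snapraid.conf, data disks are declared with the "disk" keyword):

# One single-disk pool per data disk, mounted where SnapRAID expects it
zpool create -m /mnt/disk1 disk1 /dev/disk/by-id/ata-EXAMPLE-1
zpool create -m /mnt/disk2 disk2 /dev/disk/by-id/ata-EXAMPLE-2
zpool create -m /mnt/pdisk1 pdisk1 /dev/disk/by-id/ata-EXAMPLE-P
# The same disk can be attached as a hot spare to every pool
zpool add disk1 spare /dev/disk/by-id/ata-EXAMPLE-S
zpool add disk2 spare /dev/disk/by-id/ata-EXAMPLE-S
# snapraid.conf then just references the mount points:
# parity /mnt/pdisk1/snapraid.parity
# content /mnt/disk1/snapraid.content
# disk d1 /mnt/disk1/
# disk d2 /mnt/disk2/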
In the event a disk were to suddenly "drop out", zfswatcher would send an e-mail that your zpool is degraded. Then you would:
1. Remove the offending disk and add a new one (or, if using the hot spare, the hot spare kicks in and you can skip to #3)
2. Replace the disk in the zpool: zpool replace disk1 <failed_device_id> <new_device_id>
3. Fix the disk with SnapRAID: snapraid -d disk1 -l repair.log fix
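Steps 2 and 3 would be easy to wrap in a script if you wanted to automate the repair; a bare-bones sketch, with the pool name and device ids as placeholders:

#!/bin/bash
# Usage: replace-and-fix.sh <pool> <failed_device_id> <new_device_id>
# Hypothetical helper: swap the device in the zpool, then have SnapRAID
# rebuild that disk's files from parity.
set -e
POOL="$1"; FAILED="$2"; NEW="$3"
zpool replace "$POOL" "$FAILED" "$NEW"
snapraid -d "$POOL" -l repair.log fix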
I think this could be an interesting thing to try (I may just do it).
However, I've convinced myself to approach my home-built NAS (running Debian 7.4) in a 2-pronged way:
1. SnapRAID for static media (videos, pictures, music)
2. A 2x4TB mirrored zpool for all other, non-static data
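For reference, that mirror is a one-liner (the pool name and device paths are just examples):

# Two-way mirror for the non-static data
zpool create tank mirror /dev/disk/by-id/ata-4TB-A /dev/disk/by-id/ata-4TB-B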
I just use your last option. I use ZFS for my non-static data and SnapRAID for all of my static, home media storage. This works great and is easy to troubleshoot / understand :)
Curious find, however...

Formatted my parity disk as a single-disk zpool, and the parity creation was going at 650-700 MiB/s.
Reformatted it to ext4, and it won't go over 450 MiB/s.
Anecdotal at best, but curious nonetheless.
I would suspect this initial speed difference is due to ZFS's ARC caching via system RAM. On a pool with data, I would expect these throughput numbers to be very close to each other. Also, how many data disks do you have in your SnapRAID config?
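If you want to rule the ARC in or out, you could cap it before re-running the sync. On ZFS on Linux that's a module parameter; the 1 GiB value below is just an example, and a reboot (or reloading the zfs module) may be needed for it to fully take effect:

# Cap the ARC at 1 GiB (value in bytes)
echo 1073741824 > /sys/module/zfs/parameters/zfs_arc_max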
Last edit: rubylaser 2014-04-14
Interesting.

~11 hours to create parity on ZFS (3TB drive)
~14 hours on ext4 (same 3TB drive)
I have 7 data disks and 1 parity disk in my SnapRAID config (yes, I'm a maverick living on the edge recreating my only parity disk and leaving myself at risk!)
That is a pretty substantial difference in sync speeds. What sort of speeds are you seeing (MB/s) and how much data are you syncing?
Total data about 14TB.
I'm seeing 600+ MiB/s, at least that's what the sync was reporting (on ZFS). For ext4, it was in the 400s.
Well, there is no reason why you can't / shouldn't use ZFS, so if you are seeing better speeds, I'd go for it.