I have been using SnapRAID for a while (since version 4.x) and have recovered from failures before, but I am having a problem this time (using version 7.0).
I have 24 drives (all 3TB+), which have been my excessive TiVo for years.
19 Data Drives
1 Online Spare
4 Parity drives
One of the data drives (d12) failed yesterday with about 1.5TB of data on it. I sync once a week, so I wasn't too concerned.
When I ran
sudo snapraid -d d12 -l fix.log fix
it ran for some time and, after restoring 300GB, stopped. The last log lines were:
recover_sync:1205019:0: Skipped for already recovered
fixed:1205019:d12:TiVo/House/House - S03E20 - House Training.mkv: Fixed data error at position 105
Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 27787264.
error:1205020:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 106
entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:106:
hash_import: Fixed entry 0
recover_sync:1205020:0: Skipped for already recovered
fixed:1205020:d12:TiVo/House/House - S03E20 - House Training.mkv: Fixed data error at position 106
Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 28049408.
error:1205021:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 107
entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:107:
Then it just appeared to stop, with no further output.
Running it again, the output was:
mode:par4
parity:/drives/d21/p1-parity
2-parity:/drives/d22/p2-parity
3-parity:/drives/d20/p3-parity
4-parity:/drives/d24/p4-parity
content:/drives/d1/content2
memory:used:6172690266
Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 28049408.
error:1205021:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 107
entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:107:
hash_import: Fixed entry 0
recover_sync:1205021:0: Skipped for already recovered
fixed:1205021:d12:TiVo/House/House - S03E20 - House Training.mkv: Fixed data error at position 107
Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 28311552.
error:1205022:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 108
entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:108:
hash_import: Fixed entry 0
recover_sync:1205022:0: Skipped for already recovered
fixed:1205022:d12:TiVo/House/House - S03E20 - House Training.mkv: Fixed data error at position 108
Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 28573696.
error:1205023:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 109
entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:109:
Any suggestions?
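To see where a run broke off, it can help to pair each error line in fix.log with its matching fixed line. Below is a minimal Python sketch of that idea. The line shapes are inferred only from the excerpt above (other SnapRAID versions may log differently), and the function name unresolved_errors is made up for illustration.

```python
import re

# Line shape inferred from the log excerpt in this post, e.g.
#   error:1205021:d12:<path>: Read error at position 107
#   fixed:1205021:d12:<path>: Fixed data error at position 107
# This is an assumption, not a documented SnapRAID log format.
LINE_RE = re.compile(r"^(error|fixed):(\d+):([^:]+):(.*): .* at position (\d+)$")


def unresolved_errors(log_lines):
    """Return (block, disk, path, position) for errors with no later 'fixed' line."""
    pending = {}  # dicts keep insertion order, so results stay in log order
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip 'entry:', 'recover_sync:', and other line types
        tag, block, disk, path, position = match.groups()
        key = (int(block), disk, path, int(position))
        if tag == "error":
            pending[key] = True
        else:
            pending.pop(key, None)  # a 'fixed' line resolves the earlier error
    return list(pending)
```

Calling unresolved_errors(open("fix.log")) on the first run above would report only block 1205021 at position 107, which is where the output ended. As a side note, every offset in the excerpt equals the position times 262144 (for example 106 × 262144 = 27787264), which is consistent with a 256 KiB block size.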
Looking over the system logs, it looks like a second drive is failing, and those errors occurred at the same time as my fix attempt. Since I have 4 parity drives, I am going to pull that drive as well and try again.
I ended up recovering 4 drives (2 failed completely, and 2 were failing during the recovery), but with 4 parity drives it recovered smoothly and perfectly.
Thanks for this amazing software!
Looks like you were wise to run four parity disks. I'm glad you were able to recover everything. I would agree that SnapRAID is great software. One question: how did the SMART data look on the disks that failed, and do you track that information?
Last edit: rubylaser 2016-01-21
I reinstalled the operating system on my server a few months back (after realizing that Ubuntu 14.04 seemed to work better than 14.10) and forgot to reinstall smartmontools, so none of my scripts were reporting any errors because they didn't have the right libraries to check.
The SMART data looked awful on those 2 drives, and the reallocated sector counts were increasing on the other two, so I decided to pull them all and do a clean fix with 4 new drives in the array. SnapRAID worked perfectly and restored the disks.
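For anyone who wants to avoid the same silent failure, here is a minimal Python sketch of a check that fails loudly when smartctl is missing instead of reporting nothing. It is not the script I actually run. The function names are made up for illustration, and the parsing assumes the usual ten-column attribute table printed by smartctl -A, where the raw value is the last column.

```python
import shutil
import subprocess


def require_smartctl():
    """Return the path to smartctl, or raise so a missing install cannot go unnoticed."""
    path = shutil.which("smartctl")
    if path is None:
        raise RuntimeError("smartctl not found; install smartmontools")
    return path


def reallocated_sectors(attr_output):
    """Extract the raw Reallocated_Sector_Ct value from 'smartctl -A' text.

    Returns None when the attribute is absent or its raw value is not a plain integer.
    """
    for line in attr_output.splitlines():
        fields = line.split()
        # Assumed columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def check_disk(device):
    """Run 'smartctl -A' on one device and return its reallocated sector count."""
    smartctl = require_smartctl()
    # smartctl uses its exit status as a bitmask, so a non-zero code is not treated as fatal.
    result = subprocess.run([smartctl, "-A", device], capture_output=True, text=True)
    return reallocated_sectors(result.stdout)
```

Running check_disk on each device from cron (usually as root) and comparing the number with the previous run would have flagged the two drives whose counts were climbing.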