Restoring after drive failure

Help
J. Kenney
2014-12-27
2014-12-29
  • J. Kenney

    J. Kenney - 2014-12-27

    I have been using SnapRAID for a while (since version 4.x) and have recovered from failures before, but I am having a problem this time (using version 7.0).

    I have 24 drives (all 3TB+), which have served as my excessive TiVo for years:
    19 Data Drives
    1 Online Spare
    4 Parity drives

    One of the data drives (d12) failed yesterday with about 1.5TB of data on it. I sync once a week, so I wasn't too concerned.

    When I ran
    sudo snapraid -d d12 -l fix.log fix

    it ran for some time and, after restoring 300GB, it stopped. The last lines of the log were:

    recover_sync:1205019:0: Skipped for already recovered
    fixed:1205019:d12:TiVo/House/House - S03E20 - House Training.mkv: Fixed data error at position 105
    Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 27787264.
    error:1205020:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 106
    entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:106:
    hash_import: Fixed entry 0
    recover_sync:1205020:0: Skipped for already recovered
    fixed:1205020:d12:TiVo/House/House - S03E20 - House Training.mkv: Fixed data error at position 106
    Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 28049408.
    error:1205021:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 107
    entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:107:

    Then it looks like it just stopped, with no further output.

    When I ran it again, the output was:

    mode:par4
    parity:/drives/d21/p1-parity
    2-parity:/drives/d22/p2-parity
    3-parity:/drives/d20/p3-parity
    4-parity:/drives/d24/p4-parity
    content:/drives/d1/content2
    memory:used:6172690266
    Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 28049408.
    error:1205021:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 107
    entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:107:
    hash_import: Fixed entry 0
    recover_sync:1205021:0: Skipped for already recovered
    fixed:1205021:d12:TiVo/House/House - S03E20 - House Training.mkv: Fixed data error at position 107
    Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 28311552.
    error:1205022:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 108
    entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:108:
    hash_import: Fixed entry 0
    recover_sync:1205022:0: Skipped for already recovered
    fixed:1205022:d12:TiVo/House/House - S03E20 - House Training.mkv: Fixed data error at position 108
    Reading missing data from file '/drives/d12/share/TiVo/House/House - S03E20 - House Training.mkv' at offset 28573696.
    error:1205023:d12:TiVo/House/House - S03E20 - House Training.mkv: Read error at position 109
    entry:0:block:known:bad:d12:TiVo/House/House - S03E20 - House Training.mkv:109:

    Any suggestions?
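
    To see at a glance which blocks were reported as read errors but never followed by a matching "fixed" line, the log can be summarized with a short script. This is only a sketch: the line layout (kind:block:disk:path: message at position N) is inferred from the excerpts quoted above, not from the SnapRAID documentation, and the names LINE_RE and unfixed_blocks are made up for illustration.

```python
import re

# Layout assumed from the quoted log lines, e.g.
#   error:1205020:d12:TiVo/.../file.mkv: Read error at position 106
#   fixed:1205020:d12:TiVo/.../file.mkv: Fixed data error at position 106
LINE_RE = re.compile(r"^(error|fixed):(\d+):([^:]+):(.*): .* at position (\d+)$")


def unfixed_blocks(lines):
    """Return sorted (disk, path, position) tuples that have an
    'error' line but no matching 'fixed' line."""
    errors, fixed = set(), set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore lines that are not error/fixed records
        kind, _block, disk, path, position = match.groups()
        key = (disk, path, int(position))
        (errors if kind == "error" else fixed).add(key)
    return sorted(errors - fixed)


if __name__ == "__main__":
    import sys

    # Usage: python summarize_fix_log.py fix.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as handle:
            for disk, path, position in unfixed_blocks(handle):
                print(f"{disk}: {path} (position {position})")
```

    Run against the first excerpt, this would list only position 107 of the House episode, which is where the run stopped.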

     
  • J. Kenney

    J. Kenney - 2014-12-27

    Looking over the system logs, it looks like a second drive is failing, and those errors occurred at the same time as my attempt to fix. Since I have 4 parity drives, I am going to pull that drive as well and try the fix again.

     
  • J. Kenney

    J. Kenney - 2014-12-29

    I ended up recovering 4 drives (2 failed completely, and 2 were failing during the recovery), but with 4 parity drives it recovered smoothly and perfectly.

    Thanks for this amazing software!

     
    • rubylaser

      rubylaser - 2014-12-29

      Looks like you were wise to run four parity disks. I'm glad you were able to recover everything, and I agree that SnapRAID is great software. Out of curiosity, how did the SMART data look on the disks that failed, and do you track that information?

       

      Last edit: rubylaser 2016-01-21
  • J. Kenney

    J. Kenney - 2014-12-29

    I reinstalled the operating system on my server a few months back (after realizing that Ubuntu 14.04 seemed to work better than 14.10) and forgot to reinstall smartmontools, so none of my scripts were reporting any errors because they didn't have the right libraries to check.

    The SMART data looked awful on those 2 drives, and the reallocated sector counts were increasing on the other two, so I decided to pull them all and do a clean fix with 4 new drives in the array. SnapRAID worked perfectly and restored the disks.
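
    For anyone with similar monitoring scripts, a check like the one below would have made the missing tools obvious instead of failing silently. It is a rough sketch, not what my scripts actually do: require_tool and reallocated_sectors are hypothetical helper names, and the parser assumes the usual ten-column ATA attribute table printed by "smartctl -A", with the raw value in the last column.

```python
import shutil


def require_tool(name="smartctl"):
    """Return the full path of a command-line tool, or raise loudly
    if it is not installed (instead of silently reporting nothing)."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"{name} not found on PATH; is smartmontools installed?")
    return path


def reallocated_sectors(smartctl_output):
    """Extract the raw Reallocated_Sector_Ct value from 'smartctl -A'
    text, or return None if the attribute is not listed.

    Assumed columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED
    WHEN_FAILED RAW_VALUE.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[9])
    return None
```

    A cron job could call require_tool() first, then feed the output of "smartctl -A" for each drive to reallocated_sectors() and send an alert whenever the count rises between runs.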

     
