1 HDD with bad sectors but I cannot recover one single file

Help
Jnick
2018-02-09
2018-02-18
  • Jnick

    Jnick - 2018-02-09

    Hi everyone,

    I have SnapRAID 10 configured with 3 data drives and 1 parity drive. Today,
    one harddrive was discovered to have 41 bad sectors and there is a single
    video file which is corrupted. I figured this is no big deal since the
    content has been backed up with snapraid. Therefore I issue the command
    (elevated command prompt)

    snapraid fix -f "filename.mp4"

    The recovery starts, gets to about 75%, and then fails with the following
    output:

    msg:fatal: Unexpected Windows error 23.

    msg:error: Error reading file
    'D:/PoolPart.52ecd30f-4ea2-413f-bb92-1a7fdeddc9c7/Shares/Video/filename.mp4'
    at offset 3664248832 for size 262144. Input/output error [5/23].

    error:3260895:d1:Shares/Video/filename.mp4: Read error at position 13978

    entry:0:change:lost:bad:d1:Shares/Video/filename.mp4:13978:

    strategy_error:3260895: No strategy to recover from 1 failures with 1
    parity without hash

    recover_sync:3260895:1: Failed with no attempts

    recover_unsync:3260895:1: Skipped for nothing to recover

    unrecoverable:3260895:d1:Shares/Video/filename.mp4: Unrecoverable error at
    position 13978

    msg:fatal: Unexpected Windows error 23.

    msg:error: Error reading file
    'D:/PoolPart.52ecd30f-4ea2-413f-bb92-1a7fdeddc9c7/Shares/Video/filename.mp4'
    at offset 3674996736 for size 262144. Input/output error [5/23].

    error:3260936:d1:Shares/Video/filename.mp4: Read error at position 14019

    entry:0:change:lost:bad:d1:Shares/Video/filename.mp4:14019:

    strategy_error:3260936: No strategy to recover from 1 failures with 1
    parity without hash

    recover_sync:3260936:1: Failed with no attempts

    recover_unsync:3260936:1: Skipped for nothing to recover

    unrecoverable:3260936:d1:Shares/Video/filename.mp4: Unrecoverable error at
    position 14019

    msg:fatal: Unexpected Windows error 23.

    msg:error: Error reading file
    'D:/PoolPart.52ecd30f-4ea2-413f-bb92-1a7fdeddc9c7/Shares/Video/filename.mp4'
    at offset 3676831744 for size 262144. Input/output error [5/23].

    error:3260943:d1:Shares/Video/filename.mp4: Read error at position 14026

    entry:0:change:lost:bad:d1:Shares/Video/filename.mp4:14026:

    strategy_error:3260943: No strategy to recover from 1 failures with 1
    parity without hash

    recover_sync:3260943:1: Failed with no attempts

    recover_unsync:3260943:1: Skipped for nothing to recover

    unrecoverable:3260943:d1:Shares/Video/filename.mp4: Unrecoverable error at
    position 14026

    msg:fatal: Unexpected Windows error 23.

    msg:error: Error reading file
    'D:/PoolPart.52ecd30f-4ea2-413f-bb92-1a7fdeddc9c7/Shares/Video/filename.mp4'
    at offset 3678666752 for size 262144. Input/output error [5/23].

    error:3260950:d1:Shares/Video/filename.mp4: Read error at position 14033

    entry:0:block:known:bad:d1:Shares/Video/filename.mp4:14033:

    msg:fatal: Unexpected Windows error 23.

    msg:fatal: Error reading file
    'D:/PoolPart.52ecd30f-4ea2-413f-bb92-1a7fdeddc9c7/Shares/Video/filename.mp4'.
    Input/output error [5/23].
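
The offsets and positions in the log above line up if one assumes a block of 262144 bytes (256 KiB), which is the size of each failed read. The sketch below only checks that arithmetic; the block size is taken from the log text, not from the actual configuration file, so treat it as an assumption.

```python
# Block size assumed from the "for size 262144" part of each read error.
BLOCK_SIZE = 262144

# (byte offset, block position) pairs copied from the log above.
READ_ERRORS = [
    (3664248832, 13978),
    (3674996736, 14019),
    (3676831744, 14026),
    (3678666752, 14033),
]

def position_of(offset, block_size=BLOCK_SIZE):
    """Return the block position inside the file for a byte offset."""
    return offset // block_size

for offset, position in READ_ERRORS:
    print(offset, "->", position_of(offset), "(log says", str(position) + ")")
```

Each offset divides evenly into its logged position, so every "Read error at position N" line corresponds to one 256 KiB block that could not be read from the disk.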

    It almost appears that snapraid is trying to recover the file directly from
    its source location, which is corrupted, rather than rebuilding it using
    parity? I see a message saying it can't recover without a hash? Do I have
    something set up incorrectly?

    Thank you!
    John

     
  • Quaraxkad

    Quaraxkad - 2018-02-09

    SnapRAID is trying to read the existing file, which as you noted has bad sectors. This is the expected behaviour. Delete the file and then run fix. If the bad sectors are not writable in the future, the drive will automatically remap those sectors.

     

    Last edit: Quaraxkad 2018-02-09
  • Jnick

    Jnick - 2018-02-14

    Thanks for the information. I deleted the file and ran the fix again. Note that I only type the file name, not a full path to the file. However, when I do this now, I get a slew of messages like these:

    Reading data from missing file 'D:/PoolPart.52ecd30f-4ea2-413f-bb92-1a7fdeddc9c7/Shares/Video/filename.mp4' at offset 4824498176.
    Reading data from missing file 'D:/PoolPart.52ecd30f-4ea2-413f-bb92-1a7fdeddc9c7/Shares/Video/filename.mp4' at offset 4824760320.
    unrecoverable D:/PoolPart.52ecd30f-4ea2-413f-bb92-1a7fdeddc9c7/Shares/Video/filename.mp4
    100% completed, 19300 MB processed in 0:07
    
       18406 errors
       18402 recovered errors
           4 UNRECOVERABLE errors
    DANGER! There are unrecoverable errors!
    
    c:\snapraid-10.0>
    

    The filename is now 'filename.mp4.unrecoverable'.

    Therefore, I'm assuming it cannot rebuild the file? I'm OK with that because luckily I do have another copy of it elsewhere. However, I'm trying to determine what I did wrong that is keeping the parity from working as it should. If this had been a whole drive failure and not an early detection, I would be up a creek without a paddle, as they say.

    Any help is appreciated!

    [EDIT] The D: drive is the drive that failed. I had deleted the file off of it as you stated.

     

    Last edit: Jnick 2018-02-14
  • Quaraxkad

    Quaraxkad - 2018-02-15

    Most likely the array was not fully synced. Run it again with "-l fix.log". The log file will give you more information about why those 4 blocks (out of 18,406) were unrecoverable. Chances are, one (or more) file(s) on either (or both) of the other two data drives in your array had changes or were deleted, and those changes/deletions were not synced.
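
Once a log file exists, the unrecoverable blocks can be pulled out of it with a few lines of Python. This is only a sketch: the pattern is modelled on the "unrecoverable:" lines quoted earlier in this thread, and it assumes each entry sits on a single line in the real log (the forum wrapped them). The helper name `unrecoverable_blocks` is made up for illustration.

```python
import re

# Pattern modelled on the "unrecoverable:" lines quoted earlier in this thread.
UNRECOVERABLE = re.compile(
    r"^unrecoverable:(?P<block>\d+):(?P<disk>[^:]+):(?P<path>.+?):"
    r" Unrecoverable error at position (?P<pos>\d+)$"
)

def unrecoverable_blocks(lines):
    """Return (disk, path, position) for every unrecoverable entry."""
    found = []
    for line in lines:
        match = UNRECOVERABLE.match(line.strip())
        if match:
            found.append((match["disk"], match["path"], int(match["pos"])))
    return found

# Sample lines copied from the first post, joined back onto one line each.
sample = [
    "recover_sync:3260895:1: Failed with no attempts",
    "unrecoverable:3260895:d1:Shares/Video/filename.mp4: Unrecoverable error at position 13978",
    "unrecoverable:3260936:d1:Shares/Video/filename.mp4: Unrecoverable error at position 14019",
]
for entry in unrecoverable_blocks(sample):
    print(entry)
```

To run it against the real log, pass an open file instead of the sample list, for example `unrecoverable_blocks(open("fix.log"))`.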

    It's very important to keep the array synced at all times when you only have one parity drive. Adding parity drives not only increases the ability to recover from drive failures, but also the ability to recover from unsynced arrays.

     
  • Jnick

    Jnick - 2018-02-15

    Quaraxkad,

    Thanks for the information. I am running it again with logging enabled. With that said, the sync runs every day at 4 a.m. I have a script for it scheduled in Task Scheduler and confirmed through the history report that it has been running. The one thing I will say is that I do not know when the failure first happened. It could have been days or weeks that passed before I realized the drive was acting up. If this was the case, would the sync have screwed itself up if it was trying to sync AFTER the drive had already gone bad?

     
  • Jnick

    Jnick - 2018-02-15

    Ok, this is weird....

    I ran the command again and within 45 seconds, it was done. It said it was 100% successful. NOTE: This time I did not delete the file recovered by the previous run (the one that had 4 missing blocks). This was the log:

    msg:progress: Filtering...
    msg:verbose:    filename.mp4
    memory:used:190090858
    memory:block:17
    memory:chunk:88
    memory:file:192
    memory:link:88
    memory:dir:80
    msg:progress: Using 181 MiB of memory for the FileSystem.
    msg:progress: Initializing...
    msg:progress: Fixing...
    entry:0:change:lost:good:d1:Shares/Video/filename.mp4:13978:
    recover_sync:3260895:0: Skipped for already recovered
    entry:0:change:lost:good:d1:Shares/Video/filename.mp4:14019:
    recover_sync:3260936:0: Skipped for already recovered
    entry:0:change:lost:good:d1:Shares/Video/filename.mp4:14026:
    recover_sync:3260943:0: Skipped for already recovered
    entry:0:change:lost:good:d1:Shares/Video/filename.mp4:14074:
    recover_sync:3260991:0: Skipped for already recovered
    msg:status: Everything OK
    summary:error:0
    summary:error_recovered:0
    summary:error_unrecoverable:0
    summary:exit:ok
    

    The file name no longer has '.unrecoverable' appended to it. Am I to believe it is really fixed? I'll be able to check the file later tonight. I'm confused about why the fix was successful now but wasn't yesterday. The only thing I can think of is that when I initially ran the command to fix it AFTER I deleted the file, I did NOT re-sync the array. The array would have re-synced overnight. Could that be the difference? Even though I deleted the file, the sync was still trying to pull it from the bad sectors?
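
Since a separate copy of the file exists elsewhere, one way to check whether the restored file is really intact is to compare the two block by block. The sketch below is a generic comparison, not something SnapRAID provides; the 262144-byte block size is assumed from the earlier error log, and the paths in the comment are hypothetical.

```python
BLOCK_SIZE = 262144  # assumed: the read size shown in the earlier error log

def differing_blocks(path_a, path_b, block_size=BLOCK_SIZE):
    """Return the block positions at which the two files differ."""
    different = []
    position = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            block_a = file_a.read(block_size)
            block_b = file_b.read(block_size)
            if not block_a and not block_b:
                break  # both files are exhausted
            if block_a != block_b:
                different.append(position)
            position += 1
    return different

# Hypothetical paths; substitute the restored file and the separate copy.
# print(differing_blocks(r"D:\restored\filename.mp4", r"E:\copy\filename.mp4"))
```

An empty list means the two files are byte-for-byte identical. Any positions it does report can be compared with the positions listed in the fix log.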

     
  • Quaraxkad

    Quaraxkad - 2018-02-18

    "The array would have re-synced overnight."

    Are you saying that a sync was run after the first fix, but before this one? You should never allow a sync to run until after all repairs/restores are completed. That essentially updates the parity files with the wrong information.

    If that is the case, I don't know exactly what happened here. I don't think those 4 bad blocks were truly recovered, so I'm not sure why SnapRAID removed the .unrecoverable tag.

    "If this was the case, would the sync have screwed itself up if it was trying to sync AFTER the drive had already gone bad?"

    In short, no. Sync will not have used the "bad" file with unreadable sectors for parity updates. If it had tried to read that file it would have given you an error message in the log (your nightly automated task does save log files, right?).
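
If the nightly task does save logs, a quick pass over each one can show whether a run ended cleanly. The sketch below reads the "summary:" lines in the form they appear in the fix log earlier in this thread; that sync logs contain the same lines is an assumption, and the function names are invented for illustration.

```python
def run_summary(lines):
    """Collect the 'summary:' fields of a log into a dictionary."""
    summary = {}
    for line in lines:
        parts = line.strip().split(":")
        # "summary:error:0" -> key "error", value "0"
        if len(parts) == 3 and parts[0] == "summary":
            summary[parts[1]] = parts[2]
    return summary

def run_was_clean(lines):
    """True if the log reports exit 'ok' and every error counter is zero."""
    summary = run_summary(lines)
    counters = [v for k, v in summary.items() if k.startswith("error")]
    return summary.get("exit") == "ok" and all(v == "0" for v in counters)

# Summary lines copied from the fix log above.
sample = [
    "msg:status: Everything OK",
    "summary:error:0",
    "summary:error_recovered:0",
    "summary:error_unrecoverable:0",
    "summary:exit:ok",
]
print(run_was_clean(sample))
```

A log with no summary lines at all is treated as not clean, which errs on the side of taking a closer look.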

     
