
check reports error count, but nothing listed

Help
MQMan
2015-06-10
2015-06-10
  • MQMan

    MQMan - 2015-06-10

    Hi,
    I just ran a "check" on my disks, which says there are 6870 errors, but it didn't show what these were.

    zentyal@zentyal:~$ sudo snapraid check
    Self test...
    Loading state from /poolmount/Data1/content...
    WARNING! With 6 disks it's recommended to use two parity levels.
    Scanning disk disk1...
    Scanning disk disk2...
    Scanning disk disk3...
    Scanning disk disk4...
    Scanning disk disk5...
    Scanning disk disk6...
    Using 829 MiB of memory.
    Initializing...
    Checking...
    100% completed, 5328784 MB processed in 5:02
    
        6870 errors
           0 unrecoverable errors
    WARNING! There are errors!
    

    Is there any way to see which files it thinks are in error before I (maybe) run a "fix"?

    Also, running "status" immediately after the "check" gives:

    The oldest block was scrubbed 80 days ago, the median 59, the newest 0.
    
    No sync is in progress.
    No rehash is in progress or needed.
    No error detected.
    

    Interestingly, I haven't run a "status" or a "scrub" for a number of days, so I'm not sure why it reports the newest block scrubbed as 0 days.

    Cheers.

     
  • Quaraxkad

    Quaraxkad - 2015-06-10

    -v for verbose output and/or -l to save a detailed log will give you more info on the errors. Since status shows no blocks marked as bad, the errors are likely not corruption, and you probably won't need to run fix.

     
  • MQMan

    MQMan - 2015-06-10

    Thanks for the incredibly fast response.

    -v with or without -l scrolls everything to the screen, so I don't think that would be very useful.

    Re-running now with -l, which will take about 6 hours or so. Any idea what to "grep" for in the logfile? It's going to be too big to eyeball, with 47k files reported on.

    After further reading, am I correct in assuming that the "status" report only shows errors found during a "scrub"? If so, a full "scrub" (100%, 0 days) might find all or some of them, as the median age from the status is almost 60 days. Once the "check" finishes, would a 100%, 0 day "scrub" be a worthwhile exercise?

    Cheers.

     

    Last edit: MQMan 2015-06-10
  • MQMan

    MQMan - 2015-06-10

    OK, here's the first section of errors from the log:

    status:correct:disk3:iTunes/Music/The Cranberries/Wake Up And Smell The Coffee/04 Dying Inside.mp3
    status:correct:disk3:iTunes/Music/The Cranberries/Wake Up And Smell The Coffee/09 I Really Hope.mp3
    status:correct:disk5:F1 Grand Prix/formula1.2015.spain.grand.prix.uncut.720p.hdtv.x264-champions.mkv
    parity_error:899293:parity: Data error
    parity_error:899294:parity: Data error
    parity_error:899295:parity: Data error
    parity_error:899296:parity: Data error
    status:correct:disk3:iTunes/Music/The Cranberries/Wake Up And Smell The Coffee/12 Carry On.mp3
    parity_error:899297:parity: Data error
    parity_error:899298:parity: Data error
    parity_error:899299:parity: Data error
    parity_error:899300:parity: Data error
    parity_error:899301:parity: Data error
    parity_error:899302:parity: Data error
    parity_error:899303:parity: Data error
    parity_error:899304:parity: Data error
    parity_error:899305:parity: Data error
    parity_error:899306:parity: Data error
    parity_error:899307:parity: Data error
    parity_error:899308:parity: Data error
    parity_error:899309:parity: Data error
    parity_error:899310:parity: Data error
    status:correct:disk3:iTunes/Music/The Cranberries/Wake Up And Smell The Coffee/10 Every Morning.mp3
    parity_error:899311:parity: Data error
    parity_error:899312:parity: Data error
    parity_error:899313:parity: Data error
    parity_error:899314:parity: Data error
    parity_error:899315:parity: Data error
    parity_error:899316:parity: Data error
    parity_error:899317:parity: Data error
    parity_error:899318:parity: Data error
    

    Any ideas about what's going on?
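One way to get an overview of a log this size is to count lines by their leading tag before reading any of them. The following is only a minimal sketch in Python, assuming every log line has the `tag:field:...` layout seen in the excerpt above; `tally_log_tags` is a hypothetical helper name and not part of SnapRAID.

```python
from collections import Counter


def tally_log_tags(lines):
    """Count log lines by their leading tag (the text before the first ':')."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and anything that does not look like "tag:..."
        if not line or ":" not in line:
            continue
        tag = line.split(":", 1)[0]
        counts[tag] += 1
    return counts


# A few lines copied from the log excerpt above.
sample = [
    "status:correct:disk3:iTunes/Music/The Cranberries/Wake Up And Smell The Coffee/04 Dying Inside.mp3",
    "parity_error:899293:parity: Data error",
    "parity_error:899294:parity: Data error",
]

for tag, count in tally_log_tags(sample).items():
    print(tag, count)
```

To run it on the real log, pass an open file object instead of `sample`; the function only iterates over lines, so the whole file never has to fit in memory.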

    Cheers.

     
  • MQMan

    MQMan - 2015-06-10

    OK, the "check" hasn't finished, but if I count the number of lines with parity_error to this point in time, I get 6870, so I don't expect to see any more errors from here on in.

    There are only 2 "blocks" of errors. The start of the first I pasted above and here is the start of the second:

    status:correct:disk3:iTunes/Music/Alter Ego/Transphormer/03 Beat the Bush.mp3
    status:correct:disk3:iTunes/Music/Alter Ego/Transphormer/05 Vincent van Dance.mp3
    status:correct:disk5:F1 Grand Prix/formula.e.2015.monaco.eprix.pdtv.x264-champions.mp4
    parity_error:908717:parity: Data error
    parity_error:908718:parity: Data error
    parity_error:908719:parity: Data error
    status:correct:disk3:iTunes/Music/Alter Ego/Transphormer/10 Transphormer.mp3
    parity_error:908720:parity: Data error
    parity_error:908721:parity: Data error
    parity_error:908722:parity: Data error
    parity_error:908723:parity: Data error
    parity_error:908724:parity: Data error
    parity_error:908725:parity: Data error
    parity_error:908726:parity: Data error
    parity_error:908727:parity: Data error
    parity_error:908728:parity: Data error
    parity_error:908729:parity: Data error
    parity_error:908730:parity: Data error
    parity_error:908731:parity: Data error
    parity_error:908732:parity: Data error
    parity_error:908733:parity: Data error
    parity_error:908734:parity: Data error
    parity_error:908735:parity: Data error
    parity_error:908736:parity: Data error
    parity_error:908737:parity: Data error
    status:correct:disk3:iTunes/Music/Alter Ego/Transphormer/07 Nasty Dollars.mp3
    parity_error:908738:parity: Data error
    parity_error:908739:parity: Data error
    

    Not sure if it's significant, but both sets of errors start immediately after a file on disk5.
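The count of parity_error lines and the grouping into "blocks" of errors described above can be sketched in a few lines of Python. This is an illustration under assumptions, not a SnapRAID tool: it assumes the `parity_error:<block>:<parity_name>: Data error` line format shown in the excerpts, and `parity_error_ranges` is a hypothetical helper name.

```python
import re

# Matches lines such as "parity_error:899293:parity: Data error".
PARITY_ERROR = re.compile(r"^parity_error:(\d+):")


def parity_error_ranges(lines):
    """Return sorted (first, last) runs of consecutive block numbers."""
    blocks = set()
    for line in lines:
        match = PARITY_ERROR.match(line.strip())
        if match:
            blocks.add(int(match.group(1)))

    runs = []
    for block in sorted(blocks):
        if runs and block == runs[-1][1] + 1:
            runs[-1][1] = block          # extend the current run
        else:
            runs.append([block, block])  # start a new run
    return [tuple(run) for run in runs]


# Lines copied from the two log excerpts above (shortened).
sample = [
    "status:correct:disk5:F1 Grand Prix/formula1.2015.spain.grand.prix.uncut.720p.hdtv.x264-champions.mkv",
    "parity_error:899293:parity: Data error",
    "parity_error:899294:parity: Data error",
    "status:correct:disk3:iTunes/Music/The Cranberries/Wake Up And Smell The Coffee/12 Carry On.mp3",
    "parity_error:899295:parity: Data error",
    "parity_error:908717:parity: Data error",
    "parity_error:908718:parity: Data error",
]

for first, last in parity_error_ranges(sample):
    print(first, last, last - first + 1)
```

The interleaved status lines are ignored, so a run is not broken just because a file status was logged between two parity errors.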

    Cheers.

     

    Last edit: MQMan 2015-06-10
  • xad

    xad - 2015-06-10

    MQMan, was the sync interrupted?

    BTW, AFAIK, at sync the "blocktime" is set to now, as the block is assumed to be correct, i.e. scrubbed 0 days ago (an assumption discussed in another thread).

    /X

    The comment for the "parity_error:<block>:<parity_name>: Data error" is:
    /* now check parities, but only if all the blocks have it computed */
    /* if you check/fix after a partial sync, it's OK to have parity errors on the blocks with invalid parity */
    /* and doesn't make sense to try to fix it */

     

    Last edit: xad 2015-06-10
  • MQMan

    MQMan - 2015-06-10

    No, I've never interrupted a "sync". Plus, if I run a "sync" now, it responds that there is nothing to do:

    zentyal@zentyal:~$ sudo snapraid -h sync
    Self test...
    Loading state from /poolmount/Data1/content...
    WARNING! With 6 disks it's recommended to use two parity levels.
    Scanning disk disk1...
    Scanning disk disk2...
    Scanning disk disk3...
    Scanning disk disk4...
    Scanning disk disk5...
    Scanning disk disk6...
    Using 823 MiB of memory.
    Initializing...
    Hashing...
    Nothing to do
    Syncing...
    Nothing to do
    

    The last "sync" was done about 2 days ago, so I'm not sure that it explains the 0 days on the "status".

    Cheers.

     
  • MQMan

    MQMan - 2015-06-10

    My assumption earlier regarding the lack of errors on the "status" was correct. It was because those blocks hadn't been scrubbed. Running a "scrub" with -p 100 -o 0 gave me:

    The oldest block was scrubbed 3 days ago, the median 0, the newest 0.
    
    No sync is in progress.
    No rehash is in progress or needed.
    DANGER! In the array there are 6870 errors!
    
    They are from block 899293 to 910875, specifically at blocks: 899293 899294 899295 899296 899297 899298 899299 899300 899301 899302 899303 899304 899305 899306 899307 899308 899309 899310 899311 899312 899313 899314 899315 899316 899317 899318 899319 899320 899321 899322 899323 899324 899325 899326 899327 899328 899329 899330 899331 899332 899333 899334 899335 899336 899337 899338 899339 899340 899341 899342 899343 899344 899345 899346 899347 899348 899349 899350 899351 899352 899353 899354 899355 899356 899357 899358 899359 899360 899361 899362 899363 899364 899365 899366 899367 899368 899369 899370 899371 899372 899373 899374 899375 899376 899377 899378 899379 899380 899381 899382 899383 899384 899385 899386 899387 899388 899389 899390 899391 899392 899393 and 6769 more...
    
    To fix them use the command 'snapraid -e fix'.
    The errors will disapper from the 'status' at the next 'scrub' command.
    

    So I guess it's time to run a "fix", unless anyone thinks this might be a "bad idea".

    BTW, "disapper" in that output should be "disappear".

    Cheers.

     
