Hello, I'm stuck with errors reported by the "snapraid check" command. Here's the output:
$ snapraid check
Self test...
Loading state from /media/data0/content...
Searching disk data0...
Searching disk data1...
Searching disk data2...
Using 2596 MiB of memory for the FileSystem.
Initializing...
Checking...
100% completed, 6130217 MB accessed in 6:01
1088 errors
0 unrecoverable errors
WARNING! There are errors!
After running a check I ran status expecting to see the 1088 errors reported there as well, but got "No error detected". I then tried fixing the errors:
$ snapraid -e fix
Self test...
Loading state from /media/data0/content...
Searching disk data0...
Searching disk data1...
Searching disk data2...
Filtering...
Using 2596 MiB of memory for the FileSystem.
Initializing...
Fixing...
Nothing to do
Everything OK
"Nothing to do"? That's odd. I then ran a full scrub:
$ snapraid scrub -p 100
Self test...
Loading state from /media/data0/content...
Using 2016 MiB of memory for the FileSystem.
Initializing...
Scrubbing...
Using 40 MiB of memory for 64 blocks of IO cache.
100% completed, 5279488 MB accessed in 5:04
Everything OK
Saving state to /media/data0/content...
Saving state to /media/data1/content...
Saving state to /media/data2/content...
Verifying /media/data0/content...
Verifying /media/data1/content...
Verifying /media/data2/content...
Then another snapraid check to see if the errors remained:
$ snapraid check
Self test...
Loading state from /media/data0/content...
Searching disk data0...
Searching disk data1...
Searching disk data2...
Using 2596 MiB of memory for the FileSystem.
Initializing...
Checking...
100% completed, 6130217 MB accessed in 6:00
1088 errors
0 unrecoverable errors
WARNING! There are errors!
So no difference. What else can I try to remove the 1088 errors besides rebuilding the full parity? Also, how can I see which files are affected by the errors? Running snapraid 11.0 on Ubuntu 14.04.
Last edit: tonwa 2017-03-18
These are false errors reported by the check function for unused parity blocks which have not been cleared (filled with zeroes).
The logic for scrub and apparently also fix -e does not agree that it is a problem (which it isn't, since whatever is inside these blocks is not used for anything and would be overwritten if they become used again).
Easiest fix is to simply ignore it.
But if you don't want to ignore it, you can get rid of the error messages like this:
Run snapraid status to find out which blocks have "errors"
Run snapraid fix -S FirstErrorBlock -B 1088
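The two steps above could be wrapped in a small shell helper that only prints the resulting command, so it can be reviewed before being run by hand. This is just a sketch: print_fix_cmd is a made-up name and not part of SnapRAID, and the start block has to come from your own output.

```shell
#!/bin/sh
# Hypothetical helper (not part of SnapRAID): print the fix command for a
# block range instead of running it.
#   $1 = first block reported with an error
#   $2 = number of blocks to process
print_fix_cmd() {
    if [ $# -ne 2 ]; then
        echo "usage: print_fix_cmd FIRST_BLOCK COUNT" >&2
        return 1
    fi
    # Reject anything that is not a plain non-negative integer.
    for arg in "$1" "$2"; do
        case "$arg" in
            ''|*[!0-9]*)
                echo "not a non-negative integer: $arg" >&2
                return 1
                ;;
        esac
    done
    echo "snapraid fix -S $1 -B $2"
}
```

For example, print_fix_cmd "$first_block" 1088 would print the command for the 1088 errors counted by check, where first_block is whatever starting block you found.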
Thanks for the info. The problem is that snapraid status reports no error so I'm unable to determine FirstErrorBlock.
$ snapraid status -v
Self test...
Loading state from /media/data0/content...
2355511 files
0 hardlinks
95932 symlinks
41853 empty dirs
Using 2016 MiB of memory for the FileSystem.
SnapRAID status report:
The oldest block was scrubbed 10 days ago, the median 1, the newest 1.
No sync is in progress.
The 5% of the array is not scrubbed.
No file has a zero sub-second timestamp.
No rehash is in progress or needed.
No error detected.
I suppose I'll follow your recommendation and simply ignore the error message from snapraid check.
I'm getting the same errors as tonwa, and I'm also not getting any reference to which blocks are "bad" in the status output. However, if I run the check with full logging, I think the number Leifi is suggesting I use in the fix command is 20055991. Can someone confirm that is correct?
The first line of errors from check log is below.
parity_error:20055991:parity: Data error, diff bits 1037974
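Log lines in this format can be summarised to find the lowest block number and how many blocks were reported. The snippet below is a rough sketch, assuming every error line in the log has the parity_error:BLOCK:... shape shown above. The names summarize_parity_errors and check.log are made up for illustration, and whether the resulting numbers are the right arguments for fix -S and -B is not confirmed here.

```shell
#!/bin/sh
# Hypothetical helper: summarise parity_error lines from a SnapRAID log file.
# Assumes each error line looks like:
#   parity_error:<block>:<parity>: <message>
#   $1 = path to the log file
summarize_parity_errors() {
    awk -F: '
        $1 == "parity_error" {
            n++
            block = $2 + 0                      # force numeric comparison
            if (n == 1 || block < min) min = block
            if (n == 1 || block > max) max = block
        }
        END {
            if (n == 0) { print "no parity_error lines found"; exit }
            # span equals count only if the reported blocks are contiguous
            printf "first=%d last=%d count=%d span=%d\n", min, max, n, max - min + 1
        }
    ' "$1"
}
```

Usage would be something like summarize_parity_errors check.log. If count and span differ, the reported blocks are not one contiguous run, and a single start/count pair would cover more blocks than were flagged.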
tonwa: you can get logs by using -l FILENAME
Last edit: mrmessyau 2017-03-22
OK, so I looked into exactly what the fix command that Leifi posted does and realised there was no risk in just going ahead and trying the command.
Based on the result it looks like it's all fixed now!