Hi all,
Have a problem with my array
so I ran snapraid -e fix as suggested
Any ideas why it can't fix it and how should I proceed from here?
Can I still use the scrub command to carry on checking the rest of the array?
I do have another copy of the file on an external disk; what would be the correct way to try to replace the file, assuming that the copy is still fine?
Can I just delete the file and run a sync to get the array back into a happy state?
Thanks for any help
First, to learn all we can about this error, do:
Please post the result, and contents of the logfile.
Hi UhClem
Result from cmd window below and logfile attached
Thanks.
I'm hesitant to advise a next step in light of incomplete info:
Did you, in fact, delete that file? (in the time period between your asking that question, and the time you acted upon my request)
Also,
(Prior to you starting this thread,) It appears to me that there was an ERROR reported by SnapRAID, probably in a scrub. (Do you, by chance, have any record of such error?) [It is very important to take note of any/all ERROR reports, so that you can provide them when seeking help. Also, unresolved ERRORs can metastasize (yes, like cancer) with subsequent sync (or even fix) commands.]
Hi UhClem,
No, I've not made any changes to the files in the array since the error was reported. The filename has been changed, I assume by the program itself, to have .unrecoverable on the end.
The error was reported by the last scrub that was done a couple of days ago as can be seen in the status report in the first post, no other errors were reported before this.
As it was late, I tried the "snapraid -e fix" command the next day, which only took a few minutes to fail with the same message as in the first post; I may also have tried it a second time afterwards with the same result.
After going to the SnapRAID website I saw there was an update available, so I downloaded it with the thought it might help; the result of that is what I posted in the first post, with the main difference being that it took over 2.5 hours to fail this time.
Note that I can't be certain what version I was running before I updated, sorry, but given the difference in run time I assume it was pre the 11.6 update. When I downloaded 12.1 I did notice that at some point I had already downloaded 12.0, as the zip file was there, but I have no idea if I actually installed it.
Something I have thought of: as the file has been renamed, does the fix process attempt to create a new, correct file, and as such need enough free space for another copy of the file? There is not enough room on the disk to do that.
Thanks for your help, let me know if you need any more info.
Thanks for the added info. (It was also good that you reminded me about the (renamed) .unrecoverable file)
I'm still a little bit uncomfortable about proceeding without knowing the precise nature of the error that SnapRAID reported. However, we can hit "Replay" and get a second look ... if you un-rename that file (by removing the ".unrecoverable" suffix), then you can do a
and attach the LogFile and post the command result [just the output after the "Initializing ..." line is OK]
[Realize that the check command makes NO changes at all (to data or parity or the .content files)]
Hi UhClem,
Some interesting extra information from something I tried last night.
For the file in question I also had a .sha256 file, created by HashCheck when the file was synced into the array, so I could double-check it had copied properly compared to the original file.
So I took a copy of the file to a different disk outside the array so I could rename it back and check it against the .sha256 file without affecting anything to do with the array.
Expecting it to end with a fail, I was surprised to find that it ended with a match, which to me means the file hasn't changed; not sure what this means with regard to the error though.
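For anyone wanting to repeat that off-array double-check without HashCheck, here is a minimal sketch using Python's hashlib. The file names and contents below are made up purely so the demo is self-contained, and it assumes the usual "&lt;hex digest&gt; *&lt;filename&gt;" line format for the .sha256 file:

```python
import hashlib

# Hash the copied file and compare it with the digest recorded at sync time.
def sha256_of(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

with open("demo.vhdx", "wb") as f:              # stand-in for the copied VHDX
    f.write(b"example contents")
with open("demo.sha256", "w") as f:             # stand-in for HashCheck's record
    f.write(sha256_of("demo.vhdx") + " *demo.vhdx\n")

expected = open("demo.sha256").read().split()[0]
print("match" if sha256_of("demo.vhdx") == expected else "FAIL")  # prints "match"
```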
Will try your suggestion above when I get home in about 8 hours from posting this unless you want me to try something else instead or as well as.
Thanks again.
Result below and logfile attached; note that I had to put a / in front of Misc to get it to work.

Using 2173 MiB of memory for the file-system.
Initializing...
Selecting...
Checking...
unrecoverable Misc/xxxxxx/vhd's/xxxxxx-SERVER-2.VHDX
100% completed, 3735982 MB accessed in 2:36

2 errors
1 UNRECOVERABLE errors
WARNING! There are errors!
DANGER! There are unrecoverable errors!

C:\Array\Snapraid>
OK, I've got a pretty good idea what happened.
Summary:
It's easily rectified (i.e., getting your array back to a clean state).
What I believe happened is that when this file was sync'd into the array, the hash for one (256KB) block (#33415 of the file) got glitched as it was being copied into the in-core database (of the .content file), and then got written into the .content files. This went unnoticed until that array_block was scrub'd; that would have produced a report of a "Data error" for this file/block#.
It appears that SnapRAID is "prejudiced" in this situation, believing the stored hash to be correct and assuming the (file) data to be in error; so when you ran the fix command, SR used the other disks' data blocks and the parity block, for array_block 22937272, to re-create the "correct" data block for our subject file (which, of course, has been correct all the while). Fine and dandy, but the final verification is that the hash for this re-created data block matches the one stored in the .content file. Whoops!!! Hence, it squawks, and hands down a verdict of ".unrecoverable".
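That dead end can be sketched in a few lines. This is a toy model, not SnapRAID's actual code, and MD5 here merely stands in for SnapRAID's internal 16-byte block hash:

```python
import hashlib

BLOCK = 256 * 1024                              # SnapRAID's default block size
data = b"\x42" * BLOCK                          # the on-disk block (actually fine)
stored = bytearray(hashlib.md5(data).digest())  # 16-byte hash kept in .content
stored[0] ^= 0x01                               # the hypothesised one-bit glitch

# fix: rebuild the block from the other disks plus parity ...
rebuilt = data                                  # ... which reproduces the same, good data
# ... then verify the rebuilt block against the stored (corrupted) hash:
ok = hashlib.md5(rebuilt).digest() == bytes(stored)
print(ok)  # False: correct data can never match a corrupted hash, hence ".unrecoverable"
```

No amount of rebuilding can ever satisfy the check, because the reference hash itself is what's wrong.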
To get things clean again ...
BUT, before doing so, I'd like to get a logfile ("on record", so to speak) for the fix command, so please do:
and attach to next reply. (Command output isn't needed.)
OK, safest way to put things right:
Be sure you have a good copy of this file OFF-array. Then delete the file from the array. Do a sync command. Copy the file back into the array. Do another sync command. Now do a

snapraid -p new scrub
Through this procedure, keep an eye on the output for any error reports, of course.
Going forward, be very watchful, and wary, since that glitch might NOT be a one-time fluke.
You could look into running a "memtest" type program; also a Prime95 blend test.
"Once bitten, twice shy." :)
Hi UhClem,
All done and fixlog attached, no errors occurred during the process.
Question about your thoughts on what happened, though. As I read it, you're saying that the hash for that block was entered into the database wrongly somehow, causing the error to happen when it tried to scrub that block; however, I always do a "-p new scrub" after inserting any data, and the whole array has also been fully scrubbed many times since that file was inserted.
Would that be pointing more to some other issue like memory problems then?
Thanks again for your help.
Ah-hah! Then, I must alter my "What I believe happened is ..." to:
At some point, long after that file was initially sync'ed (& -p new scrub'ed), either during the creation of the in-core database (when "Loading state from ...") or during its saving (when "Saving state to ..."), a bit got flipped in the 16-byte hash for that file's 33415th data block. And then, only when that data block was next scrub'ed (or was a "participant" in a sync) [i.e., a verification of its hash], would that original glitch have come to light, as a "Data error/hash mismatch".
Note that in both assessment scenarios, it points to a memory problem (the crime is the same, but the crime scene is different).