A few ideas to decrease risk of lost data when everything goes wrong

2017-04-04
  • Leifi Plomeros

    Leifi Plomeros - 2017-04-04

    In the following thread a snapraid user called Stuza ran into a series of events that all contributed to data loss:
    https://sourceforge.net/p/snapraid/discussion/1677233/thread/9f010abd/

    First problem (not really a snapraid problem but still relevant)

    A) The array contained a Drivepool pool, and Drivepool was set to auto-balance files between the disks.
    B) Snapraid data disks were configured at root level.
    C) Drivepool uses a long, seemingly random unique string for each disk in the pool, like this:
    F:\PoolPart.2-34-12-34-52-34-52-345-345-234\
    G:\PoolPart.56-745-744-35-23-453-425-34-256\

    A is by itself a very big problem for snapraid in case a disk is lost.
    A+B+C turned out to be even worse, since snapraid does not recognize moved files when they are in different folders.

    There is not really much that snapraid could do to prevent the above issues, except possibly being less strict about folder structure when identifying copied files, and perhaps adding a chapter to the FAQ about things to be aware of when combining snapraid with Drivepool.

    Second problem

    D) A significant number of small files had been updated (automatically, by a third-party tool) since the last sync.
    E) Only single parity was used

    This is one of the fundamental drawbacks of any snapshot RAID system compared to traditional RAID systems.
    But maybe this issue could actually be mitigated by snapraid adding a "hotfile" backup feature like this?

    In config:

    hotfilebackuplocation C:\HotFilesBackup\
    hotfile *.nfo
    hotfile *.doc
    hotfile *.xls
    hotfile \SomeFolder\
    

    Before sync, snapraid would then make a normal file backup, into the backup folder, of any files matching the hotfile patterns, and calculate parity from the files in the backup folder instead of the originals (this prevents the files from being unexpectedly updated during the sync).

    Later during fix, if one of the original files has been updated, a file from the backup folder could be used instead.

    Obviously this is not a small change, but at the same time I think it would have a really high probability of helping users avoid partial data loss.

    Possibly it could be combined with a warning message that the user should consider using a hotfile backup folder whenever an updated file is discovered in the array.
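    In the meantime, the proposed feature could be roughly approximated by a small pre-sync script. This is only a sketch of the idea, not anything snapraid provides: the directory names are hypothetical, and the patterns echo the config example above. The demo runs in temporary directories so it can be tried safely; for real use, point SRC at a data disk and BACKUP at the hotfilebackuplocation.

    ```shell
    #!/bin/sh
    # Sketch: copy "hotfiles" aside before a sync, so a later fix could
    # fall back to the copies. Demo paths only; adapt SRC and BACKUP.

    SRC=$(mktemp -d)        # stands in for a snapraid data disk
    BACKUP=$(mktemp -d)     # stands in for hotfilebackuplocation

    # Sample files standing in for array content.
    mkdir -p "$SRC/SomeFolder"
    echo meta > "$SRC/movie.nfo"
    echo doc  > "$SRC/SomeFolder/notes.doc"
    echo raw  > "$SRC/video.mkv"    # not a hotfile pattern; left alone

    # Copy every file matching the hotfile patterns, keeping folder structure.
    cd "$SRC" || exit 1
    find . -type f \( -name '*.nfo' -o -name '*.doc' -o -name '*.xls' \) |
    while IFS= read -r f; do
        mkdir -p "$BACKUP/$(dirname "$f")"
        cp -p "$f" "$BACKUP/$f"
    done

    ls -R "$BACKUP"
    # snapraid sync would run only after the backup completes.
    ```

    Unlike the in-snapraid version proposed above, this cannot make parity follow the backed-up copies, so it only covers the "file changed between backup and sync" window partially.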

    Third problem:

    F) It is currently too easy to accidentally run snapraid sync before completing all possible recovery strategies. (Especially when many users are running scheduled sync jobs via third party scripts)

    As it is today, snapraid only prevents sync if an entire disk (or all its content) is lost. In scenarios where the user has been able to recover only a small part of the data, all chances of additional recovery are lost if snapraid sync is accidentally started.

    I think it would be better if snapraid refused to sync, unless the user provides a forced sync parameter, if any of the following conditions are true:
    - More than 10% of files on a data disk are missing since the last sync.
    - The user has run snapraid fix since the last sync AND files are still missing.

     

    Last edit: Leifi Plomeros 2017-04-04
  • rubylaser

    rubylaser - 2017-04-04

    Thank you for this detailed writeup! I don't use SnapRAID on Windows or use Drivepool, but a section in the FAQ detailing the potential risks with Drivepool seems like a good idea. I also like the idea of a hotfile backup of some sort, and of requiring snapraid to be run with a force parameter. That said, a well-designed script should be able to mitigate most of this risk by syncing only when fewer than a minimum number of files are updated (set by the user) and fewer than a minimum number are deleted. I use 500 and 50 for those two numbers in my configuration.
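    The kind of guard script described here can be sketched as a wrapper around snapraid diff. This is an illustrative sketch, not rubylaser's actual script: it assumes snapraid diff prints one line per changed file prefixed with the action word (update, remove, add, ...), which should be verified against your SnapRAID version; the sample text below stands in for real diff output.

    ```shell
    #!/bin/sh
    # Sketch of a pre-sync guard: count updated and deleted files from
    # snapraid diff output and refuse to sync past the thresholds.

    MAX_UPDATED=500   # the two threshold numbers mentioned in the post
    MAX_DELETED=50

    # In a real run you would capture this with: diff_out=$(snapraid diff)
    # A small sample stands in for the diff output here (assumed format).
    diff_out='update d1/report.xls
    remove d1/old.nfo
    add d2/new.doc'

    updated=$(printf '%s\n' "$diff_out" | grep -c '^update ')
    deleted=$(printf '%s\n' "$diff_out" | grep -c '^remove ')

    if [ "$updated" -gt "$MAX_UPDATED" ] || [ "$deleted" -gt "$MAX_DELETED" ]; then
        echo "Too many changes ($updated updated, $deleted deleted); skipping sync" >&2
        exit 1
    fi

    echo "OK to sync: $updated updated, $deleted deleted"
    # snapraid sync   # would run here
    ```

    As John notes below, doing this outside snapraid leaves a small race between the diff and the sync, which is why building the check into snapraid itself would be safer.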

     
  • Ronald Wells

    Ronald Wells - 2017-04-05

    I think these are all good ideas! thanks for continuing to improve snapraid!

     
  • Adam Kalyvas

    Adam Kalyvas - 2017-04-05

    Great suggestions!

     
  • Stuza

    Stuza - 2017-04-13

    Agreed! Anything that helps stop an accidental sync is a great idea.

    btw - now got 2x 8tb parity disks .... balancing OFF

     
  • John

    John - 2017-04-17

    I finally managed to sit down and parse the well-organized first post.

    I don't know anything about Drivepool, but if it moves data around, obviously that isn't good. Maybe it could be worked around by having snapraid check for missing files everywhere (ignoring the name/folder structure); somehow I was under the impression there was already something similar (but I think after all it was just a feature request). In any case it would take a lot of time. Still, it would probably be a very good idea NOT to use a pooling solution that moves files around...

    The backup idea is good, but it has to be made integral to snapraid: you can't just go ahead and back up a folder, because by the time snapraid reads it (and relies on it for parity) the content might be different.

    Also, (F) is important. I have a feature request in one of the latest threads to do more or less the same. I know there are people with various scripts that run snapraid diff first and then sync only if not much has changed, but it's much better to have this integrated in snapraid (it's also safer, in case folders go missing between the snapraid diff and sync runs).

     
  • cannondale0815

    cannondale0815 - 2017-04-21

    Zack's script takes care of checking if too many files have been deleted and prevents syncing in that case (on Linux, anyway):
    https://zackreed.me/updated-snapraid-sync-script/

     
