
Wanting to start using SnapRaid but I have drives of varying sizes and at 100% full

Help
Bob Denero
2017-01-12
2017-01-16
  • Bob Denero

    Bob Denero - 2017-01-12

    I want to start using SnapRAID and have drives of varying sizes and some at 100% full.

    Filesystem      Size  Used Avail Use% Mounted on
    
    /dev/sda2        78G   31G   44G  42% /
    
    /dev/sde1       3.6T  3.4T   16G 100% /drive2
    /dev/sdd1       3.6T  3.0T  460G  87% /drive3
    /dev/sdc1       4.6T  4.3T   13G 100% /drive1
    /dev/sdb1       4.6T  3.9T  456G  90% /drive4
    

    Questions:
    1) What size parity drive should I get? Is 5 TB sufficient or do I need a larger one?
    2) SD[B-D] are USB3, but the data hardly gets modified or added to, just read. Is SATA really still suggested?
    3) With about 16 TB of data (zillions of files) and at near capacity, is this an issue?
    4) Is there a YouTube video or something that shows setup on drives with existing data? I really don't want to "pioneer" on live data and realize I misinterpreted an instruction that is now zapping terabytes of data.

    Thank you in advance

     
  • Leifi Plomeros

    Leifi Plomeros - 2017-01-12

    1:
    With the default block size of 256 KiB you typically need a parity disk with 1 GiB extra space per ~8,000 files compared to the "fullest" data disk (in the worst case, with all files being ultra tiny, you may need 1 GiB extra space per ~4,000 files).

    If you have 8,000 files on a data disk with a total file size of 5.00 TB then you need a parity disk that can hold 5.001 TB.

    So in most scenarios it makes more sense to have a parity disk of the same size as the largest data disk and just leave some empty space on the data disks, instead of adding a larger parity disk with 0.999 TB of unused space.

    2: No idea. I use usb disks for parity without problems.
    3: Yes, if you literally have zillions of data files you need a parity disk larger than the universe... Otherwise see answer 1 :-)
    4: No need to worry. SnapRAID only writes to data disks when you run the fix and touch commands. So, building the array is read-only for the data disks.

    Consider first adding only a small folder as a data disk before adding all the data disks. That way you can experiment without building terabytes of parity.
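
    The rule of thumb in answer 1 can be sketched as a small calculation. This is only an estimate under one stated assumption: each file wastes about half a parity block on average, and close to a full block in the worst case. The function name and parameters are illustrative and not part of SnapRAID.

```python
KIB = 1024
GIB = 1024 ** 3


def parity_overhead_bytes(file_count, block_size=256 * KIB, worst_case=False):
    """Estimate extra parity space needed beyond the fullest data disk.

    Assumption (not from the SnapRAID docs): on average half a block is
    wasted per file; in the worst case (tiny files) a whole block is wasted.
    """
    wasted_per_file = block_size if worst_case else block_size // 2
    return file_count * wasted_per_file


typical = parity_overhead_bytes(8_000)
worst = parity_overhead_bytes(4_000, worst_case=True)

print(f"typical, 8,000 files: {typical / GIB:.2f} GiB")
print(f"worst case, 4,000 files: {worst / GIB:.2f} GiB")
```

    Both cases come out at roughly 1 GiB, which matches the "1 GiB per ~8,000 files" and "1 GiB per ~4,000 files" figures above. Plug in the file count of your fullest data disk to get a ballpark for how much free space to leave.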

     

    Last edit: Leifi Plomeros 2017-01-12
  • Karsten Kruse

    Karsten Kruse - 2017-01-13

    What Leifi wrote. Adding to that, here are my 2 cents:

    1) 5 TB should be enough. Use tune2fs to disable reserved blocks on the parity drive; that should give you enough extra space for the SnapRAID overhead, assuming that you did not disable reserved blocks on your other disks. If you did, just make sure you leave a few GB free on your data disks.

    2) I started with all my disks in USB3 cases. It worked fine, albeit slower than SATA. I am not sure if this is a problem with USB3 or just my motherboard. The problem was that I couldn't spin them down at all, so I bought a proper computer case and have them in there now. Now only the disk I read from is spinning, which saves a bit of energy. My parity is still in a USB3 case.

    3) I recommend using a disk smaller than the universe, otherwise energy costs will be really high. If your budget is unlimited, an infinite energy bill is no problem, obviously.

    4) There is little chance of failure, even if something goes wrong. Worst case: You start writing parity to a data disk, thus run out of space and have to delete the parity file manually. Creating the initial parity will stress your disks a bit, so make sure you have your disks cooled at least a bit.
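
    To put a number on the reserved-blocks tip in point 1, here is a rough sketch. It assumes the parity drive uses an ext2/3/4 filesystem created with the default 5% reserved blocks, which the thread does not state. Setting the reservation to zero is typically done with `tune2fs -m 0` on the parity partition.

```python
TB = 10 ** 12
GB = 10 ** 9


def reserved_bytes(partition_bytes, reserved_percent=5.0):
    """Space held back by reserved blocks (assumed ext default: 5%)."""
    return int(partition_bytes * reserved_percent / 100)


# Hypothetical 5 TB parity partition, as discussed in the question.
freed = reserved_bytes(5 * TB)
print(f"space freed by disabling reserved blocks: {freed / GB:.0f} GB")
```

    Under that assumption the parity drive gains about 250 GB, far more than the roughly 1 GiB per 8,000 files of overhead mentioned earlier in the thread.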

    This is a config that could work for you (read the comments in the example config, they explain everything):

    # The parity disk is mounted to /parity
    parity /parity/snapraid.parity
    
    # And here are the precious data disks mounted
    data data01 /drive1
    data data02 /drive2
    data data03 /drive3
    data data04 /drive4
    
    # I like to keep a few copies of the content file
    content /parity/snapraid.content
    content /drive1/.snapraid.content
    content /drive2/.snapraid.content
    content /drive3/.snapraid.content
    content /drive4/.snapraid.content
    
    # Stuff to ignore, relative to each disk's mount point
    exclude *.unrecoverable
    exclude /tmp/
    exclude /lost+found/
    exclude /backups/
    

    Basically you build SnapRAID, read its documentation and put the config above into /etc/snapraid.conf. Then you run "snapraid sync".

     
  • Bob Denero

    Bob Denero - 2017-01-13

    Thank you for the feedback.
    1) I ordered an 8 TB drive as it was not much more than a 5 TB - overkill won't hurt (right?)
    2) Good to know USB is not a showstopper, just a performance hit.
    3) This is a tiny server, not a data farm (lol). The electric bill for the entire house is $100-$250 depending on the season, so power saving is not on my list of priorities, although spinning down to extend drive life is a nice idea.
    4) I have been reading and I now understand the concept of what is happening better.

    I do have another concern: my machine may be underpowered.

    4 GB RAM
    System:    Host: miniserver2 Kernel: 4.3.0-040300-generic x86_64 (64 bit gcc: 5.2.1)
               Desktop: Cinnamon 2.8.8 (Gtk 3.10.8~8+qiana) Distro: Linux Mint 17.3 Rosa
    Machine:   Mobo: ASUSTeK model: VM40B v: Rev 1.xx Bios: American Megatrends v: 1202 date: 03/24/2014
    CPU:       Dual core Intel Celeron 1007U (-MCP-) cache: 2048 KB
               flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 5986
               clock speeds: max: 1500 MHz 1: 1500 MHz 2: 1500 MHz
    Drives:    HDD Total Size: 18093.5GB (88.0% used) ID-1: /dev/sda model: OCZ size: 90.0GB
               ID-2: USB /dev/sdb model: Expansion_Desk size: 5001.0GB
               ID-3: USB /dev/sdc model: Backup+_Desk size: 5001.0GB
               ID-4: USB /dev/sdd model: My_Book_1230 size: 4000.8GB
               ID-5: USB /dev/sde model: Expansion_Desk size: 4000.8GB
    

    It also runs a MySQL database.

    I am wondering: will the first sync take weeks to generate parity? If so, can I still access the data in the meantime (it won't be updated)?
    If I do update or add files, will the next sync be quicker, or will it check every file for changes again?
    *The manual doesn't (clearly) state this.

     
  • Leifi Plomeros

    Leifi Plomeros - 2017-01-14

    You can run snapraid -T to make SnapRAID test how fast your CPU is for different operations:

    C:\Snapraid>snapraid -T
    snapraid v11.0 by Andrea Mazzoleni, http://www.snapraid.it
    Compiler gcc 4.9.3
    CPU GenuineIntel, family 6, model 60, flags sse2 ssse3 crc32 avx2
    Memory is little-endian 64-bit
    Support nanosecond timestamps with futimens()
    
    Speed test using 8 data buffers of 262144 bytes, for a total of 2048 KiB.
    Memory blocks have a displacement of 1792 bytes to improve cache performance.
    The reported values are the aggregate bandwidth of all data blocks in MB/s,
    not counting parity blocks.
    
    Memory write speed using the C memset() function:
      memset   22817
    
    CRC used to check the content file integrity:
       table    1367
       intel   10401
    
    Hash used to check the data blocks integrity:
                best murmur3 spooky2
        hash spooky2    5137   15181
    
    RAID functions used for computing the parity with 'sync':
                best    int8   int32   int64    sse2   sse2e   ssse3  ssse3e    avx2   avx2e
        gen1    avx2           14568   27596   48078                           56788
        gen2    avx2            4089    6731   20865   23736                   33385
        genz   avx2e            2327    3355   11194   11977                           20673
        gen3   avx2e     814                                   10086   11449           19851
        gen4   avx2e     650                                    7633    8870           16126
        gen5   avx2e     544                                    6505    7004           13001
        gen6   avx2e     420                                    5074    5955           10442
    
    RAID functions used for recovering with 'fix':
                best    int8   ssse3    avx2
        rec1    avx2    1216    3124    2912
        rec2    avx2     545    1405    1631
        rec3    avx2     113     702    1017
        rec4    avx2      72     459     710
        rec5    avx2      52     291     524
        rec6    avx2      40     224     376
    
    If the 'best' expectations are wrong, please report it in the SnapRAID forum
    

    In the above example it tells me that my CPU allows SnapRAID to sync (update parity) at 56788 MB/s with single parity and at 10442 MB/s with 6 parity levels. That is more than enough for me, since my disks never exceed 1500 MB/s combined.

    Fix (recovery) is much slower and can "only" process 2912 MB/s when a single level of parity is used and 376 MB/s when 6 levels of parity are used.

    When you sync, all data disks are read in parallel with the parity writes, so the time it takes is normally limited by the slowest/largest disk, unless you have another bottleneck.

    If limited only by disk speed, you can expect it to take less than a day to complete, unless you have chosen to use an SMR disk for parity, in which case it could take up to 3 days.

    After the first sync only the affected parts of the parity are updated, which means that if you add or remove a single file you can expect sync to complete in seconds.

     

    Last edit: Leifi Plomeros 2017-01-14
  • Bob Denero

    Bob Denero - 2017-01-16

    I got my 8 TB external USB 3 drive and partitioned it into 6 TB for parity and 2 TB for live data (which I will start using for active stuff).
    It took 18.5 hours to sync (create parity for) 15.8 TB of data (235 MB/s).
    I deleted a file, added a few small files and did a sync.
    I then modified a large file (150 GB), deleted a couple of files and added some pictures.
    It took 2 hours to sync the next time.
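
    As a quick sanity check on the first-sync figures, the average throughput can be recomputed from the reported size and duration. This sketch assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the helper name is illustrative.

```python
def throughput_mb_per_s(total_tb, hours):
    """Average throughput in MB/s for total_tb terabytes processed in hours."""
    total_bytes = total_tb * 1e12
    seconds = hours * 3600
    return total_bytes / seconds / 1e6


rate = throughput_mb_per_s(15.8, 18.5)
print(f"average sync throughput: {rate:.0f} MB/s")
```

    The result is within a few MB/s of the 235 MB/s quoted above; the small gap is what you would expect from rounding in the reported size and time. Note that this is the aggregate rate across all four data disks read in parallel.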

    I am happy

     
