Confusing Error when Syncing over LAN

Help
shag00
2015-01-11
2015-02-05
  • shag00

    shag00 - 2015-01-11

    Can anybody explain the following error (extract of terminal window):

    Initializing...
    Syncing...
    100% completed, 19698 MiB processed in 0:12
    Time for disk: 0% 0% 0% 0% 2% 0% 0% 0%
    Time for parity: 0% 2-parity:94%
    Everything OK
    Error synching parity file '/media/scott/Win1/snapraid.2-parity'. Input/output error.
    DANGER! Unexpected sync error in 2-Parity disk.
    Saving state to /var/snapraid/snapraid.content...

    Specifically, I do not follow how I get an EVERYTHING OK message immediately followed by a DANGER message.

    What I am attempting to do is add a second parity drive, located on a Windows 7 share on my LAN that I have just installed, to be accessed by SnapRAID running on an Ubuntu 14.10 machine. The Windows drive is mounted using CIFS in fstab.

    Of note, SnapRAID appears to do something when finishing a sync that is perhaps causing the problem: regardless of the size of the sync, it slows down (almost freezes) just before it is about to finish. As an example, I did a forced full sync of a 30TB array just prior to this sync (I got an earlier message that the disk was not large enough despite it being 4TB like parity 1), and after going for 30+ hours it hung right at the end. The extract above refers to a sync of a 3TB file I added, and right at the end it goes very slowly (10 minutes).

     
  • John

    John - 2015-01-11

    The beauty of having the source available. This is coming from:

    #if HAVE_FSYNC
            int ret;
    
            /* Ensure that data changes are written to disk. */
            /* This is required to ensure that parity is more updated than content */
            /* in case of a system crash. */
            ret = fsync(parity->f);
            if (ret != 0) {
                    /* LCOV_EXCL_START */
                    fprintf(stderr, "Error synching parity file '%s'. %s.\n", parity->path, strerror(errno));
                    return -1;
                    /* LCOV_EXCL_STOP */
            }
    #endif
    

    I would assume fsync doesn't work over network shares; you can set this in config.h:

    #define HAVE_FSYNC 0

    and recompile.

     

    Last edit: John 2015-01-11
  • shag00

    shag00 - 2015-01-11

    John,

    I really appreciate you posting a reply although it is as clear as mud to me (the issues being at my end I realise). To check I understand what you are saying, the above code is a (stand alone?) bash script to run on unix and the blue lines should be commented out (#)?

    I also have no idea what HAVE_FSYNC should be or is.

    To add some more information parity 1 is on the unix machine while the new parity 2 is on windows. Also, after the danger message the parity 2 drive disappears from windows file explorer (and unix) and can only be seen again after a windows reboot.

    At the moment I am doing a snapraid check, which won't finish for another 15-20 hours; however, I did check the file sizes of both parity files and they are identical.

     
  • John

    John - 2015-01-12

    That isn't a bash script; it is the actual source code that runs on your machine and gives you that error message. Never mind the details of the actual content (there might also be some comments that I messed up while trying to format it decently for the forum); the point is that:

    • the error comes from calling that fsync C library function which does more or less what the name says
    • this can be disabled (I think at compile time), so you or somebody else can build a version of the software that doesn't try to do this fsync (the reason to do so is obvious if the device where the file lies doesn't support fsync)

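    To check whether a given mount accepts fsync at all, independently of SnapRAID, a small probe along these lines could be used. This is only a sketch and not part of SnapRAID: the function name probe_fsync is made up for illustration, and the path you pass in (for example a scratch file on the CIFS mount) is your own choice.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not SnapRAID code: write a few bytes to `path`
 * and ask the kernel to flush them with fsync().
 * Returns 0 if everything succeeded, -1 on any error. */
static int probe_fsync(const char *path)
{
    const char msg[] = "fsync probe\n";
    int fd;

    assert(path != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        fprintf(stderr, "Error opening '%s'. %s.\n", path, strerror(errno));
        return -1;
    }

    if (write(fd, msg, sizeof(msg) - 1) != (ssize_t)(sizeof(msg) - 1)) {
        fprintf(stderr, "Error writing '%s'. %s.\n", path, strerror(errno));
        close(fd);
        return -1;
    }

    /* Same call that fails in the sync output quoted in this thread. */
    if (fsync(fd) != 0) {
        fprintf(stderr, "Error synching '%s'. %s.\n", path, strerror(errno));
        close(fd);
        return -1;
    }

    if (close(fd) != 0) {
        fprintf(stderr, "Error closing '%s'. %s.\n", path, strerror(errno));
        return -1;
    }

    return 0;
}
```

    Calling probe_fsync() from a small main() with a scratch file on the share should show whether fsync itself returns "Input/output error" there. A tiny test file may well succeed where a multi-terabyte parity file does not, so a passing probe does not rule out a network or server problem.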
    HOWEVER, if the share disappears as it does in your case, there might be some other issue not related to SnapRAID. I don't know enough about your setup to make any useful comment about where this might be coming from. With such a big array, can't you give up one data disk, use it for parity, and get rid of the "parity over network" setup?

     
  • shag00

    shag00 - 2015-01-13

    John, again thanks for your reply to help me understand things software.

    The reason for my investigating these options comes about because of my hardware setup, a single case and 10 sata ports. I have also investigated improvements there, which it looks like I will need to implement. I was looking for an easy way out to expand, leveraging the power of a LAN, especially as I have powerful PCs attached that would be more fully utilised.

    I do not feel sufficiently confident to compile custom software so I will skip that option. Having said that, I feel that "parity over network" is a good idea which over time I suspect more people would implement because for domestic users like myself your array fills up over the years and you just want all your movies in a single computer case without extra boxes and racks. As I have found, it all comes down to how many sata ports are on your motherboard and nearly everyone has a LAN and multiple PCs with spare sata ports...

    I placed an earlier post some time ago asking if this were possible, to which I received a positive response, and so decided to proceed. I was warned this setup would be slow, but actually the difference is marginal as my LAN speed is close to the maximum write speed of my disks. It took around 30 hours to do a fresh sync writing to the Windows share folder, which is very similar to what I would expect it to take if just writing to the Ubuntu disk.

    Anyway, thanks again.

     
  • Andrea Mazzoleni

    Hi shag00,

    It's true that an fsync() operation is not expected to fail on a network disk. It could be the result of some kind of malfunction in the network.

    Anyway, if you want to disable those fsync operations, you could use the mount option "nostrictsync" in Linux.
    See: https://www.kernel.org/doc/readme/Documentation-filesystems-cifs-README
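    As a rough sketch, an fstab entry using that option might look like the line below. The server name (WINPC), share name (parity2) and credentials file path are placeholders, not values from this thread; only the mount point is taken from the error message quoted above.

```
# /etc/fstab (placeholder server, share and credentials file)
//WINPC/parity2  /media/scott/Win1  cifs  credentials=/etc/cifs-credentials,nostrictsync  0  0
```

    With nostrictsync the CIFS client does not ask the server to flush on fsync, so data may still be sitting in the server's cache when SnapRAID believes the parity is safely written.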

    Please let me know if you solve this.

    Ciao,
    Andrea

     
  • Mickey Batman

    Mickey Batman - 2015-01-29

    shag00, I think it's better to have the parity drives directly attached to your storage server, i.e. via SATA/SCSI/USB, and data drive(s) over LAN.

    The reason is that SnapRAID makes use of system features only found on a native filesystem: http://snapraid.sourceforge.net/faq.html#fs

     

    Last edit: Mickey Batman 2015-02-01
  • shag00

    shag00 - 2015-01-30

    Thanks for your replies gents, this is what I did: went and bought 6 feet of aluminum bar and built myself a custom drive case to fit between my PSU and existing drive bays and stuck 4 new HDDs in there. Looked at Leon Lai drive bays but they were just a smidgeon too wide. So LAN parity out, new SATA controller with local parity in. I want to do some renaming before I add the 2nd parity, but that will be in the next few days.

    Will still hope for LAN support in a future version but an extra 12TB should last me a little while.

     
  • Mickey Batman

    Mickey Batman - 2015-02-01

    I've just remembered there's a not-so-widely-used feature in the Linux kernel, that may let you have the parity drives (or any drives) over LAN. Personally I haven't used it before, and I haven't heard of anyone on this forum using it.

    https://en.wikipedia.org/wiki/Network_block_device

     
  • shag00

    shag00 - 2015-02-05

    If I read this correctly this requires both machines be linux:

    "Availability
    The network block device client module is available on Linux and GNU Hurd.
    Since the server is a userspace program, it can potentially run on every Unix-like platform. It was ported to Solaris.[1]"

    My second PC is Windows.

     
