snapraid check question

Help
Jason King
2016-08-30
2016-09-12
  • Jason King

    Jason King - 2016-08-30

    Lately I've been getting errors from the check command but diff and status report none. Sync seems to work well and gives no errors.

    I logged the output and I'm getting this error:
    parity_error:69223:parity: Data error, diff bits 5858

    I can do a fix and then the next check will report no errors, but in a week or so I'll have more errors again.

    What's going wrong that's causing these errors to happen? Any ideas would be appreciated.
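Since the errors come back every week or so, it may help to record which positions are reported on each check and see whether the same ones recur. A minimal Python sketch follows. It assumes every logged error has the same shape as the single line quoted above (`parity_error:<number>:<name>: Data error, diff bits <count>`); that shape is inferred from that one line only, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, inferred only from the one line quoted in the post:
#   parity_error:<number>:<name>: Data error, diff bits <count>
LINE_RE = re.compile(r"^parity_error:(\d+):([^:]+): Data error, diff bits (\d+)$")


def parse_parity_errors(lines):
    """Return (number, name, diff_bits) tuples for lines matching the assumed shape."""
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            found.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return found


def recurring_positions(runs):
    """Map each first-field number to how many runs reported it, keeping only repeats.

    `runs` is a list with one parsed error list per check run.
    """
    counts = Counter()
    for run in runs:
        # Use a set so a number is counted at most once per run.
        counts.update({number for number, _, _ in run})
    return {number: seen for number, seen in counts.items() if seen > 1}
```

If the same numbers keep showing up, that would point at one fixed location; if they are different every time, something is more likely corrupting data in transit or during sync.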

     
  • Leifi Plomeros

    Leifi Plomeros - 2016-08-30

    The usual suspects are bad RAM, bad controller card, bad cables or power outage during sync (power outage during sync is only a candidate when the error is in parity).

    If it is always the same disk, you can swap its cables (including the power cable) with another HDD's to see if the problem moves to the other disk. If you swap cables with a disk on another controller, this also covers the possibility of a bad controller.

    To test the RAM you could start with a tool that works while the operating system is running (such as HCI Design MemTest), but such tools are not as reliable as memtest86 or memtest86+, which run without an OS. The downside is that you can't use the computer for several hours to several days during the test, which is probably not much fun.

    Most likely it is not an error in the hard drive itself, since drives have built-in checksums and should refuse to deliver data rather than deliver incorrect data. I'm fairly sure that any combination of modern hardware and OS would make the user aware of such a situation.

     

    Last edit: Leifi Plomeros 2016-08-30
    • Jason King

      Jason King - 2016-09-12

      All good ideas, but this is a virtual environment, so the hardware issues you suggest won't be present the way they would be on a physical machine. I should have put that in the first post; I apologize for leaving it out.

       
      • Leifi Plomeros

        Leifi Plomeros - 2016-09-12

        Bad RAM in the host machine could still be the problem unless the host machine is using ECC RAM.
        But assuming that the RAM or any other hardware is not the problem, I would focus on ruling out that you have unexpected terminations of snapraid during sync.

        Is sync being done as a scheduled job? Is it possible that something else that is scheduled interrupts it, or that the virtual machine sometimes runs out of memory?
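One way to check for unexpected terminations is to have the scheduled job call a small wrapper that writes a START record before the command and an END record with the exit code after it. A START with no matching END then means the run (or the whole VM) died in between. This is only a sketch: the function names, the log format and the log path in the usage note are made up for illustration and are not part of snapraid.

```python
import datetime
import subprocess


def _stamp():
    """Current local time as an ISO string without spaces."""
    return datetime.datetime.now().isoformat(timespec="seconds")


def run_logged(cmd, log_path):
    """Run cmd (a list of arguments), logging START and END records to log_path.

    Returns the exit code. On POSIX a negative code means the process
    was killed by a signal.
    """
    with open(log_path, "a") as log:
        log.write(f"{_stamp()} START {' '.join(cmd)}\n")
    # The file is closed (and so flushed) before the command starts, so the
    # START record is already written if the command is interrupted.
    result = subprocess.run(cmd)
    with open(log_path, "a") as log:
        log.write(f"{_stamp()} END exit={result.returncode}\n")
    return result.returncode


def unfinished_runs(log_path):
    """Count START records that were never followed by an END record."""
    open_run = False
    missing = 0
    with open(log_path) as log:
        for line in log:
            parts = line.split()
            if len(parts) < 2:
                continue
            if parts[1] == "START":
                if open_run:
                    missing += 1  # previous run never logged its END
                open_run = True
            elif parts[1] == "END":
                open_run = False
    return missing + (1 if open_run else 0)
```

The scheduled job would then call something like `run_logged(["snapraid", "sync"], "/var/log/snapraid-runs.log")` (hypothetical path), and `unfinished_runs` on the same file shows whether any sync was cut short. A non-zero exit code in an END record is also worth looking at.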

         
