Lately I've been getting errors from the check command but diff and status report none. Sync seems to work well and gives no errors.
I logged the output and I'm getting this error:
parity_error:69223:parity: Data error, diff bits 5858
I can do a fix and then the next check will report no errors, but in a week or so I'll have more errors again.
What's going wrong that is causing these errors to happen? Any ideas would be appreciated.
The usual suspects are bad RAM, a bad controller card, bad cables, or a power outage during sync (a power outage during sync is only a candidate when the error is in the parity).
If it is always the same disk, you can swap cables (including the power cable) with another HDD to see whether the problem moves to the other disk. Swapping cables with a disk on another controller also covers the possibility of a bad controller.
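To see whether it really is always the same disk (or the same blocks), it can help to compare the saved check logs from different weeks. The following is only a rough sketch: the line format is an assumption inferred from the single parity_error line quoted in the question, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, inferred only from the one line quoted in the question:
#   parity_error:<block>:<parity name>: <message>
PARITY_ERROR = re.compile(r"^parity_error:(\d+):([^:]+):\s*(.*)$")


def parity_error_blocks(lines):
    """Count how often each (parity name, block) pair appears in a check log."""
    counts = Counter()
    for line in lines:
        match = PARITY_ERROR.match(line.strip())
        if match:
            block, name = int(match.group(1)), match.group(2)
            counts[(name, block)] += 1
    return counts


def repeated_blocks(*logs):
    """Return the (parity name, block) pairs that show up in every given log."""
    seen = [set(parity_error_blocks(log)) for log in logs]
    return set.intersection(*seen) if seen else set()
```

Each argument to repeated_blocks is the list of lines from one saved log. If the same blocks keep coming back after a fix, that points at one location on the parity disk; if the blocks are different every week, RAM or interrupted syncs look more likely.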
To test the RAM you could start with a tool that runs inside the operating system (such as HCI Design MemTest), but such tools are not as reliable as memtest86 or memtest86+, which run without an OS. The downside is that you cannot use the computer for the several hours to several days that the test takes.
Most likely it is not an error in the hard drive itself, since drives have built-in checksums and should refuse to deliver data rather than deliver incorrect data. I'm fairly sure that any combination of modern hardware and OS would make the user aware of such a situation.
Last edit: Leifi Plomeros 2016-08-30
All good ideas, but this is a virtual environment, so the hardware issues you suggest aren't going to show up the way they would on a real computer. I should have put that in the first post; I apologize for leaving it out.
Bad RAM in the host machine could still be the problem, unless the host machine is using ECC RAM.
But assuming that the RAM or any other hardware is not the problem, I would focus on ruling out unexpected terminations of snapraid during sync.
Is sync run as a scheduled job? Is it possible that something else scheduled interrupts it, or that the virtual machine sometimes runs out of memory?
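One way to rule out interrupted or overlapping syncs is to run the scheduled job through a small wrapper that takes a lock and records when the command started, when it ended, and its exit code. This is a minimal sketch, assuming a Linux or other POSIX guest (fcntl is not available on Windows); run_logged, stamp and the paths mentioned below are hypothetical and not part of SnapRAID.

```python
import datetime
import fcntl
import subprocess


def stamp():
    """Current local time as an ISO 8601 string, to the second."""
    return datetime.datetime.now().isoformat(timespec="seconds")


def run_logged(cmd, log_path, lock_path):
    """Run cmd under an exclusive lock and record start, end and exit code.

    Returns the exit code, or None if another run already holds the lock.
    """
    with open(lock_path, "w") as lock, open(log_path, "a") as log:
        try:
            # Non-blocking: fail immediately if a previous run is still going.
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            log.write(f"{stamp()} SKIPPED (lock held): {' '.join(cmd)}\n")
            return None
        log.write(f"{stamp()} START {' '.join(cmd)}\n")
        log.flush()  # make sure START is written even if we are killed later
        code = subprocess.run(cmd).returncode
        # On POSIX a negative code means the child was killed by that signal.
        log.write(f"{stamp()} END exit={code}\n")
        return code
```

The scheduled job would then call something like run_logged(["snapraid", "sync"], "/var/log/snapraid-wrapper.log", "/tmp/snapraid-wrapper.lock"), with both paths chosen to suit the system. A START line with no matching END means the wrapper itself died (for example the virtual machine was shut down), a negative exit code means the sync was killed by a signal, and a SKIPPED line means two scheduled runs overlapped.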