From: Marco M. <mar...@gm...> - 2018-05-11 09:20:46
On 05/11/2018 03:30 AM, Gandalf Corvotempesta wrote:
> On Fri, 11 May 2018 at 03:26, Marco Milano <mar...@gm...> wrote:
>> It is also checked at every read. If the CRC doesn't match,
>> it will self-heal from the good copy.
>> So, CRC is not useless. CRC testing is an additional layer of protection
>> on top of "check CRC on every read". It is not the "only" protection.
>
> It is not the same.
> If you have huge storage with mostly unaccessed data, like archive
> storage or backup storage, most of the stored data won't be accessed
> for years, and you won't be able to detect if something is going wrong
> on your disks.
>
> It's like any RAID: every RAID is able to detect read errors and act
> accordingly by reading from a different replica or by reconstructing data
> from the parity.
> But without a monthly/weekly scrub of the whole array, what would happen if
> you have a bad sector affecting a file that you haven't read in months,
> and then a disk fails? When you are rebuilding the failed disk, you'll get
> the URE on the "good" disk, resulting in data loss (data can't be
> reconstructed, because the "good" disk has an unreadable sector) and a
> punctured RAID (this has happened to me many times in the past).
>

The scenario you describe is only likely to happen if you have only two
copies and lots of unaccessed data. If you keep 3 or 4 or more copies, it is
not likely to happen.

You can use 8+2 in version 4, which has the same redundancy level as 3 copies
(either one survives the loss of two parts), but the space overhead is only
25% of the data size. If you are using 2 copies now, the space overhead is
100% of the data size (half of your raw space holds the second copy), so 8+2
will give you more protection with much less space overhead. The only catch
with EC introduced in v4 is that you have to have a lot of chunkservers:
12 chunkservers if you use 8+2.

If you are losing sleep over your data currently, you can write a very simple
script that will read your entire filesystem in a loop, so that the CRC check
on every read also reaches data that nobody accesses.
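As a minimal sketch of that "read everything in a loop" idea, the Python below walks a mount point and reads every regular file, so the per-read CRC check gets a chance to run on data that is otherwise never touched. The function names `read_tree` and `read_loop`, the one-hour pause, and the mount point in the usage note are assumptions for illustration only; nothing here is a tool that ships with the filesystem.

```python
import os
import time


def read_tree(root, block_size=1 << 20):
    """Read every regular file under root once.

    Returns (files_read, bytes_read, errors), where errors is a list of
    (path, message) pairs for files that could not be read completely.
    """
    files_read = 0
    bytes_read = 0
    errors = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, sockets, FIFOs and other non-regular files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    # Read in blocks and discard the data; only the act
                    # of reading matters here.
                    while True:
                        chunk = fh.read(block_size)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
                files_read += 1
            except OSError as exc:
                errors.append((path, str(exc)))
    return files_read, bytes_read, errors


def read_loop(root, pause_seconds=3600, passes=None):
    """Call read_tree repeatedly; passes=None means loop forever.

    Returns the number of completed passes (only reachable when passes
    is not None).
    """
    done = 0
    while passes is None or done < passes:
        files, nbytes, errors = read_tree(root)
        print(f"pass {done + 1}: {files} files, {nbytes} bytes, "
              f"{len(errors)} read errors")
        for path, message in errors:
            print(f"  could not read {path}: {message}")
        done += 1
        time.sleep(pause_seconds)
    return done
```

To run it, call something like `read_loop("/mnt/mfs")`, where `/mnt/mfs` is a hypothetical mount point; replace it with your own. Keep in mind that client-side caching may satisfy some reads without touching the chunkservers, so treat this as a complement to the built-in CRC testing rather than a replacement for it.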
If you have disk space, you can also set your copies to 3 for now and convert
to 8+2 later. It is possible to do up to 8+8 in v4, which has a space overhead
of 100% of the data size (8 parity parts for every 8 data parts) but is the
equivalent of 9 copies, since it survives the loss of 8 parts.

The bottom line is that there are many ways to safely protect your data, both
in version 3 and in version 4. If you provide more info about your setup, such
as your hardware configuration, your data size, and your usage, the community
can provide better suggestions.

--
Marco