From: Christian F. <Chr...@t-...> - 2008-10-30 12:09:14
Kai Schaetzl wrote:
> I have two Samsung HD502IJ disks in a software RAID 1 array on CentOS 5.
> There are two mirrored partitions (/dev/md0, /dev/md1) on each, a
> small one for the xen hypervisor system and a large one with the
> remainder of the disk that is managed by LVM to get a bunch of smaller
> partitions for xen guests.
> Both disks show one Offline_Uncorrectable error at different offsets.
> ...
> The short offline tests then found the Offline_Uncorrectable errors
> and I also started getting two pending sectors in sda in the smartd
> messages.
> However, these disappeared after some time and only the two
> Offline_Uncorrectable errors remain.
> ...
> I read this thread
> http://sourceforge.net/mailarchive/message.php?msg_id=Pine.LNX.4.64.0806270653350.5844%40gc.phys.uwm.edu
> which suggests I could copy over good data from the other disk, but
> it's not clear to me at all how I find out where exactly the problem
> is and how I copy the correct data over.
>

AFAIK, the Linux software RAID does this for you if it encounters a
bad block on one of the disks:

http://lxr.linux.no/linux+v2.6.27/drivers/md/raid1.c#L1621

So a raw read through the RAID driver may force the reallocation -
with a probability of 50% :-) since each read is served from only one
of the two mirrors
(e.g. 'ddrescue -v /dev/md0 /dev/null read.log')

Note: Some older Samsung disks (at least SP1614C from P80 series) do
not increment Reallocated_Sector_Ct and do not reset
Offline_Uncorrectable on bad sector reallocation. I don't know whether
this is the case for T- or F1-Series disks.

Cheers,
Christian
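
For readers without ddrescue, the raw read through the RAID device can be sketched roughly as below. This is only an illustration of the idea, not a replacement for ddrescue: the function name scan_for_read_errors and its parameters are made up for this example, and it assumes that a read the md driver could not satisfy from either mirror surfaces as an I/O error. Reads answered from the page cache never reach the disks, so a real run may need the cache dropped first.

```python
import os


def scan_for_read_errors(path, chunk_size=1024 * 1024):
    """Read `path` from start to end, discarding the data.

    Returns (bytes_read, bad_offsets). bad_offsets lists the start
    offsets of chunks whose read raised an I/O error.
    Hypothetical helper for illustration only.
    """
    bad_offsets = []
    bytes_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end yields the size for regular files and
        # for block devices such as /dev/md0.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, want, offset)
            except OSError:
                # Unreadable chunk: remember where it starts, skip it.
                bad_offsets.append(offset)
                offset += want
                continue
            if not data:
                break  # unexpected end of file
            bytes_read += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return bytes_read, bad_offsets
```

Run as root, scan_for_read_errors('/dev/md0') would walk the whole array once. An empty bad_offsets list only means the RAID layer delivered every chunk; it does not tell you which mirror served each read, so more than one pass may be needed before the bad sector on a given disk is touched.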