From: Bruce Allen <ballen@gr...> - 2002-12-30 05:37:01
First off, thanks a lot for moderating!
On Mon, 30 Dec 2002, Frédéric L. W. Meunier wrote:
> On Sun, 29 Dec 2002, Bruce Allen wrote:
> > The ECC_Recovered raw values look fine. On the Maxtor disks that we have
> > on our computing cluster, some of the disks have values in the range
> > around 10^10 (these are 6-byte integers!). But from your disk it looks
> > as if they have changed this raw value to be a rate -- it goes up and
> > down...
> Device Model: MAXTOR 6L060J3
> Serial Number: 663200252994
> Firmware Version: A93.0500
> 195 Hardware_ECC_Recovered 0x001a 100 006 000 Old_age -
> The strange (?) thing is that I get the WORST (and very low VALUE)
> numbers after a weekly updatedb. hdparm -tT /dev/hda will always make
> VALUE go to 100 again.
OK, here's what I think is going on. Remember, I am not an expert, merely
an enthusiastic amateur.
First, ECC refers to the error-correcting code that is part of what
makes hard disks and CDs work. When reading bits off the media, there is
always some fraction of bits that are read in error. Fortunately, the
disk stores redundant information that can be used to detect and correct
these errors. Typically the amount of redundant info is chosen to give
an undetected error rate of about one bit in 10^14 bits read.
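
To put that 10^14 figure in perspective, here is a rough back-of-the-envelope
sketch. The 60 GB capacity (decimal gigabytes) is my assumption for
illustration and is not taken from the smartctl output above; only the
one-in-10^14 rate comes from the text.

```python
# Assumption for illustration only: a 60 GB drive, counted in decimal gigabytes.
DISK_BYTES = 60 * 10**9

# One undetected bit error per 10^14 bits read, the figure quoted in the text.
UNDETECTED_BIT_ERROR_RATE = 1e-14


def expected_undetected_errors(bytes_read, rate=UNDETECTED_BIT_ERROR_RATE):
    """Expected number of undetected bit errors when reading bytes_read bytes."""
    return bytes_read * 8 * rate


full_pass = expected_undetected_errors(DISK_BYTES)
passes_per_error = 1 / full_pass

print(f"expected undetected errors per full read: {full_pass:.4f}")
print(f"full reads per expected undetected error: {passes_per_error:.0f}")
```

Under those assumptions a single full read of the disk gives an expectation
of well under one undetected bit error, so you would need a couple of hundred
full passes before expecting even one.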
Now, when you have finished doing an updatedb, having essentially read the
entire disk, the rate of detected/corrected errors is probably at a maximum.
And when you have done the hdparm -tT test, the final part of the test,
which reads from a disk buffer (not from the media), reduces the error rate
back to zero.
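
The guess above can be written down as a toy model. This is only a sketch of
my hypothesis, not how the Maxtor firmware actually computes the attribute:
the class name, the window size, and the per-read correction count are all
made up for illustration.

```python
from collections import deque


class WindowedErrorRate:
    """Toy model: average ECC corrections per read over the last `window` reads."""

    def __init__(self, window):
        # Old samples fall off the left end once the window is full.
        self.samples = deque(maxlen=window)

    def record_read(self, corrected_errors, from_media=True):
        # A read served from the drive's buffer never touches the media,
        # so in this model it contributes zero corrections.
        self.samples.append(corrected_errors if from_media else 0)

    def rate(self):
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)


model = WindowedErrorRate(window=100)

# Heavy media reads (updatedb-like): every read needs some correction.
for _ in range(100):
    model.record_read(corrected_errors=3)
after_media = model.rate()

# Reads served from the buffer (like the cached part of hdparm -tT).
for _ in range(100):
    model.record_read(corrected_errors=0, from_media=False)
after_buffer = model.rate()

print(after_media, after_buffer)
```

In this model the rate peaks after a long run of media reads and falls back
to zero once enough buffer-only reads have pushed the older samples out of
the window, which is the same up-and-down pattern described above.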
Moral of the story: you guys probably shouldn't worry so much about this