At 06:49 AM 2/22/2003, Bruce Allen wrote [edited response]:
> Can someone explain what several columns are?  What are the flags and what
> do they mean?

The flags are "proprietary" or "vendor-specific", meaning that they don't
have a fixed meaning given by the ATA/SMART specs.  However, from
historical usage, the least significant bit of the flag indicates
Pre-fail (versus Usage) attributes, as indicated in the TYPE column. IBM
does document a couple of the other bits, but as far as I know they are
the only manufacturer to do so.
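
For anyone else reading along, here is a minimal Python sketch of that one bit, checked against the two flag values that appear in this message (0x0003 and 0x0022).  The names PREFAIL_BIT and attribute_type are my own for illustration, not anything from smartmontools, and the other bits are deliberately left uninterpreted since they are vendor-specific.

```python
# Sketch only: decode the single flag bit that has a conventional meaning.
# PREFAIL_BIT and attribute_type are hypothetical names, not smartmontools API.
PREFAIL_BIT = 0x0001  # least significant bit of the FLAG column


def attribute_type(flags: int) -> str:
    """Return the TYPE column text implied by the least significant flag bit."""
    return "Pre-fail" if flags & PREFAIL_BIT else "Old_age"


# Flag values taken from the two attribute lines quoted in this message.
for flags in (0x0003, 0x0022):
    print(f"0x{flags:04x} -> {attribute_type(flags)}")
```

Both results agree with the TYPE column shown for those attributes (Pre-fail for Spin_Up_Time, Old_age for Temperature_Celsius).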

> What is the difference between VALUE and RAW_VALUE?

Again, please read the smartctl man page.

I had looked for this information in the man page before and not found it, so I re-read the whole thing in detail and finally found a brief mention under the discussion of the -A option, where I had not expected to find it.  Might I suggest breaking this out into its own section?

What I did find says:

   Each Attribute has a 'Raw' value, printed under the
   heading  'RAW_VALUE',  and  a  'Normalized'   value
   printed under the heading 'VALUE'.  ...
   Each  vendor  uses  their own magic to convert this
   Raw value to a Normalized value in the range from 1
   to 254.

   Note  that  the  conversion  from  'Raw' value to a
   quantity with physical units is  not  specified  by
   the  S.M.A.R.T. standard.

Expounding on this: is the raw value what the disk actually reports, which smartmon then interprets via some unspecified logic to produce a "Normalized" (how?) result that it displays as VALUE, having to guess because each manufacturer converts differently and the conversion isn't part of the SMART spec?  Or does the manufacturer do the conversion, and smartmon just reads the result from the drive and reports it?  What about the THRESH and WORST values?  Does smartmon produce them or read them from the disk?

> What does it mean when the threshold is zero?

This means that the attribute can never fail, since the attribute fails if
its value is less than or equal to the threshold, and the minimum
attribute value is 1.

So those attributes are only for "tracking purposes" or informational purposes?  What about something like

  3 Spin_Up_Time            0x0003   072   070   000    Pre-fail     -       0

I assume that VALUE/WORST are the only attribute values that have a minimum of 1, since the THRESH is obviously zero here, as is the raw value.  So how do I interpret this pre-fail attribute with a 0 threshold?
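
Applying the quoted rule literally to that line, here is a small Python sketch of the comparison as I understand it.  The function name has_failed and the constant MIN_NORMALIZED are hypothetical names of mine; the rule itself (fail when the value is less than or equal to the threshold, minimum value 1) is taken from the reply above.

```python
# Sketch only: the failure rule as stated in the quoted reply.
# has_failed and MIN_NORMALIZED are illustrative names, not smartmontools API.
MIN_NORMALIZED = 1  # smallest normalized value, per the reply above


def has_failed(value: int, thresh: int) -> bool:
    """An attribute fails when its normalized value is <= its threshold."""
    return value <= thresh


# Spin_Up_Time from the line above: VALUE 072, WORST 070, THRESH 000.
print(has_failed(72, 0))              # current value against threshold
print(has_failed(70, 0))              # worst recorded value against threshold
print(has_failed(MIN_NORMALIZED, 0))  # even the minimum value does not fail
```

All three comparisons come out False, which is what "can never fail" means in practice for a zero threshold.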

> 194 Temperature_Celsius     0x0022   083   079   042    Old_age      -       44
> Is this telling me the temperature is 83 degrees Celsius?  I doubt that ...

It probably IS the temperature attribute.  Again, as explained in the
smartctl man page, 44 Celsius is the drive temperature (internally, as
reported by the drive.)  The normalised value of this quantity is 83. 
This value has (at some point in the past) had value 79, which means that
the drive has in the past been hotter than it is now.  In order to "fail"
the usage threshold the normalized value has to drop to 42 or lower.

OK, thank you for the explanation. 

As I understand it, then, the normalized values are merely an algorithmic representation of the raw data, and the raw data itself may have several meanings depending on the manufacturer.  These algorithmic values have no real-world representation (as the raw value might, but doesn't always), and are merely an abstract measure of how close that tracked drive attribute is to an "incorrectly operating" failure level.  The value/worst/thresh can only be compared to each other, and only for a given attribute, and the values are derived through a "magical manufacturer conversion" of the raw value.  This answers my question above: smartmon does not calculate these values, but merely reads them from the drive.

I don't really see that explanation in the man page, and it could probably use some additional discussion of the topic to make this clear.
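
To illustrate the kind of reading I mean, here is a rough Python sketch that splits one attribute line, in the column layout shown in this message, and compares VALUE and WORST against THRESH.  The names Attribute and parse_attribute are mine, the field name when_failed for the "-" column is an assumption, and other smartctl versions may print different columns.

```python
# Sketch only: parse one attribute line in the layout quoted in this thread.
# Attribute, parse_attribute and the when_failed field name are assumptions.
from typing import NamedTuple


class Attribute(NamedTuple):
    attr_id: int
    name: str
    flags: int
    value: int        # normalized value, read from the drive
    worst: int        # lowest normalized value recorded so far
    thresh: int       # failure threshold for this attribute
    attr_type: str    # "Pre-fail" or "Old_age"
    when_failed: str  # "-" in the lines quoted here
    raw: str          # vendor-specific raw value, kept as text


def parse_attribute(line: str) -> Attribute:
    """Split a whitespace-separated attribute line into named fields."""
    f = line.split()
    return Attribute(int(f[0]), f[1], int(f[2], 16), int(f[3]), int(f[4]),
                     int(f[5]), f[6], f[7], " ".join(f[8:]))


line = "194 Temperature_Celsius     0x0022   083   079   042    Old_age      -       44"
attr = parse_attribute(line)

# Only value, worst and thresh of the same attribute are comparable.
print("margin now:     ", attr.value - attr.thresh)
print("margin at worst:", attr.worst - attr.thresh)
print("raw (reported): ", attr.raw)
```

The raw field is left as a string on purpose, since its meaning and units depend on the manufacturer.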

> Next, on several of the drives, I get this in the headers:
> Seagate ST340016A: ATA Standard is:  Unrecognized. Minor revision code: 0x00
> WDC WD1000BB-75CHE0: ATA Standard is:  Unrecognized. Minor revision code: 0x00
> What does this mean?

It means that Seagate/WDC is not completely obeying the ATA spec.  The ATA
spec for ATA-5 (which is what the Seagate drive is) actually consists of a
dozen different revision levels.  The manufacturer can put a non-zero
number in the minor revision code, which indicates (indirectly) which of
these different revisions is the one that the drive obeys.  But Seagate
hasn't bothered to do this.

So this isn't an error/problem in the drive, but merely an unsupported feature.  Maybe there's a better way to display that so it gets across that this is just not being used by the drive, as compared to being a possible error?

> Since the Seagate tools can run and log the tests, I'm wondering why I
> can't from Linux.

That's interesting.  Could you please post the complete output of the
self-test log as reported by the Seagate DOS tool, and tell us about your
kernel version and build?

I'll do so in a separate message, as there's a lot of data there.

Second, smartd automatically enables SMART (equivalent of -s on) on the
device, "just in case" SMART has been disabled.

Perfect!  Thanks.