Thread: [smartmontools-support] Spurious(?) warning from smartd

I just got a warning that one of my disks is failing, and my log
contains:

> Device: /dev/sdb [SAT], 1 Offline uncorrectable sectors

smartctl, however, seems to disagree:

> smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.14.4-200.fc20.x86_64] (local build)
> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Hitachi Deskstar 7K1000.D
> Device Model:     Hitachi HDS721010DLE630
> Serial Number:    MSK5235H2PJ7TG
> LU WWN Device Id: 5 000cca 37ce5f7ce
> Firmware Version: MS2OA610
> User Capacity:    1,000,204,886,016 bytes [1.00 TB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    7200 rpm
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   ATA8-ACS T13/1699-D revision 4
> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
> Local Time is:    Mon Jun  2 13:00:14 2014 CDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> 
> General SMART Values:
> Offline data collection status:  (0x84) Offline data collection activity
>                                         was suspended by an interrupting command from host.
>                                         Auto Offline Data Collection: Enabled.
> Self-test execution status:      (   0) The previous self-test routine completed
>                                         without error or no self-test has ever 
>                                         been run.
> Total time to complete Offline 
> data collection:                ( 7458) seconds.
> Offline data collection
> capabilities:                    (0x5b) SMART execute Offline immediate.
>                                         Auto Offline data collection on/off support.
>                                         Suspend Offline collection upon new
>                                         command.
>                                         Offline surface scan supported.
>                                         Self-test supported.
>                                         No Conveyance Self-test supported.
>                                         Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
>                                         power-saving mode.
>                                         Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
>                                         General Purpose Logging supported.
> Short self-test routine 
> recommended polling time:        (   1) minutes.
> Extended self-test routine
> recommended polling time:        ( 125) minutes.
> SCT capabilities:              (0x003d) SCT Status supported.
>                                         SCT Error Recovery Control supported.
>                                         SCT Feature Control supported.
>                                         SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000b   098   098   016    Pre-fail  Always       -       262144
>   2 Throughput_Performance  0x0005   140   140   054    Pre-fail  Offline      -       75
>   3 Spin_Up_Time            0x0007   118   118   024    Pre-fail  Always       -       194 (Average 194)
>   4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       79
>   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
>   8 Seek_Time_Performance   0x0005   113   113   020    Pre-fail  Offline      -       35
>   9 Power_On_Hours          0x0012   098   098   000    Old_age   Always       -       15014
>  10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       77
> 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       132
> 193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       132
> 194 Temperature_Celsius     0x0002   200   200   000    Old_age   Always       -       30 (Min/Max 21/34)
> 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
> 
> SMART Error Log Version: 1
> ATA Error Count: 1
>         CR = Command Register [HEX]
>         FR = Features Register [HEX]
>         SC = Sector Count Register [HEX]
>         SN = Sector Number Register [HEX]
>         CL = Cylinder Low Register [HEX]
>         CH = Cylinder High Register [HEX]
>         DH = Device/Head Register [HEX]
>         DC = Device Command Register [HEX]
>         ER = Error register [HEX]
>         ST = Status register [HEX]
> Powered_Up_Time is measured from power on, and printed as
> DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
> SS=sec, and sss=millisec. It "wraps" after 49.710 days.
> 
> Error 1 occurred at disk power-on lifetime: 6249 hours (260 days + 9 hours)
>   When the command that caused the error occurred, the device was active or idle.
> 
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 20 60 4b c5 06  Error: UNC at LBA = 0x06c54b60 = 113593184
> 
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ----------------  --------------------
>   60 80 40 80 59 c5 40 00   5d+04:50:01.896  READ FPDMA QUEUED
>   60 80 38 00 59 c5 40 00   5d+04:50:01.895  READ FPDMA QUEUED
>   60 80 30 80 58 c5 40 00   5d+04:50:01.895  READ FPDMA QUEUED
>   60 80 28 00 58 c5 40 00   5d+04:50:01.895  READ FPDMA QUEUED
>   60 80 20 80 57 c5 40 00   5d+04:50:01.895  READ FPDMA QUEUED
> 
> SMART Self-test log structure revision number 1
> No self-tests have been logged.  [To run self-tests, use: smartctl -t]
> 
> 
> SMART Selective self-test log data structure revision number 1
>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>     1        0        0  Not_testing
>     2        0        0  Not_testing
>     3        0        0  Not_testing
>     4        0        0  Not_testing
>     5        0        0  Not_testing
> Selective self-test flags (0x0):
>   After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
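
For reference, the numbers in the "Error 1" entry above can be reproduced from the register dump. This is only a minimal sketch, assuming the usual 28-bit layout (low nibble of DH, then CH, CL, SN); the function names are illustrative and are not part of smartmontools.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine ATA task-file registers into a 28-bit LBA.

    Assumed layout: bits 27-24 come from the low nibble of DH,
    bits 23-16 from CH, bits 15-8 from CL, bits 7-0 from SN.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def split_power_on_hours(hours: int) -> tuple[int, int]:
    """Split a power-on lifetime in hours into (days, remaining hours)."""
    return divmod(hours, 24)


# Register values copied from the error log entry: SN=60 CL=4b CH=c5 DH=06
lba = lba28_from_registers(sn=0x60, cl=0x4B, ch=0xC5, dh=0x06)
print(f"LBA = 0x{lba:08x} = {lba}")

# Power-on lifetime at the time of the error, copied from the same entry
days, hours = split_power_on_hours(6249)
print(f"{days} days + {hours} hours")
```

With the values from the log this gives LBA 0x06c54b60 (113593184) and 260 days + 9 hours, matching what smartctl printed.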

Any idea what's going on?

Thanks!

-- 
========================================================================
Ian Pilcher                                         are...@gm...
           Sent from the cloud -- where it's already tomorrow
========================================================================
