Thread: [smartmontools-support]UNC's cause for worry?

Hello everyone,

I strongly suspect my PowerBook's hard drive (IBM TravelStar 20GB) is 
developing problems. The simple solution is to just buy a new drive, 
but as a student, I can't afford to do that unless it's absolutely 
necessary. Incidentally, my AppleCare warranty expired three days 
before the problem started.

The main symptom is that read/write tasks sometimes stall for 5-30 
seconds, and the hard drive makes the same sequence of clicking and 
seeking noises over and over. The noises themselves are not too 
unusual, but the repetition of the same noise pattern, combined with 
the I/O delay, is suspicious (sounds like it's repeatedly trying to 
read the same block, or perhaps is recalibrating itself). None of the 
OS X disk utilities show any problems.

From running smartctl, it seems that the drive is logging a UNC 
(uncorrectable) error on most of these stall/strange noise occasions. 
So far, I haven't actually gotten any I/O error messages from the OS, 
which leads me to think that the drive is eventually able to read the 
data. Looking at the smartctl output, I have a "raw" count of 38 for 
"Current_Pending_Sector"--it just increased from 37 after one of these 
stall incidents. The trick in the FAQ for forcing a write to the bad 
sector doesn't seem applicable to OS X. Running a long selftest finds 
nothing (offline and short tests are not supported).

It also seems like errors are occurring more frequently than average. My 
drive shows a total of 439 errors over 6835 power-on hours. That's an 
average of 15.6 hours between errors. However, the last five errors 
were an average of 9.4 hours apart.
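
The arithmetic above can be reproduced with a minimal Python sketch. The totals and the per-error power-on hours are taken from the smartctl output further down; the variable names are illustrative only. Note that the 9.4-hour figure corresponds to dividing the 47-hour span by the five errors, while dividing by the four gaps between them gives 11.75 hours, so the recent rate depends on which convention is used.

```python
# Lifetime totals reported by smartctl -a (see the output below).
total_errors = 439
power_on_hours = 6835

# Power-on hours at which errors 439, 438, 437, 436 and 435 were logged.
last_five_hours = [6834, 6834, 6825, 6807, 6787]

# Mean time between errors over the whole life of the drive.
lifetime_mean = power_on_hours / total_errors

# Span covered by the five most recent errors.
span = max(last_five_hours) - min(last_five_hours)

# Two ways of averaging the recent errors:
per_error = span / len(last_five_hours)      # span divided by 5 errors
per_gap = span / (len(last_five_hours) - 1)  # span divided by 4 gaps

print(f"lifetime mean:       {lifetime_mean:.1f} h between errors")
print(f"recent span:         {span} h")
print(f"recent, per error:   {per_error:.2f} h")
print(f"recent, per gap:     {per_gap:.2f} h")
```

With only five logged errors, either recent figure is a rough estimate rather than a firm trend.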

So, is it game over for my drive? Am I living on borrowed time? Please 
be sure to email me at trbeals at berkeley followed by edu, as I'm not 
on the list. Thanks!

-Travis

Here's the output from smartctl -a disk0:

smartctl version 5.33 [powerpc-apple-darwin7.6.0] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     IBM-IC25N020ATDA04-0
Serial Number:    63A63135398
Firmware Version: DA3AA72A
User Capacity:    20,003,880,960 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   5
ATA Standard is:  ATA/ATAPI-5 T13 1321D revision 3
Local Time is:    Thu Nov 11 11:05:44 2004 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever
					been run.
Total time to complete Offline
data collection: 		 ( 645) seconds.
Offline data collection
capabilities: 			 (0x1b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					No Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					No General Purpose Logging support.
Short self-test routine
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  27) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   085   085   062    Pre-fail  Always       -       16384012
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   142   142   033    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0012   094   094   000    Old_age   Always       -       10363
  5 Reallocated_Sector_Ct   0x0033   095   095   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   085   085   000    Old_age   Always       -       6835
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   096   096   000    Old_age   Always       -       7153
191 G-Sense_Error_Rate      0x000a   098   098   000    Old_age   Always       -       262145
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       48
193 Load_Cycle_Count        0x0012   065   065   000    Old_age   Always       -       352433
194 Temperature_Celsius     0x0002   189   189   000    Old_age   Always       -       29 (Lifetime Min/Max 13/55)
196 Reallocated_Event_Count 0x0032   088   088   000    Old_age   Always       -       698
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       38
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 439 (device log contains only the most recent five errors)
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 439 occurred at disk power-on lifetime: 6834 hours (284 days + 18 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 36 4a e6 08 e0  Error: UNC 54 sectors at LBA = 0x0008e64a = 583242

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 40 40 e6 08 e0 00      00:00:06.800  READ DMA
   ca 00 00 16 4e 00 e0 00      00:00:06.800  WRITE DMA
   ca 00 05 a0 02 35 e1 00      00:00:06.800  WRITE DMA
   c8 00 35 28 36 22 e0 00      00:00:06.800  READ DMA
   ca 00 01 38 b7 60 e1 00      00:00:06.800  WRITE DMA

Error 438 occurred at disk power-on lifetime: 6834 hours (284 days + 18 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 36 4a e6 08 e0  Error: UNC 54 sectors at LBA = 0x0008e64a = 583242

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 40 40 e6 08 e0 00      00:00:02.700  READ DMA
   c8 00 03 f0 80 0d e2 00      00:00:02.600  READ DMA
   c8 00 10 d0 bd 03 e0 00      00:00:02.600  READ DMA
   c8 00 10 10 93 00 e0 00      00:00:02.500  READ DMA
   c8 00 03 00 7f 0d e2 00      00:00:02.500  READ DMA

Error 437 occurred at disk power-on lifetime: 6825 hours (284 days + 9 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 08 40 c3 54 e0  Error: UNC 8 sectors at LBA = 0x0054c340 = 5555008

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 08 40 c3 54 e0 00      00:00:02.600  READ DMA
   c8 00 08 e0 13 52 e0 00      00:00:02.600  READ DMA
   c8 00 40 40 a3 52 e0 00      00:00:02.500  READ DMA
   c8 00 08 50 ab 3b e2 00      00:00:02.500  READ DMA
   c8 00 40 00 a3 52 e0 00      00:00:02.500  READ DMA

Error 436 occurred at disk power-on lifetime: 6807 hours (283 days + 15 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 06 d1 2a 1d e0  Error: UNC 6 sectors at LBA = 0x001d2ad1 = 1911505

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 07 d0 2a 1d e0 00      00:00:49.000  READ DMA
   c8 00 08 28 99 0d e1 00      00:00:48.900  READ DMA
   c8 00 08 b8 90 15 e1 00      00:00:48.900  READ DMA
   c8 00 08 b8 2a 1d e0 00      00:00:41.200  READ DMA
   c8 00 08 28 97 0d e1 00      00:00:41.200  READ DMA

Error 435 occurred at disk power-on lifetime: 6787 hours (282 days + 19 hours)
   When the command that caused the error occurred, the device was active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 1c a4 80 01 e0  Error: UNC 28 sectors at LBA = 0x000180a4 = 98468

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   c8 00 20 a0 80 01 e0 00      00:00:45.900  READ DMA
   ef 03 22 00 00 00 a0 00      00:00:45.900  SET FEATURES [Set transfer mode]
   c8 00 20 a0 80 01 e0 00      00:00:14.900  READ DMA
   c8 00 20 30 ad 02 e0 00      00:00:14.800  READ DMA
   c8 00 20 d0 64 03 e0 00      00:00:14.600  READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      6830         -

Device does not support Selective Self Tests/Logging
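
For readers following the error log above, the LBA that smartctl prints for each error can be recomputed from the SN, CL, CH and DH registers. The following is a minimal Python sketch, assuming standard 28-bit LBA addressing (the low four bits of DH hold LBA bits 24-27). The helper name `lba28` is hypothetical and is not part of smartmontools. The sketch also checks that the failing sector lies inside the range requested by the preceding READ DMA command.

```python
def lba28(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn

# (SN, CL, CH, DH) after each logged error, copied from the log above.
# Error 438 has the same registers as error 439.
error_regs = {
    439: (0x4A, 0xE6, 0x08, 0xE0),
    437: (0x40, 0xC3, 0x54, 0xE0),
    436: (0xD1, 0x2A, 0x1D, 0xE0),
    435: (0xA4, 0x80, 0x01, 0xE0),
}

decoded = {num: lba28(*regs) for num, regs in error_regs.items()}
for num, lba in decoded.items():
    print(f"Error {num}: LBA = 0x{lba:08x} = {lba}")

# Error 439: the failing READ DMA was "c8 00 40 40 e6 08 e0",
# i.e. 0x40 = 64 sectors starting at the LBA below.
start = lba28(0x40, 0xE6, 0x08, 0xE0)
count = 0x40
offset = decoded[439] - start   # sectors read before the failure
remaining = count - offset      # sectors left, including the bad one
print(f"Error 439 failed {offset} sectors into the read, {remaining} remaining")
```

The remaining count for error 439 matches the "UNC 54 sectors" shown in the log, which suggests the drive stopped at the first unreadable sector of that request.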
