From: <th...@bo...> - 2004-09-16 17:57:44
|
Hi!

My Samsung SP1604N has been misbehaving lately, every now and then (not always).

When it happens, reading from the disk takes much longer than normal and the hard disk LED is almost always on while accessing. When the access ends, the HD LED goes off normally.

A long self-test of the HD did not reveal any problems.

Raw_Read_Error_Rate (+60000) and Soft_Read_Error_Rate (+250) have increased sharply during the last week. Is the HD going to die?

Attaching the current output of smartctl -a.

Thanks!

Thomas


smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG SP1604N
Serial Number:    S013J10WC92543
Firmware Version: TM100-24
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
Local Time is:    Thu Sep 16 19:55:27 2004 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity was
                                        never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (5760) seconds.
Offline data collection
capabilities:                    (0x1b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  96) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   095   095   051    Pre-fail  Always       -       77458
  3 Spin_Up_Time            0x0007   066   056   000    Pre-fail  Always       -       5824
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       104
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0024   253   253   000    Old_age   Offline      -       0
  9 Power_On_Half_Minutes   0x0032   100   100   000    Old_age   Always       -       4361h+46m
 10 Spin_Retry_Count        0x0013   253   253   049    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       57
194 Temperature_Celsius     0x0022   115   070   000    Old_age   Always       -       41
195 Hardware_ECC_Recovered  0x000a   100   100   000    Old_age   Always       -       484330054
196 Reallocated_Event_Count 0x0012   253   253   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0033   253   253   010    Pre-fail  Always       -       0
198 Offline_Uncorrectable   0x0031   253   253   010    Pre-fail  Offline      -       0
199 UDMA_CRC_Error_Count    0x000b   100   100   051    Pre-fail  Always       -       0
200 Multi_Zone_Error_Rate   0x000b   100   100   051    Pre-fail  Always       -       0
201 Soft_Read_Error_Rate    0x000b   100   100   051    Pre-fail  Always       -       321

SMART Error Log Version: 1
Warning: ATA error count 512 inconsistent with error log pointer 5

ATA Error Count: 512 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Timestamp = decimal seconds since the previous disk power-on.
Note: timestamp "wraps" after 2^32 msec = 49.710 days.

Error 512 occurred at disk power-on lifetime: 108 hours
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 51 00 00 4f c2 f0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Timestamp  Command/Feature_Name
  -- -- -- -- -- -- -- --   ---------  --------------------
  b0 da 00 00 4f c2 f0 00  304662.313  SMART RETURN STATUS
  ec 00 00 7c 9d bf f0 00  304662.313  IDENTIFY DEVICE
  ec 00 01 00 00 00 f0 00  248808.313  IDENTIFY DEVICE
  ec 00 01 00 00 00 f0 00  248808.313  IDENTIFY DEVICE
  ec 00 01 00 00 00 f0 00  248808.250  IDENTIFY DEVICE

Error 511 occurred at disk power-on lifetime: 0 hours
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 51 00 01 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Timestamp  Command/Feature_Name
  -- -- -- -- -- -- -- --   ---------  --------------------
  b1 c0 00 01 00 00 a0 00      11.875  DEVICE CONFIGURATION RESTORE
  ec 00 03 01 00 00 a0 00      11.875  IDENTIFY DEVICE
  91 00 3f 01 00 00 af 00      11.875  INITIALIZE DEVICE PARAMETERS [OBS-6]
  10 00 00 01 00 00 a0 00      11.875  RECALIBRATE [OBS-4]
  ec 00 01 01 00 00 a0 00      11.875  IDENTIFY DEVICE

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      4245         -
|
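A note on reading the attribute table above: the overall-health result depends on the normalized VALUE column staying above THRESH, not on the RAW_VALUE column, which is why the report can still say PASSED while the raw counters climb. The following is a minimal Python sketch of that comparison. The names `SmartAttribute`, `parse_attribute_row` and `margin` are made up for illustration and are not part of smartmontools; the rows are copied from the output above.

```python
from typing import NamedTuple, Optional


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int   # normalized value reported by the drive
    worst: int   # lowest normalized value seen so far
    thresh: int  # failure threshold for the normalized value
    attr_type: str
    raw: str     # vendor-specific; kept as text (e.g. "4361h+46m")


def parse_attribute_row(line: str) -> Optional[SmartAttribute]:
    """Parse one row of the attribute table; return None for other lines."""
    fields = line.split()
    if len(fields) < 10 or not fields[0].isdigit():
        return None
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        thresh=int(fields[5]),
        attr_type=fields[6],
        raw=" ".join(fields[9:]),
    )


def margin(attr: SmartAttribute) -> int:
    """How far the normalized value sits above its threshold."""
    return attr.value - attr.thresh


rows = [
    "  1 Raw_Read_Error_Rate     0x000b   095   095   051    Pre-fail  Always       -       77458",
    "  9 Power_On_Half_Minutes   0x0032   100   100   000    Old_age   Always       -       4361h+46m",
    "201 Soft_Read_Error_Rate    0x000b   100   100   051    Pre-fail  Always       -       321",
]
attrs = [a for a in map(parse_attribute_row, rows) if a is not None]
for a in attrs:
    print(f"{a.name}: value={a.value} thresh={a.thresh} margin={margin(a)} raw={a.raw}")
```

Here Raw_Read_Error_Rate is still 44 points above its threshold, so a threshold check alone would not flag this drive; the trend in the raw values has to be watched separately.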
From: Bruce A. <ba...@gr...> - 2004-09-17 05:19:00
|
Disk looks OK (at least for now).

Suggest that you update to smartmontools version 5.32, then use the smartd '-s' Directive to start doing regular scheduled long self-tests of the disk, perhaps once or twice a week.

I also suggest that if you don't do regular backups, now is an excellent time to start. A cheap solution is to buy a second disk and make backups to that.

Cheers,
	Bruce

> -------------------------------------------------------
> This SF.Net email is sponsored by: YOU BE THE JUDGE. Be one of 170
> Project Admins to receive an Apple iPod Mini FREE for your judgement on
> who ports your project to Linux PPC the best. Sponsored by IBM.
> Deadline: Sept. 24. Go here: http://sf.net/ppc_contest.php
> _______________________________________________
> Smartmontools-support mailing list
> Sma...@li...
> https://lists.sourceforge.net/lists/listinfo/smartmontools-support
|
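For readers unfamiliar with the '-s' Directive mentioned above: smartd takes a regular expression and compares it against strings of the form T/MM/DD/d/HH (test type, month, day of month, day of week with 1 = Monday through 7 = Sunday, hour). The Python sketch below only illustrates how such a pattern selects times. It is not smartd's own code; the helper names are invented here, and it assumes the pattern has to match the whole string.

```python
import re
from datetime import datetime


def schedule_string(test_type: str, when: datetime) -> str:
    """Build the T/MM/DD/d/HH string for a test type and a point in time."""
    return f"{test_type}/{when:%m}/{when:%d}/{when.isoweekday()}/{when:%H}"


def is_due(regexp: str, test_type: str, when: datetime) -> bool:
    """True if the pattern selects this test type at this hour.

    Assumption: the pattern must match the entire string.
    """
    return re.fullmatch(regexp, schedule_string(test_type, when)) is not None


# Long self-test on Wednesdays and Saturdays between 03:00 and 04:00,
# i.e. "twice a week". A matching smartd.conf entry might look like
#   /dev/hda -a -s L/../../(3|6)/03
# where the device name /dev/hda is only a placeholder.
TWICE_WEEKLY_LONG = r"L/../../(3|6)/03"

saturday = datetime(2004, 9, 18, 3, 30)
friday = datetime(2004, 9, 17, 3, 30)
print(schedule_string("L", saturday), is_due(TWICE_WEEKLY_LONG, "L", saturday))
print(schedule_string("L", friday), is_due(TWICE_WEEKLY_LONG, "L", friday))
```

Check the smartd.conf man page of the installed version for the exact rules before relying on a pattern.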
From: <th...@bo...> - 2004-09-17 05:36:22
|
Hi!

Thanks for the fast reply.

Isn't it alarming that:
- sometimes reading data takes up to 10 times longer
  (top shows no wild process running)
- Raw_Read_Error_Rate has increased from 10000 to 70000 in 1 week
- Soft_Read_Error_Rate has increased from 70 to 320 in 1 week

The drive is 7 months old and started this strange behaviour about 3 weeks ago.

I am doing regular backups, so I would not lose much if it died. But those drastically increased read times every now and then are pretty annoying.

Thanks!

Regards,
Thomas
|
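The figures quoted in this message can be turned into a simple growth rate, which is easier to compare from week to week than two absolute readings. A small Python sketch follows; the function name `growth` is made up for illustration, and the numbers are the rounded ones given in the message.

```python
from typing import Tuple


def growth(before: int, after: int, days: float) -> Tuple[int, float]:
    """Return (absolute increase, average increase per day)."""
    delta = after - before
    return delta, delta / days


raw_delta, raw_per_day = growth(10000, 70000, 7)   # Raw_Read_Error_Rate
soft_delta, soft_per_day = growth(70, 320, 7)      # Soft_Read_Error_Rate

print(f"Raw_Read_Error_Rate:  +{raw_delta} ({raw_per_day:.1f} per day)")
print(f"Soft_Read_Error_Rate: +{soft_delta} ({soft_per_day:.1f} per day)")
```

The increases of 60000 and 250 agree with the "+60000" and "+250" reported in the first message of the thread.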
From: Bruce A. <ba...@gr...> - 2004-09-17 08:33:25
|
On Fri, 17 Sep 2004, Thomas Börkel wrote:

> Hi!
>
> Thanks for the fast reply.
>
> Isn't it alarming that:
> - sometimes reading data takes up to 10 times longer
>   (top shows no wild process running)
> - Raw_Read_Error_Rate has increased from 10000 to 70000 in 1 week
> - Soft_Read_Error_Rate has increased from 70 to 320 in 1 week

Yes, this does suggest that something is failing. Definitely keep an eye on these, and start running long self-tests on a regular basis.

> The drive is 7 months old and started this strange behaviour about 3
> weeks ago.
>
> I am doing regular backups, so I would not lose much if it died.
> But those drastically increased read times every now and then
> are pretty annoying.

Is the drive temperature fairly stable?

Cheers,
	Bruce
|
From: <th...@bo...> - 2004-09-17 08:42:36
|
Hi!

Bruce Allen wrote:

>> Isn't it alarming that:
>> - sometimes reading data takes up to 10 times longer
>>   (top shows no wild process running)
>> - Raw_Read_Error_Rate has increased from 10000 to 70000 in 1 week
>> - Soft_Read_Error_Rate has increased from 70 to 320 in 1 week
>
> Yes, this does suggest that something is failing. Definitely keep an eye
> on these, and start running long self-tests on a regular basis.

Or maybe buy a new drive before it fails. ;-)

>> The drive is 7 months old and started this strange behaviour about 3
>> weeks ago.
>>
>> I am doing regular backups, so I would not lose much if it died.
>> But those drastically increased read times every now and then
>> are pretty annoying.
>
> Is the drive temperature fairly stable?

Yes. It varies between 39 and 43 degrees Celsius.

Regards,
Thomas
|
From: <th...@bo...> - 2004-09-20 12:26:14
|
Hi!

Bruce Allen wrote:

>> Isn't it alarming that:
>> - sometimes reading data takes up to 10 times longer
>>   (top shows no wild process running)
>> - Raw_Read_Error_Rate has increased from 10000 to 70000 in 1 week
>> - Soft_Read_Error_Rate has increased from 70 to 320 in 1 week
>
> Yes, this does suggest that something is failing. Definitely keep an eye
> on these, and start running long self-tests on a regular basis.

Just wanted to add that I bought a new, identical drive and tried to copy the entire disk with dd.

On the first try, it failed after 60 GB with a DMA error. Then I switched from UDMA5 to UDMA2 and chose 64 KB as the block size for dd. This worked, and I got it copied completely (it took 5 hours).

After this, smartctl listed one unrecoverable sector on the old disk.

Finally, I ran the complete test from Samsung's hutil.exe on it, and that test hung indefinitely at 73%. So I am now certain that this disk is defective and will return it to Samsung.

Thanks!

Thomas
|
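As an illustration of the block-wise copy described above, here is a rough Python sketch that reads a source in 64 KB blocks and writes them to a destination. It is not the dd command Thomas ran. The function name `copy_in_blocks` is invented, and the zero-filling of unreadable blocks (so that the copy continues and offsets stay aligned) is an addition of this sketch, not something reported in the thread. The demo runs on ordinary temporary files rather than on disks.

```python
import os
import tempfile

BLOCK_SIZE = 64 * 1024  # 64 KB, the block size mentioned in the message


def copy_in_blocks(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy src to dst block by block.

    Returns (blocks_read_ok, blocks_failed). A block that raises a read
    error is replaced by zeros so later data keeps its original offset.
    """
    ok = failed = 0
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)  # seeking to the end gives the size
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            src.seek(offset)
            try:
                data = src.read(want)
                ok += 1
            except OSError:
                data = b""
                failed += 1
            # Pad failed or short reads with zeros up to the expected length.
            dst.write(data.ljust(want, b"\x00"))
            offset += want
    return ok, failed


# Demo on regular files standing in for the old and the new disk.
with tempfile.TemporaryDirectory() as tmp:
    old_image = os.path.join(tmp, "old_disk.img")
    new_image = os.path.join(tmp, "new_disk.img")
    with open(old_image, "wb") as f:
        f.write(os.urandom(150_000))
    result = copy_in_blocks(old_image, new_image)
    with open(old_image, "rb") as a, open(new_image, "rb") as b:
        identical = a.read() == b.read()

print(result, identical)
```

On a real failing drive a dedicated rescue tool is usually a better choice than a hand-written loop; the sketch only shows why a fixed block size bounds how much data one unreadable region can take with it.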
From: Bruce A. <ba...@gr...> - 2004-09-20 13:55:27
|
Thanks for the follow-up. I'm glad you caught the problem and got your data off before the disk died completely.

Cheers,
	Bruce
|