From: Justin P. <jp...@lu...> - 2006-07-13 15:51:45
> Did your drive also pass the smartd self-assessment test, i.e., give no
> LBAs of problems?

Yes sir.

Get all the data off the drive, unmount it.

Run this for a few days:

/usr/bin/time badblocks -b 512 -s -v -w /dev/hdg

Then it should make the drive entirely **** the bed.

On Thu, 13 Jul 2006, Jim Truitt wrote:

> Hi Justin, thanks for the advice - that was my gut reaction as well.
> Unfortunately, in order to RMA the drive, it has to fail the Samsung HUTIL
> drive check. I ran all of the hard drive tests (multiple times) and it
> found no problems. Did your drive also pass the smartd self-assessment
> test, i.e., give no LBAs of problems?
>
> -Jim
>
> On 7/13/06, Justin Piszcz <jp...@lu...> wrote:
>>
>> Your drive is about to die, RMA it ASAP. I had the same thing, mine died
>> 24 hours later.
>>
>> Justin.
>>
>> On Wed, 12 Jul 2006, Jim Truitt wrote:
>>
>> > Every half hour I get a notification from smartd that my hard drive has 5
>> > offline uncorrectable sectors.
>> >
>> > $ tail /var/log/messages
>> > Jul 11 10:11:07 localhost smartd[3358]: Device: /dev/hda, 5 Offline uncorrectable sectors
>> > Jul 11 10:41:07 localhost smartd[3358]: Device: /dev/hda, 5 Offline uncorrectable sectors
>> > Jul 11 11:11:07 localhost smartd[3358]: Device: /dev/hda, 5 Offline uncorrectable sectors
>> > Jul 11 11:41:07 localhost smartd[3358]: Device: /dev/hda, 5 Offline uncorrectable sectors
>> >
>> > Fearing that the drive was failing, I ran the drive through Samsung's HUTIL
>> > surface scan utility (multiple times) and found no errors.
>> >
>> > Then I ran the smartctl offline long test, and though the test reports
>> > "PASSED", I noticed that (in addition to the Offline uncorrectable sectors)
>> > there is a reallocated event count but not a reallocated sector count.
>> >
>> > # smartctl -a /dev/hda
>> > smartctl version 5.33 [i386-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
>> > Home page is http://smartmontools.sourceforge.net/
>> >
>> > === START OF INFORMATION SECTION ===
>> > Device Model:     SAMSUNG SP1614N
>> > Serial Number:    S016J10XA43633
>> > Firmware Version: TM100-24
>> > User Capacity:    160,041,885,696 bytes
>> > Device is:        In smartctl database [for details use: -P show]
>> > ATA Version is:   7
>> > ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
>> > Local Time is:    Tue Jul 11 12:03:42 2006 EDT
>> > SMART support is: Available - device has SMART capability.
>> > SMART support is: Enabled
>> >
>> > === START OF READ SMART DATA SECTION ===
>> > SMART overall-health self-assessment test result: PASSED
>> >
>> > General SMART Values:
>> > Offline data collection status:  (0x02) Offline data collection activity
>> >                                         was completed without error.
>> >                                         Auto Offline Data Collection: Disabled.
>> > Self-test execution status:      (   0) The previous self-test routine completed
>> >                                         without error or no self-test has ever
>> >                                         been run.
>> > Total time to complete Offline
>> > data collection:                 (5760) seconds.
>> > Offline data collection
>> > capabilities:                    (0x1b) SMART execute Offline immediate.
>> >                                         Auto Offline data collection on/off support.
>> >                                         Suspend Offline collection upon new
>> >                                         command.
>> >                                         Offline surface scan supported.
>> >                                         Self-test supported.
>> >                                         No Conveyance Self-test supported.
>> >                                         No Selective Self-test supported.
>> > SMART capabilities:            (0x0003) Saves SMART data before entering
>> >                                         power-saving mode.
>> >                                         Supports SMART auto save timer.
>> > Error logging capability:        (0x01) Error logging supported.
>> >                                         No General Purpose Logging support.
>> > Short self-test routine
>> > recommended polling time:        (   1) minutes.
>> > Extended self-test routine
>> > recommended polling time:        (  96) minutes.
>> >
>> > SMART Attributes Data Structure revision number: 16
>> > Vendor Specific SMART Attributes with Thresholds:
>> > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>> >   1 Raw_Read_Error_Rate     0x000b   100   100   051    Pre-fail  Always       -       954
>> >   3 Spin_Up_Time            0x0007   067   050   000    Pre-fail  Always       -       5760
>> >   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       76
>> >   5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
>> >   7 Seek_Error_Rate         0x000b   253   253   051    Pre-fail  Always       -       0
>> >   8 Seek_Time_Performance   0x0024   083   083   000    Old_age   Offline      -       12764
>> >   9 Power_On_Half_Minutes   0x0032   098   098   000    Old_age   Always       -       11868h+16m
>> >  10 Spin_Retry_Count        0x0013   253   253   049    Pre-fail  Always       -       0
>> >  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       42
>> > 194 Temperature_Celsius     0x0022   139   109   000    Old_age   Always       -       33
>> > 195 Hardware_ECC_Recovered  0x000a   100   100   000    Old_age   Always       -       940776482
>> > 196 Reallocated_Event_Count 0x0012   098   098   000    Old_age   Always       -       5
>> > 197 Current_Pending_Sector  0x0033   253   253   010    Pre-fail  Always       -       0
>> > 198 Offline_Uncorrectable   0x0031   098   098   010    Pre-fail  Offline      -       5
>> > 199 UDMA_CRC_Error_Count    0x000b   100   100   051    Pre-fail  Always       -       0
>> > 200 Multi_Zone_Error_Rate   0x000b   100   100   051    Pre-fail  Always       -       0
>> > 201 Soft_Read_Error_Rate    0x000b   100   100   051    Pre-fail  Always       -       0
>> >
>> > SMART Error Log Version: 1
>> > No Errors Logged
>> >
>> > SMART Self-test log structure revision number 1
>> > Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
>> > # 1  Extended offline    Completed without error       00%     11580         -
>> >
>> > Device does not support Selective Self Tests/Logging
>> >
>> > So my question is what do I do to clear out the offline uncorrectable
>> > sectors, or are there actually problematic sectors?
>> > Without a failed test and LBA of the sector I'm sort of at a loss as to
>> > figuring out what's going on.
>> >
>> > Any suggestions would be appreciated!
>> >
>> > Thanks,
>> > -Jim
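
For anyone following along, here is a rough sketch (not from the thread) of how the attribute table above could be checked automatically instead of by eye. It assumes the ten-column `smartctl -a` attribute layout shown in the quoted output. The names `parse_attributes`, `nonzero_watched` and `WATCHED_IDS` are made up for illustration, and the watch list (IDs 5, 196, 197, 198) is only an assumption about which counters matter for bad sectors. The sample rows are copied from the quoted report.

```python
import re

# Attribute IDs to watch for sector trouble (an assumed, illustrative list).
WATCHED_IDS = {5, 196, 197, 198}

# One attribute row: ID, name, flag, value, worst, thresh, type, updated,
# when_failed, raw value. The flag is matched but not captured.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\S+)"
)


def parse_attributes(text):
    """Map attribute ID -> (name, raw value as a string)."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows[int(match.group(1))] = (match.group(2), match.group(9))
    return rows


def nonzero_watched(rows):
    """Return {name: raw} for watched attributes whose raw count is above zero."""
    flagged = {}
    for attr_id in sorted(WATCHED_IDS):
        if attr_id in rows:
            name, raw = rows[attr_id]
            # Raw values such as "11868h+16m" are not plain counters; skip them.
            if raw.isdigit() and int(raw) > 0:
                flagged[name] = int(raw)
    return flagged


# Rows taken from the smartctl report quoted in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   253   253   010    Pre-fail  Always       -       0
  9 Power_On_Half_Minutes   0x0032   098   098   000    Old_age   Always       -       11868h+16m
196 Reallocated_Event_Count 0x0012   098   098   000    Old_age   Always       -       5
197 Current_Pending_Sector  0x0033   253   253   010    Pre-fail  Always       -       0
198 Offline_Uncorrectable   0x0031   098   098   010    Pre-fail  Offline      -       5
"""

if __name__ == "__main__":
    print(nonzero_watched(parse_attributes(SAMPLE)))
```

On the sample rows this flags Reallocated_Event_Count and Offline_Uncorrectable, the same two counters Jim noticed. A nonzero count here is only a hint to look closer; it says nothing about which LBAs are affected.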