From: Mark L. <mel...@gm...> - 2006-11-23 12:11:32
|
Hi,

I've got a couple of Western Digital 500 Raid Edition drives in a mirrored RAID array on a 3ware 8006-2LP SATA controller.

The drives replaced a couple of older 120gig Hitachi drives that have worked fine for years.

I'm seeing some very strange error reports from smartd though, in that the self-tests for both drives are reporting a read failure for the exact same LBA:

# smartctl -a -d 3ware,0 /dev/twe0
...
# 1 Short offline Completed: read failure 10% 779 4292870155

# smartctl -a -d 3ware,1 /dev/twe0
# 1 Short offline Completed: read failure 10% 1725 4292870155

I've already replaced a previous disk on port #0 because the controller issued an ATA Port timeout error and kicked it out of the array. Yesterday, the controller reported an ATA Port timeout error on port #1 and kicked it out of the array.

However, given that smartd is reporting the same error, at the same LBA for both drives, this seems like it's something other than the drives causing the problem?

Any ideas?

Thanks,
Mark
|
From: Sergey V. <vs...@al...> - 2006-11-24 20:59:29
|
On Thu, 23 Nov 2006 12:11:16 +0000 Mark Levitt wrote:

> I've got a couple of Western Digital 500 Raid Edition drives in a
> mirrored RAID array on a 3ware 8006-2LP SATA controller.
>
> The drives replaced a couple of older 120gig Hitachi drives that
> have worked fine for years.
>
> I'm seeing some very strange error reports from smartd though in
> that the self test for both drives are reporting a read failure for
> the exact same LBA:
>
> # smartctl -a -d 3ware,0 /dev/twe0
> ...
> # 1 Short offline Completed: read failure 10% 779 4292870155

This LBA is definitely bogus - it is 0xffe0000b in hex, and points to about 2 TiB.

> # smartctl -a -d 3ware,1 /dev/twe0
> # 1 Short offline Completed: read failure 10% 1725 4292870155
>
>
> I've already replaced a previous disk on port #0 because the
> controller issues an ATA Port timeout error and kicked it out of the
> array. Yesterday, the controller reported an ATA Port timeout error
> on port #1 and kicked it out of the array.
>
> However, given that smartd is reporting the same error, at the same
> LBA for both drives, this seems like it's something other than the
> drives causing the problem?

Looks like either something is corrupting the selftest log while smartctl gets it from the drive (e.g., some problem with passthrough commands with the 3ware controller), or the drive itself puts bogus data in its selftest log due to broken drive firmware. If you can try to attach the same drive to a simple SATA controller and get "smartctl -a" output there, this should tell which part is broken.
|
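The arithmetic behind the "bogus LBA" remark is easy to reproduce. Here is a minimal sketch, assuming 512-byte logical sectors (the thread does not state the sector size); the variable names are illustrative only.

```python
# Reported "LBA_of_first_error" value from the self-test logs in this thread.
REPORTED_LBA = 4292870155

# Assumption: 512-byte logical sectors (not stated anywhere in the thread).
SECTOR_BYTES = 512

# Hexadecimal form of the reported LBA.
lba_hex = hex(REPORTED_LBA)

# Byte offset the LBA would correspond to, and the same offset in TiB (2**40 bytes).
offset_bytes = REPORTED_LBA * SECTOR_BYTES
offset_tib = offset_bytes / 2**40

print(f"LBA {REPORTED_LBA} = {lba_hex}")
print(f"byte offset = {offset_bytes} ({offset_tib:.3f} TiB)")
```

Under that sector-size assumption the offset lands just below 2 TiB, which is far beyond the end of a 500 GB drive.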
From: Bruce A. <ba...@gr...> - 2006-11-24 21:42:01
|
>> However, given that smartd is reporting the same error, at the same
>> LBA for both drives, this seems like it's something other than the
>> drives causing the problem?
>
> Looks like either something is corrupting the selftest log while
> smartctl gets it from the drive (e.g., some problem with passthrough
> commands with the 3ware controller), or the drive itself puts bogus
> data in its selftest log due to broken drive firmware. If you can try
> to attach the same drive to a simple SATA controller and get "smartctl
> -a" output there, this should tell which part is broken.

This would indeed be extremely useful in tracking down the problem.

Cheers,
Bruce
|
From: Mark L. <mel...@gm...> - 2006-11-28 08:55:37
|
On Fri, Nov 24, 2006 at 03:41:53PM -0600, Bruce Allen wrote:
> >data in its selftest log due to broken drive firmware. If you can try
> >to attach the same drive to a simple SATA controller and get "smartctl
> >-a" output there, this should tell which part is broken.
>
> This would indeed be extremely useful in tracking down the problem.
>

Hi,

I'd like to try this, but the only SATA controllers I have are the 3ware card and an external USB to SATA enclosure.

Would Smart see the drive through a USB connection?

Thanks,
Mark
|
From: Bruce A. <ba...@gr...> - 2006-11-24 21:43:22
|
Mark: could you please provide the output of 'smartctl -V'?

Cheers,
Bruce
|
From: Mark L. <mel...@gm...> - 2006-11-24 21:52:08
Attachments:
PGP.sig
|
Hi,

OK, here it is:

[root@blackbox ~]# smartctl -V -d 3ware,0 /dev/twe0
smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

smartctl comes with ABSOLUTELY NO WARRANTY. This
is free software, and you are welcome to redistribute it
under the terms of the GNU General Public License Version 2.
See http://www.gnu.org for further details.

CVS version IDs of files used to build this code are:
Module: atacmdnames.c revision: 1.13 date: 2006/04/12
  uses: atacmdnames.h revision: 1.5 date: 2006/04/12
Module: atacmds.c revision: 1.168 date: 2006/04/12
  uses: atacmds.h revision: 1.81 date: 2006/04/12
  uses: configure.in revision: 1.113 date: 2005/11/27
  uses: extern.h revision: 1.41 date: 2006/04/12
  uses: int64.h revision: 1.13 date: 2006/04/12
  uses: utility.h revision: 1.43 date: 2006/04/12
Module: ataprint.c revision: 1.164 date: 2006/04/12
  uses: atacmdnames.h revision: 1.5 date: 2006/04/12
  uses: atacmds.h revision: 1.81 date: 2006/04/12
  uses: ataprint.h revision: 1.28 date: 2006/04/12
  uses: configure.in revision: 1.113 date: 2005/11/27
  uses: extern.h revision: 1.41 date: 2006/04/12
  uses: int64.h revision: 1.13 date: 2006/04/12
  uses: knowndrives.h revision: 1.16 date: 2006/04/05
  uses: smartctl.h revision: 1.23 date: 2006/04/12
  uses: utility.h revision: 1.43 date: 2006/04/12
Module: knowndrives.c revision: 1.139 date: 2006/04/05
  uses: atacmds.h revision: 1.81 date: 2006/04/12
  uses: ataprint.h revision: 1.28 date: 2006/04/12
  uses: configure.in revision: 1.113 date: 2005/11/27
  uses: extern.h revision: 1.41 date: 2006/04/12
  uses: int64.h revision: 1.13 date: 2006/04/12
  uses: knowndrives.h revision: 1.16 date: 2006/04/05
  uses: utility.h revision: 1.43 date: 2006/04/12
Module: os_linux.c revision: 1.82 date: 2006/04/12
  uses: atacmds.h revision: 1.81 date: 2006/04/12
  uses: configure.in revision: 1.113 date: 2005/11/27
  uses: int64.h revision: 1.13 date: 2006/04/12
  uses: os_linux.h revision: 1.24 date: 2006/04/12
  uses: scsicmds.h revision: 1.57 date: 2006/04/12
  uses: utility.h revision: 1.43 date: 2006/04/12
Module: scsicmds.c revision: 1.85 date: 2006/04/12
  uses: configure.in revision: 1.113 date: 2005/11/27
  uses: extern.h revision: 1.41 date: 2006/04/12
  uses: int64.h revision: 1.13 date: 2006/04/12
  uses: scsicmds.h revision: 1.57 date: 2006/04/12
  uses: utility.h revision: 1.43 date: 2006/04/12
Module: scsiprint.c revision: 1.107 date: 2006/04/12
  uses: configure.in revision: 1.113 date: 2005/11/27
  uses: extern.h revision: 1.41 date: 2006/04/12
  uses: int64.h revision: 1.13 date: 2006/04/12
  uses: scsicmds.h revision: 1.57 date: 2006/04/12
  uses: scsiprint.h revision: 1.20 date: 2006/04/12
  uses: smartctl.h revision: 1.23 date: 2006/04/12
  uses: utility.h revision: 1.43 date: 2006/04/12
Module: smartctl.c revision: 1.143 date: 2006/04/12
  uses: atacmds.h revision: 1.81 date: 2006/04/12
  uses: ataprint.h revision: 1.28 date: 2006/04/12
  uses: configure.in revision: 1.113 date: 2005/11/27
  uses: extern.h revision: 1.41 date: 2006/04/12
  uses: int64.h revision: 1.13 date: 2006/04/12
  uses: knowndrives.h revision: 1.16 date: 2006/04/05
  uses: scsicmds.h revision: 1.57 date: 2006/04/12
  uses: scsiprint.h revision: 1.20 date: 2006/04/12
  uses: smartctl.h revision: 1.23 date: 2006/04/12
  uses: utility.h revision: 1.43 date: 2006/04/12
Module: utility.c revision: 1.61 date: 2006/04/12
  uses: configure.in revision: 1.113 date: 2005/11/27
  uses: int64.h revision: 1.13 date: 2006/04/12
  uses: utility.h revision: 1.43 date: 2006/04/12

smartmontools release 5.36 dated 2006/04/12 at 17:39:01 UTC
smartmontools build host: i686-redhat-linux-gnu
smartmontools build configured: 2006/06/08 13:57:16 UTC
smartctl compile dated Jun 8 2006 at 09:57:24
smartmontools configure arguments: '--build=i686-redhat-linux-gnu' '--host=i686-redhat-linux-gnu' '--target=i386-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man' '--infodir=/usr/share/info' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables' 'build_alias=i686-redhat-linux-gnu' 'host_alias=i686-redhat-linux-gnu' 'target_alias=i386-redhat-linux-gnu'

Mark Levitt
mel...@gm...
AIM/Yahoo/Skype id: melevittfl
blog: http://www.marklectic.com

On 24 Nov 2006, at 21:43, Bruce Allen wrote:

> Mark: could you please provide the output of 'smartctl -V'?
>
> Cheers,
> Bruce
|
From: Bruce A. <ba...@gr...> - 2006-11-28 14:06:21
|
>>> data in its selftest log due to broken drive firmware. If you can try
>>> to attach the same drive to a simple SATA controller and get "smartctl
>>> -a" output there, this should tell which part is broken.
>> This would indeed be extremely useful in tracking down the problem.

> I'd like to try this, but the only SATA controllers I have are the 3ware card and an external USB to SATA enclosure.
>
> Would Smart see the drive through a USB connection?

No, the USB <--> SATA enclosures don't pass SMART commands through.

Another solution: for about $20 you can buy a PCMCIA SATA adaptor card for a laptop. PCI cards are about the same price:
http://www.nextag.com/pcmcia-sata/search-html

Cheers,
Bruce
|
From: Mark L. <mel...@gm...> - 2006-11-28 17:01:35
|
On Tue, Nov 28, 2006 at 08:06:10AM -0600, Bruce Allen wrote:
> No, the USB <--> SATA enclosures don't pass SMART commands through.
> Another solution: for about $20 you can buy a PCMCIA SATA adaptor card for

OK, I've bought a 2 port SATA PCI card. I'll try connecting one of the drives to it and I'll let you know what smartctl says about it.

Thanks for your help,
Mark
|
From: Bruce A. <ba...@gr...> - 2006-11-28 17:04:30
|
Thanks -- a nice 'contribution' to open-source development!

On Tue, 28 Nov 2006, Mark Levitt wrote:

> On Tue, Nov 28, 2006 at 08:06:10AM -0600, Bruce Allen wrote:
>> No, the USB <--> SATA enclosures don't pass SMART commands through.
>> Another solution: for about $20 you can buy a PCMCIA SATA adaptor card for
>
> OK, I've bought a 2 port SATA PCI card. I'll try connecting one of the drives to it and I'll let you know what smartctl says about it.
>
> Thanks for your help,
> Mark
|
From: Mark L. <mel...@gm...> - 2006-12-03 17:48:31
|
> On Tue, 28 Nov 2006, Mark Levitt wrote:
> >OK, I've bought a 2 port SATA PCI card. I'll try connecting one of the
> >drives to it and I'll let you know what smartctl says about it.

OK,

I installed a Belkin SATA PCI card and connected one of the drives to it. The card seems to be identified as "sata_sil" (Silicon Image, I guess).

It seems to show the same error, even when connected to a different SATA controller.

Here's the output of smartctl -a -d ata /dev/sdb:

smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD5000YS-01MPB0
Serial Number:    WD-WMANU1514231
Firmware Version: 07.02E07
User Capacity:    500,107,862,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun Dec 3 17:22:52 2006 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x85) Offline data collection activity
                                        was aborted by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (14400) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 179) minutes.
Conveyance self-test routine
recommended polling time:        (   6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   223   221   021    Pre-fail  Always       -       5841
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       20
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1978
 10 Spin_Retry_Count        0x0013   100   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   253   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       20
194 Temperature_Celsius     0x0022   253   253   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1962         -
# 2  Short offline       Completed without error       00%      1938         -
# 3  Short offline       Completed without error       00%      1914         -
# 4  Short offline       Completed without error       00%      1890         -
# 5  Short offline       Completed without error       00%      1866         -
# 6  Short offline       Completed without error       00%      1842         -
# 7  Short offline       Completed without error       00%      1818         -
# 8  Short offline       Completed without error       00%      1797         -
# 9  Short offline       Completed without error       00%      1773         -
#10  Short offline       Completed without error       00%      1749         -
#11  Short offline       Completed: read failure       10%      1725         4292870155
#12  Short offline       Completed without error       00%      1717         -
#13  Short offline       Completed without error       00%      1693         -
#14  Short offline       Completed without error       00%      1669         -
#15  Short offline       Completed without error       00%      1645         -
#16  Short offline       Completed without error       00%      1621         -
#17  Short offline       Completed without error       00%      1597         -
#18  Short offline       Completed without error       00%      1573         -
#19  Short offline       Completed without error       00%      1549         -
#20  Short offline       Completed without error       00%      1525         -
#21  Short offline       Completed without error       00%      1501         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD5000YS-01MPB0
Serial Number:    WD-WMANU1514231
Firmware Version: 07.02E07
User Capacity:    500,107,862,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun Dec 3 17:23:37 2006 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   223   221   021    Pre-fail  Always       -       5841
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       20
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1978
 10 Spin_Retry_Count        0x0013   100   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   253   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       20
194 Temperature_Celsius     0x0022   253   253   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0
|
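With a log this long, the one failing entry is easy to miss. The sketch below picks it out of the pasted text. It is only an illustration: the regular expression and the `parse_rows` helper are hypothetical, written for the smartctl 5.36 layout shown in this thread, and may not fit other versions or test types.

```python
import re

# A few rows copied from the self-test log in this thread.
LOG_ROWS = """\
#10 Short offline Completed without error 00% 1749 -
#11 Short offline Completed: read failure 10% 1725 4292870155
#12 Short offline Completed without error 00% 1717 -
"""

# Hypothetical pattern for the column layout shown above; not part of smartmontools.
ROW = re.compile(
    r"#\s*(?P<num>\d+)\s+(?P<test>Short offline|Extended offline)\s+"
    r"(?P<status>.+?)\s+(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\d+|-)\s*$"
)


def parse_rows(text):
    """Return one dict per recognised self-test log row."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip headers and anything the pattern does not cover
        lba = match.group("lba")
        entries.append({
            "num": int(match.group("num")),
            "status": match.group("status"),
            "remaining_pct": int(match.group("remaining")),
            "hours": int(match.group("hours")),
            "lba": None if lba == "-" else int(lba),  # "-" means no error LBA logged
        })
    return entries


entries = parse_rows(LOG_ROWS)
# Keep only the rows where the drive logged an LBA for the first error.
failures = [e for e in entries if e["lba"] is not None]
for e in failures:
    print(f"test #{e['num']} at {e['hours']} h: {e['status']}, LBA {e['lba']}")
```

Keeping the parsed rows as plain dictionaries makes it straightforward to compare the logs of both drives, for example to confirm that they report the same LBA.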
From: Bruce A. <ba...@gr...> - 2006-12-04 09:19:28
|
Mark, thanks. This is (again) showing an error 2197 GB into a 500 GB disk. Not really possible, as Sergey has pointed out.

Sergey, the actual LBA reported is 4292870155 = (hex) FFE0000B. I wonder if this is a code of some type, not referring to an actual LBA value.

Mark, could you please run an extended self-test (-t long), wait for it to complete, then send the results?

Cheers,
Bruce
|
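The "not really possible" observation can be checked against the capacity that smartctl reports for this drive. This is a rough sketch, again assuming 512-byte logical sectors, which the thread does not confirm. The split into 16-bit halves is included only to make the bit pattern visible, not to suggest what the value means.

```python
# Figures taken from the smartctl output in this thread.
USER_CAPACITY_BYTES = 500_107_862_016
REPORTED_LBA = 4292870155

# Assumption: 512-byte logical sectors (not stated in the thread).
SECTOR_BYTES = 512

# Number of addressable sectors and the highest LBA the drive could report.
total_sectors = USER_CAPACITY_BYTES // SECTOR_BYTES
last_valid_lba = total_sectors - 1

# Offset of the reported LBA in decimal gigabytes.
reported_offset_gb = REPORTED_LBA * SECTOR_BYTES / 1e9

in_range = REPORTED_LBA <= last_valid_lba
print(f"last valid LBA: {last_valid_lba}")
print(f"reported LBA sits {reported_offset_gb:.1f} GB in; within drive: {in_range}")

# The value fits in 32 bits; show its upper and lower 16-bit halves.
high16, low16 = REPORTED_LBA >> 16, REPORTED_LBA & 0xFFFF
print(f"high 16 bits: {high16:#06x}, low 16 bits: {low16:#06x}")
```

A check like this could be run over every self-test log entry to flag values that cannot be real sector addresses on the drive in question.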
From: Bruce A. <ba...@gr...> - 2006-12-05 04:22:09
|
Hi Mark, My contact at WD would like to get your drive back so that they can investigate. I'm sure they'll send you a shiny new one in exchange. Send me a note off-list if this is OK, and I'll put you in touch. Cheers, Bruce On Mon, 4 Dec 2006, Bruce Allen wrote: > Mark, thanks. This is (again) showing an error 2197 GB into a 500 GB > disk. Not really possible, as Sergey has pointed out. > > Sergey, the actual LBA reported is 4292870155 = (hex) FFE0000B. I wonder > if this is a code of some type, not refering to an actual LBA value. > > Mark, could you please run and extended self-test (-t long), wait for it > to complete, then send the results? > > Cheers, > Bruce > > > On Sun, 3 Dec 2006, Mark Levitt wrote: > >>> On Tue, 28 Nov 2006, Mark Levitt wrote: >>>> OK, I've bought a 2 port SATA PCI card. I'll try connecting one of the >>>> drives to it and I'll let you know what smartctl says about it. >> >> OK, >> >> I installed a Belkin SATA PCI card and connected one of the drives to it. The card seems to be identified as "sata_sil" (Silicon Image, I guess). >> >> It seems to show the same error, even if connected to a different SATA controller: >> >> Here's the output of smartcrl -a -d ata /dev/sdb: >> >> smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen >> Home page is http://smartmontools.sourceforge.net/ >> >> === START OF INFORMATION SECTION === >> Device Model: WDC WD5000YS-01MPB0 >> Serial Number: WD-WMANU1514231 >> Firmware Version: 07.02E07 >> User Capacity: 500,107,862,016 bytes >> Device is: Not in smartctl database [for details use: -P showall] >> ATA Version is: 7 >> ATA Standard is: Exact ATA specification draft version not indicated >> Local Time is: Sun Dec 3 17:22:52 2006 GMT >> SMART support is: Available - device has SMART capability. 
>> SMART support is: Enabled >> >> === START OF READ SMART DATA SECTION === >> SMART overall-health self-assessment test result: PASSED >> >> General SMART Values: >> Offline data collection status: (0x85) Offline data collection activity >> was aborted by an interrupting command from host. >> Auto Offline Data Collection: Enabled. >> Self-test execution status: ( 0) The previous self-test routine completed >> without error or no self-test has ever >> been run. >> Total time to complete Offline >> data collection: (14400) seconds. >> Offline data collection >> capabilities: (0x7b) SMART execute Offline immediate. >> Auto Offline data collection on/off support. >> Suspend Offline collection upon new >> command. >> Offline surface scan supported. >> Self-test supported. >> Conveyance Self-test supported. >> Selective Self-test supported. >> SMART capabilities: (0x0003) Saves SMART data before entering >> power-saving mode. >> Supports SMART auto save timer. >> Error logging capability: (0x01) Error logging supported. >> General Purpose Logging supported. >> Short self-test routine >> recommended polling time: ( 2) minutes. >> Extended self-test routine >> recommended polling time: ( 179) minutes. >> Conveyance self-test routine >> recommended polling time: ( 6) minutes. 
>> >> SMART Attributes Data Structure revision number: 16 >> Vendor Specific SMART Attributes with Thresholds: >> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE >> 1 Raw_Read_Error_Rate 0x000f 200 200 051 Pre-fail Always - 0 >> 3 Spin_Up_Time 0x0003 223 221 021 Pre-fail Always - 5841 >> 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 20 >> 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 >> 7 Seek_Error_Rate 0x000f 200 200 051 Pre-fail Always - 0 >> 9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 1978 >> 10 Spin_Retry_Count 0x0013 100 253 051 Pre-fail Always - 0 >> 11 Calibration_Retry_Count 0x0012 100 253 051 Old_age Always - 0 >> 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 20 >> 194 Temperature_Celsius 0x0022 253 253 000 Old_age Always - 38 >> 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 >> 197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 0 >> 198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 0 >> 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0 >> 200 Multi_Zone_Error_Rate 0x0009 200 200 051 Pre-fail Offline - 0 >> >> SMART Error Log Version: 1 >> No Errors Logged >> >> SMART Self-test log structure revision number 1 >> Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error >> # 1 Short offline Completed without error 00% 1962 - >> # 2 Short offline Completed without error 00% 1938 - >> # 3 Short offline Completed without error 00% 1914 - >> # 4 Short offline Completed without error 00% 1890 - >> # 5 Short offline Completed without error 00% 1866 - >> # 6 Short offline Completed without error 00% 1842 - >> # 7 Short offline Completed without error 00% 1818 - >> # 8 Short offline Completed without error 00% 1797 - >> # 9 Short offline Completed without error 00% 1773 - >> #10 Short offline Completed without error 00% 1749 - >> #11 Short offline Completed: read failure 10% 1725 4292870155 >> #12 Short offline 
Completed without error 00% 1717 - >> #13 Short offline Completed without error 00% 1693 - >> #14 Short offline Completed without error 00% 1669 - >> #15 Short offline Completed without error 00% 1645 - >> #16 Short offline Completed without error 00% 1621 - >> #17 Short offline Completed without error 00% 1597 - >> #18 Short offline Completed without error 00% 1573 - >> #19 Short offline Completed without error 00% 1549 - >> #20 Short offline Completed without error 00% 1525 - >> #21 Short offline Completed without error 00% 1501 - >> >> SMART Selective self-test log data structure revision number 1 >> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS >> 1 0 0 Not_testing >> 2 0 0 Not_testing >> 3 0 0 Not_testing >> 4 0 0 Not_testing >> 5 0 0 Not_testing >> Selective self-test flags (0x0): >> After scanning selected spans, do NOT read-scan remainder of disk. >> If Selective self-test is pending on power-up, resume after 0 minute delay. >> >> smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen >> Home page is http://smartmontools.sourceforge.net/ >> >> === START OF INFORMATION SECTION === >> Device Model: WDC WD5000YS-01MPB0 >> Serial Number: WD-WMANU1514231 >> Firmware Version: 07.02E07 >> User Capacity: 500,107,862,016 bytes >> Device is: Not in smartctl database [for details use: -P showall] >> ATA Version is: 7 >> ATA Standard is: Exact ATA specification draft version not indicated >> Local Time is: Sun Dec 3 17:23:37 2006 GMT >> SMART support is: Available - device has SMART capability. 
>> SMART support is: Enabled >> >> smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen >> Home page is http://smartmontools.sourceforge.net/ >> >> === START OF READ SMART DATA SECTION === >> SMART Attributes Data Structure revision number: 16 >> Vendor Specific SMART Attributes with Thresholds: >> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE >> 1 Raw_Read_Error_Rate 0x000f 200 200 051 Pre-fail Always - 0 >> 3 Spin_Up_Time 0x0003 223 221 021 Pre-fail Always - 5841 >> 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 20 >> 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 >> 7 Seek_Error_Rate 0x000f 200 200 051 Pre-fail Always - 0 >> 9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 1978 >> 10 Spin_Retry_Count 0x0013 100 253 051 Pre-fail Always - 0 >> 11 Calibration_Retry_Count 0x0012 100 253 051 Old_age Always - 0 >> 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 20 >> 194 Temperature_Celsius 0x0022 253 253 000 Old_age Always - 38 >> 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 >> 197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 0 >> 198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 0 >> 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0 >> 200 Multi_Zone_Error_Rate 0x0009 200 200 051 Pre-fail Offline - 0 |
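The arithmetic behind Bruce's "2197 GB into a 500 GB disk" remark can be checked with a short script. This is only a sketch: it assumes 512-byte logical sectors (not stated in the smartctl output above), and the names `describe_lba` and `SECTOR_BYTES` are made up for illustration. It confirms that the logged value is out of range, but says nothing about why the firmware logged it.

```python
# Sanity check for an LBA taken from a SMART self-test log.
# Assumption: 512-byte logical sectors; the smartctl output in this
# thread reports capacity in bytes only, so the sector size is a guess.
SECTOR_BYTES = 512


def describe_lba(lba, capacity_bytes, sector_bytes=SECTOR_BYTES):
    """Return the hex form, byte offset, and range check for an LBA."""
    total_sectors = capacity_bytes // sector_bytes
    return {
        "hex": f"0x{lba:08X}",
        "byte_offset": lba * sector_bytes,
        "total_sectors": total_sectors,
        # Valid LBAs run from 0 to total_sectors - 1.
        "in_range": 0 <= lba < total_sectors,
    }


# Values copied from the thread: LBA_of_first_error and User Capacity.
info = describe_lba(4292870155, 500_107_862_016)

print(info["hex"])                   # 0xFFE0000B
print(info["byte_offset"] / 1e9)     # about 2197.9 (decimal GB)
print(info["total_sectors"])         # 976773168
print(info["in_range"])              # False
```

With these inputs the reported LBA is more than four times larger than the highest addressable sector on the drive, which is consistent with the suggestion that the value is a code or a firmware artifact and not a real sector address.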