From: Ciro S. <c.s...@ol...> - 2008-02-11 17:45:29
|
Hi all,

I have a Latitude X1 laptop with a 1,5" disk; it is a TOSHIBA MK3006GAL. I recently experienced some lockups of 20-30 seconds: the HD light stays on and everything is very slow or unresponsive. Here is what I get in the messages log:

[ 4151.094506] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 4151.094521] ata1.00: BMDMA stat 0x25
[ 4151.094539] ata1.00: cmd c8/00:00:fb:3b:9c/00:00:00:00:00/e1 tag 0 dma 131072 in
[ 4151.094543]          res 51/40:dc:1f:3c:9c/00:00:00:00:00/e1 Emask 0x9 (media error)
[ 4151.094552] ata1.00: status: { DRDY ERR }
[ 4151.094558] ata1.00: error: { UNC }
[ 4151.100022] ata1.00: configured for UDMA/100
[ 4151.100044] ata1: EH complete

So I suspected one or more bad sectors and ran a long offline test with smartctl, with the following result:

Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error        00%            10292  -

Other previous tests (long and short) reported no errors. But:

196 Reallocated_Event_Count 0x0032   100   100   000    Old_age    Always   -   0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age    Always   -   3
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age    Offline  -   0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age    Always   -   0
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail   Always   -   0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail   Always   -   0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail   Offline  -   0

As you can see, Current_Pending_Sector is at 3. Is that normal? How can I test the disk more thoroughly?

Thanks,

Ciro.
|
From: Bruce A. <ba...@gr...> - 2008-02-11 19:35:35
|
Wow -- weird! You've done all the right things, and are right to be confused!

Do Toshiba have a 'disk test/health' utility available for their disks? I have no idea what is going on. Perhaps another subscriber to this list will know.

Cheers,
Bruce
|
From: Lennart H. <lah...@gm...> - 2008-02-11 19:55:08
|
It's normal for a disk with bad sectors. What you can do is force your disk to reallocate them; I suggest you read and follow http://smartmontools.sourceforge.net/badblockhowto.html on how to do it.

-Lennart

--
Med Venlig Hilsen / Kind Regards
Lennart Hansen
|
From: Bruce A. <ba...@gr...> - 2008-02-11 20:16:35
|
Hi Lennart,

The disk is not reporting any unreadable sectors -- the extended SMART self test below reads ALL sectors, and does not report any unreadable ones.

Bruce

On Mon, 11 Feb 2008, Lennart Hansen wrote:
> It's normal for a disk with bad sectors. What you can do is force your
> disk to reallocate them; I suggest you read and follow
> http://smartmontools.sourceforge.net/badblockhowto.html on how to do it.
>
> -Lennart
>
> On Feb 11, 2008 5:45 PM, Ciro Scognamiglio <c.s...@ol...> wrote:
>> So I suspected one or more bad sectors and ran a long offline test with
>> smartctl, with the following result:
>>
>> Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
>> # 1  Extended offline    Completed without error        00%            10292  -
>>
>> Other previous tests (long and short) reported no errors.
|
From: Ciro S. <c.s...@ol...> - 2008-02-11 20:40:30
|
On Monday 11 February 2008 21:16:28 you wrote:
> Hi Lennart,
>
> The disk is not reporting any unreadable sectors -- the extended SMART
> self test below reads ALL sectors, and does not report any unreadable
> ones.

In fact I followed the howto and cleaned up all three pending sectors. As you can see, no sector has been reallocated, so I guess that was a filesystem error, even though the kernel messages led me to suppose it was a hardware issue.

  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail   Always   -   0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age    Always   -   0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age    Always   -   0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age    Offline  -   0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age    Always   -   0
|
From: Ciro S. <c.s...@ol...> - 2008-02-11 20:57:00
|
On Monday 11 February 2008 21:40:27 you wrote:
> In fact I followed the howto and cleaned up all three pending sectors.
> As you can see, no sector has been reallocated, so I guess that was a
> filesystem error, even though the kernel messages led me to suppose it
> was a hardware issue.

OK, I will answer myself:

Current_Pending_Sector was > 0 AND the error messages referred to a hardware error, so that can't be a filesystem issue.

Feb 11 21:06:16 joey kernel: [ 9408.586661] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
Feb 11 21:06:16 joey kernel: [ 9408.586677] end_request: I/O error, dev sda, sector 27016268

After rewriting the sectors with dd, I checked the corresponding file. In fact all three sectors belonged to an AVI file written by ktorrent. I checked the integrity of the file within ktorrent and indeed there was a chunk missing. I re-downloaded it, and now everything seems to be fine.

I am running the long test again, but I have a good feeling it won't find anything. I will also look for a Toshiba tool.

BTW, is that behavior normal? Or maybe my disk self-heals? :)

C/
|
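The Bad block HOWTO linked earlier in the thread turns a failing LBA into a filesystem block number before asking the filesystem which file owns it, using b = (int)((L-S)*512/B). A minimal Python sketch of that arithmetic follows. The failing sector 27016268 comes from the kernel log above; the partition start (63) and the filesystem block size (4096) are hypothetical values for illustration only, since the thread does not give the real ones.

```python
def lba_to_fs_block(lba, partition_start, block_size, sector_size=512):
    """Map a disk LBA to a block number inside the partition's filesystem.

    Integer form of the HOWTO formula b = (int)((L - S) * 512 / B).
    """
    if lba < partition_start:
        raise ValueError("LBA lies before the start of the partition")
    return (lba - partition_start) * sector_size // block_size


if __name__ == "__main__":
    FAILING_LBA = 27016268   # from the end_request line in the kernel log
    PARTITION_START = 63     # ASSUMPTION: hypothetical, read yours from the partition table
    BLOCK_SIZE = 4096        # ASSUMPTION: hypothetical, read yours from the filesystem

    print(lba_to_fs_block(FAILING_LBA, PARTITION_START, BLOCK_SIZE))
```

On ext2/ext3 the HOWTO then passes the resulting block number to debugfs to find the owning inode and file name. With other filesystems the lookup step differs, so treat this only as the arithmetic part of the procedure.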
From: Bruno W. I. <br...@wo...> - 2008-02-11 22:39:20
|
On Mon, Feb 11, 2008 at 21:40:27 +0100, Ciro Scognamiglio <c.s...@ol...> wrote:
>
> In fact I followed the howto and cleaned up all three pending sectors.
> As you can see, no sector has been reallocated, so I guess that was a
> filesystem error, even though the kernel messages led me to suppose it
> was a hardware issue.

It was probably still a hardware error. The disk will not always reallocate sectors that it was unable to read. Sometimes they will still work when rewritten.
|
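The distinction drawn above (a pending sector can be cleared by a successful rewrite without being remapped) can be read off the two attribute listings posted in this thread. The following is a minimal Python sketch under stated assumptions: the rows are copied from Ciro's messages, the function names and the wording of the verdicts are made up for illustration, and the parser only handles rows shaped like the ones shown here.

```python
BEFORE = """\
  5 Reallocated_Sector_Ct   0x0033 100 100 050 Pre-fail Always - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age  Always - 0
197 Current_Pending_Sector  0x0032 100 100 000 Old_age  Always - 3
"""

AFTER = """\
  5 Reallocated_Sector_Ct   0x0033 100 100 050 Pre-fail Always - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age  Always - 0
197 Current_Pending_Sector  0x0032 100 100 000 Old_age  Always - 0
"""


def parse_raw_values(report):
    """Return {attribute_name: raw_value} for rows with the 10 columns shown above."""
    values = {}
    for line in report.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                values[fields[1]] = int(fields[9])
            except ValueError:
                continue  # raw value is not a plain integer; skip the row
    return values


def describe_change(before, after):
    """Hypothetical helper: summarize what happened to the pending sectors."""
    pending_before = before.get("Current_Pending_Sector", 0)
    pending_after = after.get("Current_Pending_Sector", 0)
    remapped = (after.get("Reallocated_Sector_Ct", 0)
                - before.get("Reallocated_Sector_Ct", 0))
    if remapped > 0:
        return "remapped to spare sectors"
    if pending_after < pending_before:
        return "rewritten in place, no remap"
    if pending_after > 0:
        return "still pending"
    return "no pending sectors"


if __name__ == "__main__":
    print(describe_change(parse_raw_values(BEFORE), parse_raw_values(AFTER)))
```

For the values in this thread the pending count drops from 3 to 0 while the reallocation count stays at 0, which matches the "still work when rewritten" case rather than a remap.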
From: Jim P. <ji...@jt...> - 2008-02-11 20:53:48
|
Ciro Scognamiglio wrote:
> In fact I followed the howto and cleaned up all three pending sectors.
> As you can see, no sector has been reallocated, so I guess that was a
> filesystem error, even though the kernel messages led me to suppose it
> was a hardware issue.

There is (or was) a hardware issue if the drive was returning a media error ("ata1.00: error: { UNC }"). The fact that the SMART response is giving conflicting information (a pending reallocation, but no errors, and now zero reallocations) is an even stronger indication of a confused/broken drive. I'd definitely run it through a stress test before trusting any important data to it.

-jim
|
From: Ciro S. <c.s...@ol...> - 2008-02-11 21:05:20
|
On Monday 11 February 2008 21:53:38 you wrote:
> There is (or was) a hardware issue if the drive was returning a media
> error ("ata1.00: error: { UNC }"). The fact that the SMART response
> is giving conflicting information (a pending reallocation, but no
> errors, and now zero reallocations) is an even stronger indication of
> a confused/broken drive. I'd definitely run it through a stress test
> before trusting any important data to it.
>
> -jim

Actually smartctl is reporting something in the logs:

Error 144 occurred at disk power-on lifetime: 10295 hours (428 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 4c 3c 9c e1  Error: UNC 7 sectors at LBA = 0x019c3c4c = 27016268

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 4b 3c 9c e1 00      06:06:37.055  READ DMA
  27 00 00 00 00 00 e0 00      06:06:37.055  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 02      06:06:37.047  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 02      06:06:37.046  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      06:06:37.046  READ NATIVE MAX ADDRESS EXT

How can I stress test the drive?

C.
|
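The LBA that smartctl prints in the error log can be rebuilt from the register dump, which also ties it to the sector number in the earlier kernel message. Below is a minimal Python sketch for 28-bit addressing: the low nibble of DH holds LBA bits 27..24, followed by CH, CL and SN. The function name is made up for illustration, and reading the SC value as "sectors not yet transferred" is an interpretation of this particular log, not something the thread states.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    sn = sector number (LBA 7..0), cl = cylinder low (LBA 15..8),
    ch = cylinder high (LBA 23..16), dh = device/head (low nibble = LBA 27..24).
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


if __name__ == "__main__":
    # Registers after the failed command: 40 51 07 4c 3c 9c e1
    failing = lba28_from_registers(0x4C, 0x3C, 0x9C, 0xE1)
    # The READ DMA that triggered it: c8 00 08 4b 3c 9c e1
    start = lba28_from_registers(0x4B, 0x3C, 0x9C, 0xE1)
    requested = 0x08

    print(failing)                         # same sector as in the kernel log
    print(requested - (failing - start))   # compare with "UNC 7 sectors"
```

The result equals 0x019c3c4c = 27016268, the sector named in the end_request line, so the SMART error log and the kernel are describing the same unreadable location.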