From: <pu...@fr...> - 2009-10-08 12:17:51
|
First of all, my hard disk is a Western Digital Caviar Green S-ATA, 1000 GB, 32 MB cache. I bought it a few months ago.

This week I installed smartmontools (Ubuntu PC) and ran the long and short self-tests with smartctl --> no errors:

smartctl -l selftest /dev/sdb
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error        00%              186  -
# 2  Extended offline    Completed without error        00%              164  -
# 3  Short offline       Completed without error        00%              155  -
# 4  Short offline       Completed without error        00%              155  -

But when I run smartd as a daemon to monitor this hard drive, I get an error mail at every reboot.

The error is:

SMART error (CurrentPendingSector) detected on host: xxx

This email was generated by the smartd daemon running on:

   host name: xxx
  DNS domain: [Unknown]
  NIS domain: (none)

The following warning/error was logged by the smartd daemon:

Device: /dev/sdb, 8 Currently unreadable (pending) sectors

For details see host's SYSLOG (default: /var/log/syslog).

You can also use the smartctl utility for further investigation.
No additional email messages about this problem will be sent.

In syslog I can see this:

Device /dev/sdb: using '-d sat' for ATA disk behind SAT layer.
Device: /dev/sdb, opened
Device: /dev/sdb, not found in smartd database.
Device: /dev/sdb, enabled SMART Attribute Autosave.
Device: /dev/sdb, enabled SMART Automatic Offline Testing.
Device: /dev/sdb, is SMART capable. Adding to "monitor" list.
Monitoring 0 ATA and 2 SCSI devices
Device: /dev/sdb, 8 Currently unreadable (pending) sectors
Sending warning via /usr/share/smartmontools/smartd-runner to xxxx ...
Warning via /usr/share/smartmontools/smartd-runner to xxxx: successful
smartd has fork()ed into background mode. New PID=2764.
file /var/run/smartd.pid written containing PID 2764

Finally, here is the output of smartctl -l error and smartctl -A:

sudo smartctl -l error /dev/sdb
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
Warning: ATA error count 10 inconsistent with error log pointer 1

ATA Error Count: 10 (device log contains only the most recent five errors)
  CR = Command Register [HEX]
  FR = Features Register [HEX]
  SC = Sector Count Register [HEX]
  SN = Sector Number Register [HEX]
  CL = Cylinder Low Register [HEX]
  CH = Cylinder High Register [HEX]
  DH = Device/Head Register [HEX]
  DC = Device Command Register [HEX]
  ER = Error register [HEX]
  ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 10 occurred at disk power-on lifetime: 82 hours (3 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 62 ec dd e0  Error: UNC at LBA = 0x00ddec62 = 14543970

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 62 ec dd 00 08      00:07:46.520  READ DMA
  c8 00 08 a2 ee ac 01 08      00:07:46.033  READ DMA
  ca 00 08 b2 26 e1 01 08      00:02:20.953  WRITE DMA
  ca 00 08 b2 fd f5 00 08      00:02:15.953  WRITE DMA
  ca 00 08 12 ac 85 01 08      00:02:00.972  WRITE DMA

Error 9 occurred at disk power-on lifetime: 82 hours (3 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 dd 7d 3c e1  Error: UNC at LBA = 0x013c7ddd = 20741597

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 88 aa 7d 3c 01 08      00:01:33.755  READ DMA
  c8 00 08 2a 7c 3c 01 08      00:01:33.755  READ DMA
  c8 00 20 ca 7b 3c 01 08      00:01:33.755  READ DMA
  c8 00 10 02 7b 3c 01 08      00:01:33.733  READ DMA
  c8 00 20 e2 7a 3c 01 08      00:01:33.733  READ DMA

Error 8 occurred at disk power-on lifetime: 82 hours (3 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 71 54 53 e1  Error: UNC at LBA = 0x01535471 = 22238321

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 6a 54 53 01 08      00:01:29.473  READ DMA
  c8 00 08 f2 ea 51 01 08      00:01:29.472  READ DMA
  c8 00 08 52 d0 52 01 08      00:01:29.463  READ DMA
  c8 00 08 ba ea 51 01 08      00:01:29.462  READ DMA
  c8 00 20 4a f2 80 00 08      00:01:29.449  READ DMA

Error 7 occurred at disk power-on lifetime: 82 hours (3 days + 10 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 c9 23 26 e1  Error: UNC at LBA = 0x012623c9 = 19276745

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 a0 32 23 26 01 08      00:01:19.982  READ DMA
  c8 00 38 f2 22 26 01 08      00:01:19.982  READ DMA
  c8 00 08 ea 22 26 01 08      00:01:19.982  READ DMA
  c8 00 68 ea 4b 23 01 08      00:01:19.782  READ DMA
  c8 00 20 ca ac 20 01 08      00:01:19.782  READ DMA

sudo smartctl -A /dev/sdb
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   199   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   159   157   021    Pre-fail  Always       -       7025
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       106
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       191
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       105
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       11
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       106
194 Temperature_Celsius     0x0022   127   104   000    Old_age   Always       -       23
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

Is my hard drive going to die?
How can we explain that the error is not seen with smartctl?

Can I ignore those errors?

Thanks
|
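For readers following the error log above: the LBA printed next to each "UNC" error can be recomputed from the register dump on the same line. The following is only a minimal Python sketch, assuming the usual 28-bit LBA layout for READ DMA (SN = bits 0-7, CL = bits 8-15, CH = bits 16-23, low nibble of DH = bits 24-27). The function name `lba28_from_registers` is made up for illustration and is not part of smartmontools.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    Assumed layout: SN holds bits 0-7, CL bits 8-15, CH bits 16-23,
    and the low nibble of DH holds bits 24-27. The high nibble of DH
    carries mode/device flags and is masked off.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# (SN, CL, CH, DH) copied from the "registers were" line of each logged error.
logged_errors = {
    10: (0x62, 0xEC, 0xDD, 0xE0),
    9: (0xDD, 0x7D, 0x3C, 0xE1),
    8: (0x71, 0x54, 0x53, 0xE1),
    7: (0xC9, 0x23, 0x26, 0xE1),
}

for number, registers in logged_errors.items():
    lba = lba28_from_registers(*registers)
    print(f"Error {number}: UNC at LBA = 0x{lba:08x} = {lba}")
```

The four computed values agree with the LBAs that smartctl itself reports in the log, which is a quick way to confirm which sectors the drive could not read.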
From: Peter D. <gat...@ya...> - 2009-10-08 15:37:49
|
Hi,

pu...@fr... wrote:
> First of all, my hard disk is a Western Digital Caviar Green S-ATA - 1000 Go -
> 32 Mo, I buy it a few month.
>
> Then this week I have installed smartmontools (ubuntu PC), I have run smartctl
> test long and short --> no error
[...]
> But when I daemonised smart and monitor this hard drive I have an error, sent at
> every reboot.
>
> The error is :
>
> SMART error (CurrentPendingSector) detected on host: xxx
[...]
> To finish here are the output of smartctl error and -A :
> 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 8

... which is exactly the problem that smartd reports. The successful self-test only means that at some point in time (in the 164th hour of the disk's "life") all data could be read.

> Is my hard drive going to die ?
> How can we explain that the error is not seen with smartctl ?
>
> Can I ignore those errors ?

You probably don't have another reasonable choice. Your disk certainly won't last forever, but the "pending sector" count does not mean that your disk is going to die tomorrow. "Pending" means that there have been problems with these sectors and the disk "is not yet sure" what to do with them. It could happen some day that you need the data from these sectors and can't get it, but unfortunately, this could happen with any other sector, too ...

The most comprehensive summary of SMART attributes that I know of is at:
http://en.wikipedia.org/wiki/S.M.A.R.T.#Known_ATA_S.M.A.R.T._attributes

Regards,
Peter
|
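To make the point about attribute 197 concrete: the number in the smartd mail is the RAW_VALUE column of the Current_Pending_Sector row in the `smartctl -A` table. Below is a rough Python sketch of pulling that value out of such a table. It assumes the ten-column layout shown in this thread (other smartctl versions may print it differently), and `raw_values` is a hypothetical helper, not a smartmontools API.

```python
# A few rows copied from the smartctl -A output posted in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
"""


def raw_values(attribute_table):
    """Map attribute ID -> integer RAW_VALUE for each parsable table row."""
    values = {}
    for line in attribute_table.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit() check.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[int(fields[0])] = int(fields[9])
    return values


attributes = raw_values(SAMPLE)
print("Pending sectors (197):", attributes[197])
print("Reallocated sectors (5):", attributes[5])
```

With the values from this thread, attribute 197 is non-zero while attribute 5 is still zero, i.e. the drive has flagged sectors but has not reallocated any of them yet.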
From: Bruno W. I. <br...@wo...> - 2009-10-09 21:04:36
|
On Thu, Oct 08, 2009 at 17:37:02 +0200,
Peter Daum <gat...@ya...> wrote:
>
> You probably don't have another reasonable choice. Your disk certainly won't
> last forever, but the "pending sector" count does not mean that your disk
> is going to die tomorrow. "Pending" means, that there have been problems with
> these sectors and the disk "is not yet sure" what to do with them. It could
> happen some day, that you need the data from these sectors and can't get them,
> but unfortunately, this could happen with any other sector, too ...

I don't believe that is a correct characterization of the pending counts.
I believe those are sectors that are pending reallocation once the disk
drive either gets a good read of those sectors or the sectors are rewritten.
Pending sectors can cause problems and the OP may want to find them.
|
From: Publicy <pu...@fr...> - 2009-10-10 07:28:50
|
Thanks for the answers.

But what can or must I do now?
Ignore those messages? How?

Or, better, ignore those 8 sectors but not the message/attribute itself?
Can I mark them so that I keep monitoring this attribute but no longer receive mail about the 8 sectors already marked?

Is that possible? (I suppose not.)

Bruno Wolff III wrote:
> On Thu, Oct 08, 2009 at 17:37:02 +0200,
> Peter Daum <gat...@ya...> wrote:
>
>> You probably don't have another reasonable choice. Your disk certainly won't
>> last forever, but the "pending sector" count does not mean that your disk
>> is going to die tomorrow. "Pending" means, that there have been problems with
>> these sectors and the disk "is not yet sure" what to do with them. It could
>> happen some day, that you need the data from these sectors and can't get them,
>> but unfortunately, this could happen with any other sector, too ...
>>
>
> I don't believe that is a correct characterization of the pending counts.
> I believe those are sectors that are pending reallocation once the disk
> drive either gets a good read of those sectors or the sectors are rewritten.
> Pending sectors can cause problems and the OP may want to find them.
|
From: Bruno W. I. <br...@wo...> - 2009-10-10 07:55:39
|
On Sat, Oct 10, 2009 at 09:01:54 +0200,
Publicy <pu...@fr...> wrote:
> Thanks for the answers
>
> But now what can/must I do ?

That depends on your situation and there isn't a one-size-fits-all answer.
Try starting at the badblocks howto.
http://smartmontools.sourceforge.net/badblockhowto.html

> Ignore those messages ? How ?
>
> Or better ignore those 8 sectors but not the message/attribute itself ?
> Can I mark them in order to continue to monitor these attribute but
> not to receive mail message concerning the 8 already marked ?
>
> Is it possible (I suppose no)....
|
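As a pointer for anyone working through that howto: one of its early steps, as far as I recall it, is turning the LBA of a bad sector into a filesystem block number with b = (int)((L-S)*512/B), where L is the LBA, S the partition's starting sector and B the filesystem block size in bytes. A small Python sketch of that arithmetic follows. The partition start (63) and block size (4096) used here are placeholders only, since neither was posted in this thread; the real values would have to come from the system itself (for example from `fdisk -lu` and the filesystem's own tools). The 512-byte sector size is likewise an assumption of the formula.

```python
SECTOR_SIZE = 512  # bytes per LBA sector, as assumed by the formula


def filesystem_block(lba, partition_start, block_size):
    """Return the filesystem block containing `lba`.

    lba             -- L, LBA of the unreadable sector (from the error log)
    partition_start -- S, first sector of the partition
    block_size      -- B, filesystem block size in bytes
    """
    if lba < partition_start:
        raise ValueError("LBA lies before the start of the partition")
    # Integer division gives the integer part, like the (int) cast in the formula.
    return (lba - partition_start) * SECTOR_SIZE // block_size


# LBA taken from "Error 10" in the original post; S and B are HYPOTHETICAL.
HYPOTHETICAL_PARTITION_START = 63
HYPOTHETICAL_BLOCK_SIZE = 4096

block = filesystem_block(14543970, HYPOTHETICAL_PARTITION_START, HYPOTHETICAL_BLOCK_SIZE)
print("filesystem block:", block)
```

This only locates the block. Which file (if any) owns it, and whether it is safe to rewrite the sector, still has to be worked out with the steps in the howto.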