From: Bruce A. <ba...@gr...> - 2003-08-05 19:45:46
|
Hi Jason,

> I tested the new 3ware arguments today. They work great!

Glad to hear it!

> The machines I'm testing on are storage bricks, running RedHat 7.3,
> kernel 2.4.20 and have 2 3ware 7500-12 ide raid controllers with
> firmware 7.6. There are 24 Western Digital 200G drives attached to
> these controllers.

I did my testing on an old 6000 series card with two drives attached. I
also used the latest RH 7.3 kernel.

> I can receive smart data from all 12 drives attached to both of the
> 3ware controllers. It looked like you hadn't added support to smartd
> for this yet?

I did that this morning -- it's in CVS now -- should arrive in the CVS
pserver within a few hours.

> Guess that is next on the list. The only thing I could not get was a
> temperature reading, but I guess that is a function of smart on the
> drives themselves, though it would be nice to get that data, it's not a
> show stopper.

Yes, Western Digital has apparently stopped putting a temperature
attribute into their SMART data.

> I tested this with filesystems mounted and unmounted, seemed to work
> ok both ways. All the info seems correct, including firmware and
> serial numbers.

That's good to hear.

> What's next on the list to test?

Ummm there's a few things:

-- I'm not sure that the SMART status is being returned correctly (the
   thing reported by smartctl -H). I had to guess at the location of some
   data returned by the 3ware ioctl. It would be very nice to see a drive
   report failing SMART status at some point -- if this happens let me
   know.

-- The 3w-xxxx driver makes it impossible to pass a couple of commands to
   the drive (enable auto offline and enable attribute autosave). I think
   I can fix the driver -- but would you be willing to test it? This could
   cause major data corruption if I get it wrong, which I am likely to do.

-- Could you send details of the hardware you have tested? In particular
   the CPUs -- are they x86 or something more exotic?

-- For my curiosity, could you post the output of smartctl -a for one of
   your WD drives, as seen through the 3ware controller? Just send it as
   an email attachment.

> Thanks for putting this in!

You're welcome - I hope it works for other people too!

Cheers,
Bruce

> On Mon, 4 Aug 2003, Bruce Allen wrote:
>
> > Jason,
> >
> > I spent some time over the weekend and today and added the ATA 3ware pass
> > through into smartctl. Could you please test this? You'll need to get
> > the code from the sourceforge CVS server -- follow instructions on
> > smartmontools web page -- but wait about 24 hours to be sure that the CVS
> > server is up to date. Look on the smartctl man page and you'll find a new
> >   -d 3ware,N
> > argument.
> >
> > Cheers,
> > Bruce
> >
> > On Fri, 1 Aug 2003, Jason Holland wrote:
> >
> > >
> > > Sure, I'll give it a shot. If I can't do it, oh well, maybe I can find
> > > someone around here who can help.
> > >
> > > Jason P Holland
> > > Texas Learning and Computation Center
> > > http://www.tlc2.uh.edu
> > > University of Houston
> > > Philip G Hoffman Hall rm 207A
> > > tel: 713-743-4850
> > > cell: 281-451-5991
> > >
> > > On Fri, 1 Aug 2003, Bruce Allen wrote:
> > >
> > > > Jason,
> > > >
> > > > If you want to do this, I can dig out some code that I got from one of the
> > > > 3ware developers. It sends some ATA commands to the disks behind the
> > > > 3ware controller. Getting that to work is step 1.
> > > >
> > > > Should I dig it out and send it?
> > > >
> > > > Cheers,
> > > > Bruce
> > > >
> > > > On Fri, 1 Aug 2003, Jason Holland wrote:
> > > >
> > > > >
> > > > > Bruce,
> > > > > I work mostly with grad students, who have no time to do the work they
> > > > > need to, much less any extra stuff. The undergrads are not experienced
> > > > > with linux enough. I don't mind giving it a shot, I just need a bit of
> > > > > direction and possibly some help if I get stuck. :)
> > > > >
> > > > > Jason P Holland
> > > > > Texas Learning and Computation Center
> > > > > http://www.tlc2.uh.edu
> > > > > University of Houston
> > > > > Philip G Hoffman Hall rm 207A
> > > > > tel: 713-743-4850
> > > > > cell: 281-451-5991
> > > > >
> > > > > On Fri, 1 Aug 2003, Bruce Allen wrote:
> > > > >
> > > > > > Hi Jason,
> > > > > >
> > > > > > > I don't mind testing the code out, but it would take me a while to
> > > > > > > complete the coding needed, so I'm not sure if I'm the right person
> > > > > > > for that portion. But I would definitely be interested in testing out
> > > > > > > against my boxes, we could sure use this feature.
> > > > > >
> > > > > > OK, I'll keep you in mind for testing -- thank you for offering.
> > > > > >
> > > > > > As far as actually writing code - is there a student around who likes
> > > > > > fooling around with linux? It's not a hard job -- I just don't have the
> > > > > > time right now.
> > > > > >
> > > > > > Cheers,
> > > > > > Bruce
> > > > > >
> > > > > > >
> > > > > > > Jason P Holland
> > > > > > > Texas Learning and Computation Center
> > > > > > > http://www.tlc2.uh.edu
> > > > > > > University of Houston
> > > > > > > Philip G Hoffman Hall rm 207A
> > > > > > > tel: 713-743-4850
> > > > > > > cell: 281-451-5991
> > > > > > >
> > > > > > > On Fri, 1 Aug 2003, Bruce Allen wrote:
> > > > > > >
> > > > > > > > I apologize for not replying sooner to this email. I had hoped to reply
> > > > > > > > "I've done it" but Real Life (TM) kept intruding.
> > > > > > > >
> > > > > > > > > > I also have the same questions regarding using smartmontools
> > > > > > > > > > with IDE drives on 3ware cards running in SCSI mode. Is this
> > > > > > > > > > a feature of the latest version (5.1.14) or are there special
> > > > > > > > > > command line arguments to do this?
> > > > > > > > >
> > > > > > > > > AFAIK, no. Should be quite doable by someone with time and motivation to
> > > > > > > > > dig into the 3ware driver, utils and documentation, though. Let us know if
> > > > > > > > > you take a whack at it :)
> > > > > > > >
> > > > > > > > Erik is right. This was on my "to-do" list and still is. But I haven't
> > > > > > > > had time to do it. If anyone is interested, I have a code sample which
> > > > > > > > sends ATA commands to a disk inside a 3ware RAID controller, using an
> > > > > > > > ioctl-pass-through feature of the linux driver for the card.
> > > > > > > >
> > > > > > > > When this is implemented, we'll simply use "-d 3ware" to indicate that the
> > > > > > > > device type is a 3ware escalade controller.
> > > > > > > >
> > > > > > > > Volunteers should contact me. Only a single routine needs to be written,
> > > > > > > > with (probably) around two hundred lines of code. Most of it is
> > > > > > > > cut-and-paste, but some experimentation is needed. To do this, someone
> > > > > > > > will need root access to a 3ware RAID controller with at least one or two
> > > > > > > > disks attached, and no valuable data on the array.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Bruce
> > > > > > > >
> > > > > > > > _______________________________________________
> > > > > > > > Smartmontools-support mailing list
> > > > > > > > Sma...@li...
> > > > > > > > https://lists.sourceforge.net/lists/listinfo/smartmontools-support
|
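For readers who want to repeat Jason's test, here is a minimal sketch of how the `-d 3ware,N` argument mentioned above could be applied to every port on both controllers. It only builds and prints the command lines. The helper name and the device nodes `/dev/sda` and `/dev/sdb` are assumptions for illustration, not something stated in the thread; N is taken to be the port number on the controller.

```python
from typing import List


def smartctl_3ware_commands(devices: List[str], ports_per_controller: int,
                            action: str = "-a") -> List[List[str]]:
    """Build one smartctl argv per disk behind each 3ware controller."""
    commands = []
    for device in devices:
        for port in range(ports_per_controller):
            # "-d 3ware,N" selects the disk on port N behind the controller.
            commands.append(["smartctl", action, "-d", f"3ware,{port}", device])
    return commands


# Two 12-port controllers, as in the storage bricks described above.
# The device nodes are hypothetical; check which nodes your controllers export.
commands = smartctl_3ware_commands(["/dev/sda", "/dev/sdb"], 12)
for argv in commands:
    print(" ".join(argv))
```

Passing `action="-H"` instead would build the health-status query that Bruce asks to have checked.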
From: Bruce A. <ba...@gr...> - 2003-08-06 04:59:16
|
Hi Jason,

> > Yes, Western Digital has apparently stopped putting a temperature
> > attribute into their SMART data.
>
> Wonderful. Nice to know WD cares...

Maxtor has also started doing this with some drives -- though it may just
be a firmware bug since those drives still have Attribute 194 - it just
reports 0 raw value.

> > > I tested this with filesystems mounted and unmounted, seemed to work
> > > ok both ways. All the info seems correct, including firmware and
> > > serial numbers.
> >
> > That's good to hear.
> >
> > > What's next on the list to test?

Could you please try the short and long self-tests?
  smartctl -t short
  smartctl -t long

Also, could you please try the conveyance self test?
  smartctl -t conveyance
I have never seen a drive that implements this.

WARNING: do NOT use the -C (captive) option for any of these!

After the test is complete, you'll see the results in the self-test log.
Once again, if you could send the output of smartctl -a as an attachment I
would appreciate that.

> All my drives show PASSED, but I will keep an eye out for changes.

Please let me know -- it will be reassuring.

> > -- The 3w-xxxx driver makes it impossible to pass a couple of commands to
> > the drive (enable auto offline and enable attribute autosave). I think I
> > can fix the driver -- but would you be willing to test it? This could
> > cause major data corruption if I get it wrong, which I am likely to do.
> >
>
> Perhaps you could get help from the 3ware for this?

I asked, but they are "too busy". But I'll add something to the FAQ
explaining just what is wrong, so that people like yourself can ask them
to fix it, and tell them precisely what is wrong. Perhaps if enough people
say something...

> I don't use the 3w-xxxx stock driver in the kernel, but the one
> provided by 3ware with each firmware update. It's more up to date and
> the one they recommend people use. If you use the stock driver and
> try and get support, they complain about it and usually won't support
> you, so I had to roll my own kernel with their updated drivers.

Do they provide source code for their updated drivers? If so, please tell
me how to get it.

Cheers,
Bruce
|
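The self-test requests above could be scripted along these lines. This is a sketch under stated assumptions: the helper name, the device node `/dev/sda` and port 0 are hypothetical, and the commands are only built and printed, not executed. The one rule taken from the message is that the captive option (`-C`, long form `--captive`) is refused.

```python
from typing import List, Sequence

SELF_TESTS = ("short", "long", "conveyance")
CAPTIVE_FLAGS = ("-C", "--captive")


def self_test_command(test: str, device: str, port: int,
                      extra_args: Sequence[str] = ()) -> List[str]:
    """Build a smartctl self-test argv for one disk behind a 3ware card."""
    if test not in SELF_TESTS:
        raise ValueError(f"unknown self-test: {test!r}")
    # The message warns against captive mode, so reject it outright.
    if any(arg in CAPTIVE_FLAGS for arg in extra_args):
        raise ValueError("captive mode (-C) must not be used for these tests")
    return ["smartctl", "-t", test, "-d", f"3ware,{port}", *extra_args, device]


# Hypothetical device node and port, for illustration only.
for test in SELF_TESTS:
    print(" ".join(self_test_command(test, "/dev/sda", 0)))
```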
From: Leonid A B. <le...@ma...> - 2003-08-06 06:03:02
|
On Tue, Aug 05, 2003 at 11:59:03PM -0500, Bruce Allen wrote:

> Also, could you please try the conveyance self test?
>   smartctl -t conveyance
> I have never seen a drive that implements this

Speaking of the conveyance test, here is what I got from
the drive I keep in my DishPlayer PVR:

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD1200BB-53CAA1
Serial Number:    WD-WMA8C2081285
Firmware Version: 17.07W17
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   5
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri Jul 25 03:58:46 2003 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Off-line data collection status: (0x84) Offline data collection activity was
                                        suspended by an interrupting command from host.
                                        Auto Off-line Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete off-line
data collection:                 (4680) seconds.
Offline data collection
capabilities:                    (0x3b) SMART execute Offline immediate.
                                        Automatic timer ON/OFF support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  87) minutes.
Extended self-test routine
recommended polling time:        (   5) minutes.
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
Above is a typo or a bug in smartctl. Should have been
Conveyance self-test routine
recommended polling time:        (   5) minutes.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   090   090   021    Pre-fail  Always       -       6283
  4 Start_Stop_Count        0x0032   099   099   040    Old_age   Always       -       1757
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       2982
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0013   100   253   051    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       15
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0012   100   253   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log, version number 1
Num  Test_Description    Status       Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short off-line      Completed       00%          797         -
# 2  Short captive       Completed       00%            0         -
# 3  Short captive       Completed       00%            0         -

If there is a demand, I can open up the satellite receiver and run a conveyance
test, but it is quite a PITA.

Leo
|
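The attribute table in this output is regular enough to parse by splitting on whitespace. The sketch below is an illustration only, assuming the ten-column layout shown in the post; the function and type names are hypothetical. It also checks for Attribute 194, the temperature attribute discussed earlier in the thread, which this WD drive does not report, and treats a raw value of 0 as "not reported" to cover the Maxtor case.

```python
from typing import Dict, NamedTuple


class Attribute(NamedTuple):
    name: str
    value: int
    worst: int
    thresh: int
    raw: str


def parse_attributes(text: str) -> Dict[int, Attribute]:
    """Parse the attribute rows of `smartctl -a` output into a dict by ID."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and carry a hex FLAG column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            attrs[int(fields[0])] = Attribute(
                fields[1], int(fields[3]), int(fields[4]), int(fields[5]),
                " ".join(fields[9:]))
    return attrs


def temperature_reported(attrs: Dict[int, Attribute]) -> bool:
    """True if Attribute 194 is present with a nonzero raw value."""
    return 194 in attrs and attrs[194].raw != "0"


# Three rows copied from the output above.
SAMPLE = """\
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       2982
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       15
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
"""
attrs = parse_attributes(SAMPLE)
print(sorted(attrs))
print(temperature_reported(attrs))
```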
From: Bruce A. <ba...@gr...> - 2003-08-06 09:12:00
|
Hi Leo,

> > Also, could you please try the conveyance self test?
> >   smartctl -t conveyance
> > I have never seen a drive that implements this
>
> Speaking of the conveyance test, here is what I got from
> the drive I keep in my DishPlayer PVR:

> vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
> Above is a typo or a bug in smartctl. Should have been
> Conveyance self-test routine
> recommended polling time:        (   5) minutes.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Fixed in version 1.92 of ataprint.c on 2003/07/19 10:21:37
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/smartmontools/sm5/ataprint.c.diff?r1=1.91&r2=1.92

> If there is a demand, I can open up the satellite receiver and run a conveyance
> test, but it is quite a PITA.

I don't think that's needed. Just download the latest release (5.1-16, ten
minutes old) and try:
  smartctl -t conveyance
Wait five minutes and print the self-test log.

Cheers,
Bruce
|
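The "wait five minutes" figure comes from the recommended polling time that smartctl prints for each self-test. A small sketch of reading those values out of the output follows; it assumes the corrected labels (one "Conveyance" entry rather than a second "Extended" one), and the function name is hypothetical.

```python
import re
from typing import Dict

# Matches the two-line "<Kind> self-test routine / recommended polling time" entry.
_POLL_RE = re.compile(
    r"^(\w+) self-test routine\s+recommended polling time:\s*\(\s*(\d+)\)\s*minutes",
    re.MULTILINE)


def polling_minutes(text: str) -> Dict[str, int]:
    """Map self-test kind (lower case) to its recommended polling time in minutes."""
    return {m.group(1).lower(): int(m.group(2)) for m in _POLL_RE.finditer(text)}


# Values taken from Leo's drive, with the label fix applied.
SAMPLE = """\
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  87) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
"""
minutes = polling_minutes(SAMPLE)
print(minutes)
```

A script could sleep for `minutes["conveyance"] * 60` seconds after starting the test and then print the self-test log, for example with `smartctl -l selftest`.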
From: Bruce A. <ba...@gr...> - 2003-08-06 09:17:16
|
> > If there is a demand, I can open up the satellite receiver and run a conveyance
> > test, but it is quite a PITA.
>
> I don't think that's needed.

Oops -- I just understood -- it's no longer in your linux box. Sorry!
|
From: Bruce A. <ba...@gr...> - 2003-08-06 11:21:32
|
Hi Jason,

I've issued a new 5.1-16 release -- it has a 3ware-aware smartd. Please
give it a try and report back.

> > > -- The 3w-xxxx driver makes it impossible to pass a couple of commands to
> > > the drive (enable auto offline and enable attribute autosave). I think I
> > > can fix the driver -- but would you be willing to test it? This could
> > > cause major data corruption if I get it wrong, which I am likely to do.
> > >
> >
> > Perhaps you could get help from the 3ware for this?
>
> I asked, but they are "too busy". But I'll add something to the FAQ
> explaining just what is wrong, so that people like yourself can ask them
> to fix it, and tell them precisely what is wrong. Perhaps if enough people
> say something...

I've added something about this to the FAQ section of the smartmontools
home page. It gives a technical explanation of the problem and the fix.
Perhaps you can get the 3ware people to pay attention.

Cheers,
Bruce
|
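For trying the 3ware-aware smartd, one configuration line per disk is the likely shape of the setup. The sketch below generates such lines on the assumption that smartd.conf accepts the same `-d 3ware,N` directive as smartctl and that `-a` enables the default set of checks; confirm both against the smartd.conf man page for the release you install. The device node and helper name are hypothetical.

```python
from typing import Iterable, List


def smartd_conf_lines(device: str, ports: Iterable[int],
                      directives: str = "-a") -> List[str]:
    """Build one smartd.conf entry per disk behind a 3ware controller."""
    return [f"{device} -d 3ware,{port} {directives}" for port in ports]


# Hypothetical device node; a 12-port controller as in Jason's machines.
lines = smartd_conf_lines("/dev/sda", range(12))
print("\n".join(lines))
```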
From: Bruce A. <ba...@gr...> - 2003-10-30 21:17:48
|
> > All I can tell you for sure is:
> >
> > (1) I've done it lots of times. It's never caused trouble.
> > (2) We've never gotten an angry email from a user saying "this broke
> >     my box."
> > (3) If you are talking about a mission-critical system (eg, you
> >     will lose your job if it fails, someone will be injured, etc)
> >     then please try it first on a test system.
>
> Good 'nuff for me, just knew that it was relatively new code and was
> very scared because of my recent Deathstar problem, I've gone ahead
> and enabled this, and all is working good. :)

I'm pleased to hear that!

> > > Also, I asked previously about what the best way is to get a device
> > > into the smart database. You responded saying send in a smartctl -a
> > > with some -v options tweaked so they are right... I haven't done that
> > > because I was a little confused about how I can tweak the -v values so
> > > that I know they are right... Is there a good method for doing this?
> >
> > We'll write a FAQ about this and put it on the web page, OK?
>
> That would be great, I've got 4 or 5 drives that aren't in the
> database that I'd be happy to send the right information in to help
> fill it out.

It's on the smartmontools home page: the last item in the FAQ section.

Cheers,
Bruce
|