From: Bruce A. <ba...@gr...> - 2008-09-23 06:37:09
|
Hi Jordan,

Thanks very much for doing this -- we have had many many requests for MegaRAID support over the past years!

I'd like to add you to the list of smartmontools developers, so that you can merge your code into the current CVS HEAD. Would this be OK? You'll have to send me your SourceForge username.

Smartd support should be rather easy to add (and as Christian Franke moves our code base towards a new design, it should eventually come for free!).

Cheers,
Bruce

On Mon, 22 Sep 2008, Jordan_Hargrave@Dell.com wrote:
> I've put together a first pass at adding support for monitoring MegaRAID controllers with smartctl.
>
> Usage is: smartctl -d megaraid,0 /dev/sda
> -d megaraid,N specifies looking at disk N behind the megaraid controller that owns /dev/sda.
>
> This patch should work on newer Dell PERC5/6 controllers using the megaraid_sas driver as well as
> older PERC3/4 controllers (channel 0 only) that use the megaraid_mbox driver.
>
> I also haven't put in support to smartd yet for this.
>
> Here is output from a PERC2/DC:
> ======================================================================================
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> Device: SEAGATE ST3146807LC Version: DS04
> Serial number: XXXXXXXXXX
> Device type: disk
> Transport protocol: Parallel SCSI (SPI-4)
> Local Time is: Mon Sep 22 12:00:21 2008 CDT
> Device supports SMART and is Enabled
> Temperature Warning Disabled or Not Supported
> SMART Health Status: OK
>
> Current Drive Temperature: 30 C
> Drive Trip Temperature: 68 C
> Elements in grown defect list: 0
> Vendor (Seagate) cache information
>   Blocks sent to initiator = 3594638780
>   Blocks received from initiator = 2468925686
>   Blocks read from cache and sent to initiator = 2560073511
>   Number of read and write commands whose size <= segment size = 402835819
>   Number of read and write commands whose size > segment size = 4476844
> Vendor (Seagate/Hitachi) factory information
>   number of hours powered up = 38097.77
>   number of minutes until next internal SMART test = 48
>
> Error counter log:
>            Errors Corrected by           Total   Correction     Gigabytes    Total
>                ECC          rereads/    errors   algorithm      processed    uncorrected
>            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
> read:   39651012        0         0  39651012   39651012      74395.245           0
> write:         0        0         0         0          0       6545.957           0
>
> Non-medium error count: 4570
>
> [GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
>
> SMART Self-test log
> Num  Test              Status      segment  LifeTime  LBA_first_err [SK ASC ASQ]
>      Description                   number   (hours)
> # 1  Background short  Completed         -         2              - [-   -    -]
> # 2  Background short  Completed         -         2              - [-   -    -]
>
> Long (extended) Self Test duration: 3072 seconds [51.2 minutes]
>
> Here is output from a PERC5i:
> ======================================================================================
> smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> Device: SEAGATE ST973451SS Version: SM03
> Serial number: XXXXXXXX
> Device type: disk
> Transport protocol: SAS
> Local Time is: Mon Sep 22 09:53:23 2008 EDT
> Device supports SMART and is Enabled
> Temperature Warning Disabled or Not Supported
> SMART Health Status: OK
>
> Current Drive Temperature: 28 C
> Drive Trip Temperature: 68 C
> Elements in grown defect list: 0
> Vendor (Seagate) cache information
>   Blocks sent to initiator = 368521559
>   Blocks received from initiator = 262427667
>   Blocks read from cache and sent to initiator = 3127453
>   Number of read and write commands whose size <= segment size = 8536547
>   Number of read and write commands whose size > segment size = 0
> Vendor (Seagate/Hitachi) factory information
>   number of hours powered up = 5885.32
>   number of minutes until next internal SMART test = 37
>
> Error counter log:
>            Errors Corrected by           Total   Correction     Gigabytes    Total
>                ECC          rereads/    errors   algorithm      processed    uncorrected
>            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
> read:    6716196        2         0   6716198    6716198        132.952           0
> write:         0        0         0         0          0        137.144           0
> verify:    32561        0         0     32561      32561          0.734           0
>
> Non-medium error count: 2
>
> SMART Self-test log
> Num  Test              Status      segment  LifeTime  LBA_first_err [SK ASC ASQ]
>      Description                   number   (hours)
> # 1  Background long   Completed         -         0              - [-   -    -]
> # 2  Background short  Completed         -         0              - [-   -    -]
>
> Long (extended) Self Test duration: 840 seconds [14.0 minutes]
>
>
> --jordan hargrave
> Dell Enterprise Custom Engineering
>
>
|
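
For anyone wanting to script the usage described in the quoted message, here is a rough Python sketch of how the `-d megaraid,N` invocation might be wrapped. Only `-d megaraid,N` and the `/dev/sda` device come from the message; the `-a` flag (print all SMART information), the helper names `megaraid_args`, `health_status`, and `query_disk`, and the idea of pulling out the "SMART Health Status" line are assumptions made for illustration, not part of the patch.

```python
import re
import subprocess


def megaraid_args(disk, device="/dev/sda", smartctl="smartctl"):
    """Build the smartctl argument list for disk N behind the controller."""
    if disk < 0:
        raise ValueError("disk number must be non-negative")
    # "-d megaraid,N" is the device type from the message; "-a" is an
    # assumption about which report is wanted.
    return [smartctl, "-a", "-d", f"megaraid,{disk}", device]


def health_status(report):
    """Return the text after 'SMART Health Status:', or None if absent."""
    match = re.search(r"^SMART Health Status:[ \t]*(.+?)[ \t]*$",
                      report, re.MULTILINE)
    return match.group(1) if match else None


def query_disk(disk, device="/dev/sda"):
    """Run smartctl for one disk and return its health status string."""
    result = subprocess.run(megaraid_args(disk, device),
                            capture_output=True, text=True, check=False)
    return health_status(result.stdout)
```

Calling `query_disk(0)` would run the same command as the usage line in the message (with `-a` added) and return the status text, or `None` if the report has no such line. Running it needs smartctl installed and, typically, root privileges.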