Michael, please see if the problem persists in the latest CVS head
version, then report back here. Let's avoid RH bugzilla for a little bit
longer... so far this sounds like a smartd bug.
Cheers,
Bruce
On Sat, 25 Aug 2007, Michael Mansour wrote:
> Hi Bruce,
>
>> Hi Michael,
>>
>> This sounds to me like it might be a bug in the smartmontools CCISS
>> Linux interface or in smartd.
>>
>> To help me track this down, could you see if by using smartctl
>> (rather than smartd) you can run selftests on the other disks? If
>> you can, then the bug is probably in smartd. If you can't then the
>> bug is probably in the smartmontools CCISS Linux interface code.
>
> Yes, I can use smartctl on each of the disks:
>
> smartctl -d cciss,0 -t short /dev/cciss/c0d0
> smartctl -d cciss,1 -t short /dev/cciss/c0d0
> smartctl -d cciss,2 -t short /dev/cciss/c0d0
> smartctl -d cciss,3 -t short /dev/cciss/c0d0
> smartctl -d cciss,4 -t short /dev/cciss/c0d0
> smartctl -d cciss,5 -t short /dev/cciss/c0d0
>
> or use "long" and each disk will be checked correctly.
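
The six commands above differ only in the index after "cciss,". A minimal sketch of a loop that generates them, assuming a POSIX shell: the function name cciss_selftest_cmds and its three arguments are hypothetical, and it only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: print one "smartctl -t" command per physical disk
# behind a single CCISS controller device. Nothing is executed here.
cciss_selftest_cmds() {
    test_type=$1   # "short" or "long"
    device=$2      # controller device node, e.g. /dev/cciss/c0d0
    ndisks=$3      # number of physical disks (assumed numbered from 0)
    i=0
    while [ "$i" -lt "$ndisks" ]; do
        echo "smartctl -d cciss,$i -t $test_type $device"
        i=$((i + 1))
    done
}

# Print the six short-test commands listed above.
cciss_selftest_cmds short /dev/cciss/c0d0 6
```

To actually start the tests, the output can be piped to sh, e.g. "cciss_selftest_cmds short /dev/cciss/c0d0 6 | sh".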
>
> The smartd and smartctl provided in SL5 (RHEL5) are the vendor (Red Hat)
> release. In all the years I've used smartd/smartctl, from SL3 and SL4 to now SL5,
> I've never kept the "vendor" release because of issues with it, and have always
> installed the CVS version in /usr/local/.
>
> I had hoped that with RHEL5 I could avoid that, but I guess not.
>
> Should I upgrade to the latest CVS version and give it a go? Or maybe
> troubleshoot it with you and raise a bugzilla with Red Hat?
>
> Michael.
>
>> Cheers,
>> Bruce
>>
>> On Wed, 22 Aug 2007, Michael Mansour wrote:
>>
>>> Hi,
>>>
>>> I'm using Scientific Linux 5.0 which comes pre-packaged with:
>>>
>>> # smartctl -V
>>> smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
>>> Home page is http://smartmontools.sourceforge.net/
>>>
>>> smartctl comes with ABSOLUTELY NO WARRANTY. This
>>> is free software, and you are welcome to redistribute it
>>> under the terms of the GNU General Public License Version 2.
>>> See http://www.gnu.org for further details.
>>>
>>> CVS version IDs of files used to build this code are:
>>> Module: atacmdnames.c revision: 1.13 date: 2006/04/12
>>> uses: atacmdnames.h revision: 1.5 date: 2006/04/12
>>> Module: atacmds.c revision: 1.168 date: 2006/04/12
>>> uses: atacmds.h revision: 1.81 date: 2006/04/12
>>> uses: configure.in revision: 1.113 date: 2005/11/27
>>> uses: extern.h revision: 1.41 date: 2006/04/12
>>> uses: int64.h revision: 1.13 date: 2006/04/12
>>> uses: utility.h revision: 1.43 date: 2006/04/12
>>> Module: ataprint.c revision: 1.164 date: 2006/04/12
>>> uses: atacmdnames.h revision: 1.5 date: 2006/04/12
>>> uses: atacmds.h revision: 1.81 date: 2006/04/12
>>> uses: ataprint.h revision: 1.28 date: 2006/04/12
>>> uses: configure.in revision: 1.113 date: 2005/11/27
>>> uses: extern.h revision: 1.41 date: 2006/04/12
>>> uses: int64.h revision: 1.13 date: 2006/04/12
>>> uses: knowndrives.h revision: 1.16 date: 2006/04/05
>>> uses: smartctl.h revision: 1.23 date: 2006/04/12
>>> uses: utility.h revision: 1.43 date: 2006/04/12
>>> Module: knowndrives.c revision: 1.139 date: 2006/04/05
>>> uses: atacmds.h revision: 1.81 date: 2006/04/12
>>> uses: ataprint.h revision: 1.28 date: 2006/04/12
>>> uses: configure.in revision: 1.113 date: 2005/11/27
>>> uses: extern.h revision: 1.41 date: 2006/04/12
>>> uses: int64.h revision: 1.13 date: 2006/04/12
>>> uses: knowndrives.h revision: 1.16 date: 2006/04/05
>>> uses: utility.h revision: 1.43 date: 2006/04/12
>>> Module: os_linux.c revision: 1.82 date: 2006/04/12
>>> uses: atacmds.h revision: 1.81 date: 2006/04/12
>>> uses: configure.in revision: 1.113 date: 2005/11/27
>>> uses: int64.h revision: 1.13 date: 2006/04/12
>>> uses: os_linux.h revision: 1.24 date: 2006/04/12
>>> uses: scsicmds.h revision: 1.57 date: 2006/04/12
>>> uses: utility.h revision: 1.43 date: 2006/04/12
>>> Module: scsicmds.c revision: 1.85 date: 2006/04/12
>>> uses: configure.in revision: 1.113 date: 2005/11/27
>>> uses: extern.h revision: 1.41 date: 2006/04/12
>>> uses: int64.h revision: 1.13 date: 2006/04/12
>>> uses: scsicmds.h revision: 1.57 date: 2006/04/12
>>> uses: utility.h revision: 1.43 date: 2006/04/12
>>> Module: scsiprint.c revision: 1.107 date: 2006/04/12
>>> uses: configure.in revision: 1.113 date: 2005/11/27
>>> uses: extern.h revision: 1.41 date: 2006/04/12
>>> uses: int64.h revision: 1.13 date: 2006/04/12
>>> uses: scsicmds.h revision: 1.57 date: 2006/04/12
>>> uses: scsiprint.h revision: 1.20 date: 2006/04/12
>>> uses: smartctl.h revision: 1.23 date: 2006/04/12
>>> uses: utility.h revision: 1.43 date: 2006/04/12
>>> Module: smartctl.c revision: 1.143 date: 2006/04/12
>>> uses: atacmds.h revision: 1.81 date: 2006/04/12
>>> uses: ataprint.h revision: 1.28 date: 2006/04/12
>>> uses: configure.in revision: 1.113 date: 2005/11/27
>>> uses: extern.h revision: 1.41 date: 2006/04/12
>>> uses: int64.h revision: 1.13 date: 2006/04/12
>>> uses: knowndrives.h revision: 1.16 date: 2006/04/05
>>> uses: scsicmds.h revision: 1.57 date: 2006/04/12
>>> uses: scsiprint.h revision: 1.20 date: 2006/04/12
>>> uses: smartctl.h revision: 1.23 date: 2006/04/12
>>> uses: utility.h revision: 1.43 date: 2006/04/12
>>> Module: utility.c revision: 1.61 date: 2006/04/12
>>> uses: configure.in revision: 1.113 date: 2005/11/27
>>> uses: int64.h revision: 1.13 date: 2006/04/12
>>> uses: utility.h revision: 1.43 date: 2006/04/12
>>>
>>> smartmontools release 5.36 dated 2006/04/12 at 17:39:01 UTC
>>> smartmontools build host: i686-redhat-linux-gnu
>>> smartmontools build configured: 2007/03/27 08:24:02 UTC
>>> smartctl compile dated Mar 27 2007 at 04:24:14
>>> smartmontools configure arguments: '--build=i686-redhat-linux-gnu'
>>> '--host=i686-redhat-linux-gnu' '--target=i386-redhat-linux-gnu'
>>> '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin'
>>> '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share'
>>> '--includedir=/usr/include' '--libdir=/usr/lib' '--libexecdir=/usr/libexec'
>>> '--localstatedir=/var' '--sharedstatedir=/usr/com' '--mandir=/usr/share/man'
>>> '--infodir=/usr/share/info' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386
>>> -mtune=generic -fasynchronous-unwind-tables'
>>> 'build_alias=i686-redhat-linux-gnu' 'host_alias=i686-redhat-linux-gnu'
>>> 'target_alias=i386-redhat-linux-gnu'
>>>
>>> I have an HP ProLiant DL380 G3 server with six 146GB U320 drives.
>>>
>>> All drives are detected correctly when using smartctl.
>>>
>>> I use the following in smartd.conf to schedule the short and long tests:
>>>
>>> /dev/cciss/c0d0 -d cciss,0 -a -o on -S on -s (S/../../(1|2|3|4|5|6)/05|L/../../7/05) -m my@...
>>> /dev/cciss/c0d0 -d cciss,1 -a -o on -S on -s (S/../../(1|2|3|4|5|6)/06|L/../../7/06) -m my@...
>>> /dev/cciss/c0d0 -d cciss,2 -a -o on -S on -s (S/../../(1|2|3|4|5|6)/07|L/../../7/07) -m my@...
>>> /dev/cciss/c0d0 -d cciss,3 -a -o on -S on -s (S/../../(1|2|3|4|5|6)/08|L/../../7/08) -m my@...
>>> /dev/cciss/c0d0 -d cciss,4 -a -o on -S on -s (S/../../(1|2|3|4|5|6)/09|L/../../7/09) -m my@...
>>> /dev/cciss/c0d0 -d cciss,5 -a -o on -S on -s (S/../../(1|2|3|4|5|6)/10|L/../../7/10) -m my@...
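
For readers unfamiliar with the -s directive: smartd matches its argument, an extended regular expression, against a 12-character string of the form T/MM/DD/d/HH (test type, month, day of month, day of week with 1 = Monday through 7 = Sunday, hour). The entries above therefore ask for a short test Monday to Saturday and a long test on Sunday, at a different hour for each disk. A minimal sketch for checking such a pattern by hand, assuming a POSIX shell; the helper name which_test is hypothetical, and "grep -Ex" is only a stand-in for smartd's own matching, not its actual code path.

```shell
#!/bin/sh
# Hypothetical helper: report which test type (L or S) a smartd "-s" regex
# would select for a given month, day of month, weekday and hour.
# Usage: which_test REGEX MM DD d HH
which_test() {
    regex=$1
    stamp="$2/$3/$4/$5"
    for t in L S; do
        # -x requires the whole T/MM/DD/d/HH string to match the pattern.
        if printf '%s\n' "$t/$stamp" | grep -Eqx "$regex"; then
            echo "$t"
            return 0
        fi
    done
    return 1   # no test scheduled for this time
}

# Pattern from the cciss,0 entry; 19 Aug 2007 was a Sunday (d = 7).
which_test '(S/../../(1|2|3|4|5|6)/05|L/../../7/05)' 08 19 7 05
```

Run against the hour and weekday of interest, this shows whether an entry should have triggered a short test, a long test, or nothing at all.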
>>>
>>> What happens is that when the short tests are kicked off, only disk 5 gets the
>>> test request; none of the others actually get tested, i.e.:
>>>
>>> disk 0 tests get done on disk 5
>>> disk 1 tests get done on disk 5
>>> disk 2 tests get done on disk 5
>>> disk 3 tests get done on disk 5
>>> ... all scheduled disk tests get done on disk 5
>>>
>>> I can manually run:
>>>
>>> smartctl -d cciss,0 -t short /dev/cciss/c0d0
>>>
>>> and disk 0 gets the short test; it's only when smartd tries to kick off
>>> the test automatically that disk 5 gets tested instead.
>>>
>>> My messages log file shows:
>>>
>>> Aug 19 08:09:10 server smartd[32017]: Device: /dev/cciss/c0d0 [cciss_disk_03], starting scheduled Long Self-Test.
>>> Aug 19 09:09:10 server smartd[32017]: Device: /dev/cciss/c0d0 [cciss_disk_04], skip since Self-Test already in progress.
>>> Aug 19 10:09:10 server smartd[32017]: Device: /dev/cciss/c0d0 [cciss_disk_05], skip since Self-Test already in progress.
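
When comparing many such entries, it can help to reduce each one to the disk label smartd logged and the action it took. A minimal sketch, assuming a POSIX shell and sed, and assuming each entry sits on one line in the log file in the format quoted above; the function name summarize_smartd_log is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print
# "cciss_disk_NN: message" for every smartd line that names a CCISS disk.
summarize_smartd_log() {
    sed -n 's/.*smartd\[[0-9]*]: Device: \([^ ]*\) \[\(cciss_disk_[0-9]*\)], \(.*\)$/\2: \3/p'
}

# Demo on one of the entries quoted above.
printf '%s\n' \
  'Aug 19 08:09:10 server smartd[32017]: Device: /dev/cciss/c0d0 [cciss_disk_03], starting scheduled Long Self-Test.' \
  | summarize_smartd_log
```

On a live system the same function could be fed the messages log directly, e.g. "summarize_smartd_log < /var/log/messages" (the log path is an assumption).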
>>>
>>> and looking at the selftest logs on each disk, disk 5 shows all the tests done
>>> for all the other disks.
>>>
>>> I use this same smartd.conf scheduled config on other HP Proliant servers
>>> without issues, yet this is the only server of this type I have which gives
>>> this problem.
>>>
>>> Any ideas what the problem here could be?
>>>
>>> Thanks.
>>>
>>> Michael.
>>>
>>>
>>> -------------------------------------------------------------------------
>>> This SF.net email is sponsored by: Splunk Inc.
>>> Still grepping through log files to find problems? Stop.
>>> Now Search log events and configuration files using AJAX and a browser.
>>> Download your FREE copy of Splunk now >> http://get.splunk.com/
>>> _______________________________________________
>>> Smartmontools-support mailing list
>>> Smartmontools-support@...
>>> https://lists.sourceforge.net/lists/listinfo/smartmontools-support
>>>
>>
> ------- End of Original Message -------