On Mo, 2008-05-19 at 07:11 -0400, Justin Piszcz wrote:
> > >>>> May 9 15:53:09 compute-1-6 smartd[3501]: Device: /dev/sda, FAILED SMART self-check. BACK UP DATA NOW!
> If a HDD is failing, sometimes you cannot read the SMART statistics.
But I cannot access any SMART data on any of the nodes, I think simply
because the kernel/smartmontools/controller/hdd combination is not
working on my distribution.
But the big question is: how could smartd then find out about the
failing harddisk? Is it actually failing, or is it just a false alarm?
Arne
> On Mon, 19 May 2008, Arne Brutschy wrote:
>
> > On Mo, 2008-05-19 at 06:51 -0400, Justin Piszcz wrote:
> >> Are the other hosts using 2.6.9-67/same configuration?
> > Yes, it's a cluster. They are identical.
> >
> >
> > Arne
> >
> >> On Mon, 19 May 2008, Arne Brutschy wrote:
> >>
> >>>
> >>> On Mo, 2008-05-19 at 05:54 -0400, Justin Piszcz wrote:
> >>>>> Redhat derivative (CentOS4) with Linux kernel 2.6.9-67
> >>>>
> >>>> Try using 2.6.25 and re-run the smartctl -a /dev/sda.
> >>> No, can't do. Sorry, but it's not possible to change the kernel version
> >>> (not because of technical issues, more like "political" problems).
> >>>
> >>> I just wondered why the smartd issued a warning on this single host
> >>> without being able to get any information from the drive. And why it's
> >>> not doing the same thing on the other hosts.
> >>>
> >>> Arne
> >>>
> >>>
> >>>
> >>>> On Mon, 19 May 2008, Arne Brutschy wrote:
> >>>>
> >>>>> Hello,
> >>>>>
> >>>>> can anyone explain the behaviour mentioned below for me? I searched the
> >>>>> FAQ and the mailing list, but could not find any answers...
> >>>>>
> >>>>> Thanks
> >>>>> Arne
> >>>>>
> >>>>>> smartd on one of our cluster nodes reported a problem with its
> >>>>>> harddisk:
> >>>>>>
> >>>>>> May 9 15:53:09 compute-1-6 smartd[3501]: Device: /dev/sda, FAILED SMART self-check. BACK UP DATA NOW!
> >>>>>>
> >>>>>> A smartctl -a /dev/sda gives me a:
> >>>>>>
> >>>>>> smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
> >>>>>> Home page is http://smartmontools.sourceforge.net/
> >>>>>>
> >>>>>> Device: ATA ST380815AS Version: 3.AA
> >>>>>> Serial number: 5QZ0C8KW
> >>>>>> Device type: disk
> >>>>>> Local Time is: Fri May 9 16:03:14 2008 CEST
> >>>>>> Device does not support SMART
> >>>>>> Request Sense failed, [Input/output error]
> >>>>>>
> >>>>>> Error Counter logging not supported
> >>>>>>
> >>>>>> Error Events logging not supported
> >>>>>>
> >>>>>> [GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
> >>>>>> Device does not support Self Test logging
> >>>>>>
> >>>>>> The harddisk does support SMART. But anyway, how does smartd think it's
> >>>>>> broken if it can't be queried at all? Is this even possible?
> >>>>>>
> >>>>>> BTW, smartctl gives me that output on all my nodes with the same
> >>>>>> configuration. But these nodes do not complain about a mysteriously
> >>>>>> broken disk...
> >>>>>>
> >>>>>> The disk is connected to an on-board Sil3114 SATA controller (no RAID) on a
> >>>>>> Tyan Thunder K8SD Pro (S2882-D). No SMART setting in BIOS. We're using a
> >>>>>> Redhat derivative (CentOS4) with Linux kernel 2.6.9-67, loaded module is
> >>>>>> sata_svw.
> >>