Thread: [smartmontools-support] do I have to worry?

Hello.

Last week I noticed some messages from smartd in the log
about offline uncorrectable sectors.

So I took some time, read the docs and ran some more
tests, but now I'm at a dead end and I'm not sure whether
I have to worry or not.

This happens on two identical file servers.

So, some bits about my environment:

# uname -rms
Linux 2.6.9-89.0.18.ELsmp i686

# cat /etc/redhat-release
CentOS release 4.8 (Final)

smartmontools is the version from the official CentOS packages:

# smartctl -V
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
...

smartd is enabled and started at boot time.

Each server has two 1TB SATA disks in software RAID1.

# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md0              40313848   2750104  35515868   8% /
/dev/sda1               248895     37159    198886  16% /boot
/dev/sdb3               248895     12483    223560   6% /boot2
none                   1037332         0   1037332   0% /dev/shm
/dev/md1             918801540 168135056 703994052  20% /home

I first ran:
# smartctl -t long /dev/sda
and
# smartctl -t long /dev/sdb

then, the day after:
# smartctl -s on -o on -S on /dev/sda
and
# smartctl -s on -o on -S on /dev/sdb

and after a while I also ran
# smartctl -A /dev/sda
and
# smartctl -A /dev/sdb

I'll attach the output of the above commands, but I believe there's
nothing to worry about there. What now worries me is:

# smartctl -l selftest /dev/sda
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 0
Warning: ATA Specification requires self-test log structure revision number = 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     20675         61853873

The command gives almost the same output for both drives
on both hosts, except for the LBA_of_first_error value.
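
To compare the failing LBAs across drives, something like the following could be used. It is only a rough sketch: the function names first_error_lbas and lba_to_bytes are made up for illustration, the parsing assumes the self-test log layout shown above, and the byte offset assumes 512-byte logical sectors, which I have not verified for these disks.

```shell
# Hypothetical helper: read "smartctl -l selftest" output on stdin and
# print the LBA_of_first_error (last column) of every entry whose
# status mentions "read failure".
first_error_lbas() {
    awk '/^# *[0-9]+ / && /read failure/ { print $NF }'
}

# Hypothetical helper: convert an LBA to a byte offset from the start
# of the disk. The sector size defaults to 512 bytes (an assumption).
lba_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}

# Usage on a real system (needs root):
#   smartctl -l selftest /dev/sda | first_error_lbas
#   lba_to_bytes 61853873     # prints 31669182976
```

With the value reported for sda, that puts the first unreadable sector roughly 29.5 GiB into the disk, again assuming 512-byte sectors.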

I'm still reading the docs, but in the meantime, could some
kind soul please shed some light on this?
Do I have to worry, or is all well?

Thank you.
Roberto Nunnari
