Hi All,

What would indicate that a disk is going to fail soon, for Intel 520 MLC solid state drives
behind 3ware 9750-8i RAID controllers?

I have it installed on a 1U Supermicro Intel E5-2400 server.

I can see the disks behind /dev/twl0, with device type
3ware,0 for disk 0, 3ware,1 for disk 1, and so on.

Here is some example output:

  # smartctl -A -d 3ware,0 /dev/twl0
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.5.0-37-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   000   000   000    Old_age   Always       -       264793324630432
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       7
170 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       0
171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       7
184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x000f   116   116   050    Pre-fail  Always       -       143221592
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       7
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       46099
226 Load-in_Time            0x0032   100   100   000    Old_age   Always       -       65535
227 Torq-amp_Count          0x0032   100   100   000    Old_age   Always       -       45
228 Power-off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       65535
232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       46099
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       38542
249 Unknown_Attribute       0x0013   100   100   000    Pre-fail  Always       -       698
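
A minimal sketch of how the table above could be checked mechanically, assuming the thing worth watching is how far each normalized VALUE sits above its THRESH. The function names (parse_attributes, headroom) are made up for illustration, the sample rows are copied from the output above, and whether headroom is the right predictor for these SSDs is exactly what I am unsure about.

```python
import re

# One row of the `smartctl -A` attribute table:
# ID, name, flag, VALUE, WORST, THRESH, type, updated, when_failed, raw.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

def parse_attributes(text):
    """Return one dict per attribute row found in the smartctl output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "id": int(m.group(1)),
                "name": m.group(2),
                "value": int(m.group(3)),
                "worst": int(m.group(4)),
                "thresh": int(m.group(5)),
                "type": m.group(6),
                "raw": int(m.group(9)),
            })
    return rows

def headroom(rows):
    """(id, name, VALUE - THRESH) for rows with a non-zero threshold,
    smallest headroom first. Rows with THRESH 000 are skipped because
    they can never trip a threshold."""
    out = [(r["id"], r["name"], r["value"] - r["thresh"])
           for r in rows if r["thresh"] > 0]
    return sorted(out, key=lambda t: t[2])

# Sample rows copied from the output above.
sample = """\
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   090    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x000f   116   116   050    Pre-fail  Always       -       143221592
232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
"""

rows = parse_attributes(sample)
for attr_id, name, margin in headroom(rows):
    print(attr_id, name, margin)
```

In practice the text would come from running smartctl -A -d 3ware,N /dev/twl0 for each port rather than from a pasted string.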

I do see messages like this:

smartd[25718]: Device: /dev/twl0 [3ware_disk_02], SMART Prefailure Attribute: 187 Reported_Uncorrect changed from 119 to 116
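
A rough sketch of how messages like the one above could be pulled out of syslog and turned into numbers, so that a steady decline shows up as a trend. The regular expression is written against the one message shown here; the "Usage" alternative and the name parse_change are my assumptions.

```python
import re

# Written against the smartd message shown above; "Usage" is assumed to be
# the counterpart wording for non-prefailure attributes.
CHANGE = re.compile(
    r"Device: (\S+) \[(\S+)\], SMART (Prefailure|Usage) Attribute: "
    r"(\d+) (\S+) changed from (\d+) to (\d+)"
)

def parse_change(line):
    """Return a dict describing the attribute change, or None if no match."""
    m = CHANGE.search(line)
    if m is None:
        return None
    old, new = int(m.group(6)), int(m.group(7))
    return {
        "device": m.group(1),
        "port": m.group(2),
        "kind": m.group(3),
        "attr_id": int(m.group(4)),
        "attr_name": m.group(5),
        "old": old,
        "new": new,
        "delta": new - old,  # negative means the normalized value dropped
    }

line = ("smartd[25718]: Device: /dev/twl0 [3ware_disk_02], SMART Prefailure "
        "Attribute: 187 Reported_Uncorrect changed from 119 to 116")
change = parse_change(line)
print(change["port"], change["attr_id"], change["delta"])
```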

I am also monitoring them with smartd; an excerpt from smartd.conf is below:

/dev/twl0 -d 3ware,0 -a -s L/../../6/00 -m root -M exec /usr/share/smartmontools/smartd-runner
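
Since each port needs its own line, a small sketch that generates them from the line above may be handy. The port count of 8 is an assumption based on the controller being an 8i model, and smartd_lines is a made-up helper name; the directives themselves are copied unchanged from the excerpt.

```python
# Directives copied from the smartd.conf excerpt above; only the port varies.
TEMPLATE = ("{dev} -d 3ware,{port} -a -s L/../../6/00 -m root "
            "-M exec /usr/share/smartmontools/smartd-runner")

def smartd_lines(device="/dev/twl0", ports=8):
    """Return one smartd.conf line per controller port (ports is assumed)."""
    return [TEMPLATE.format(dev=device, port=p) for p in range(ports)]

lines = smartd_lines()
print("\n".join(lines))
```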

I am monitoring the status of the disks using tw-cli, but that won't help with predicting
SSD failure. Any suggestions on what I should monitor to detect that?

Thanks for your help.

--
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?