SnapRAID / Discussion / Help: RELEASE CANDIDATE for 8.0

Andrea Mazzoleni - 2015-04-05

Hi,

I prepared a release candidate for 8.0 at: http://snapraid.sourceforge.net/rc/

The full list of changes is: https://github.com/amadvance/snapraid/blob/master/HISTORY

I'm mainly interested in comments on the new "up", "down" and "smart" commands. They spin up the disks of the array, spin them down, and print a SMART report of the array, respectively.

To have them working in Linux, you must have smartctl and hdparm already installed. In Windows, they are provided in the SnapRAID package. In both cases, to get full functionality you must run as root/Administrator.

There is also a new "test-devices" command that prints the disk mapping that SnapRAID sees, with the low-level devices used by each disk in the array.

These new commands don't make any changes to the array, so you can test them even if you are still using SnapRAID 7.1.

Ciao,
Andrea


Leifi Plomeros - 2015-04-05

Works great on the motherboard SATA ports.

D700 and the parity disks are correctly represented as different physical devices at the low level, with the correct size! :)

All other values also seem correct, including correctly identifying the system disk SSD.

C:\Snapraid>snapraid smart
SnapRAID SMART report:

   Temp  Power   Error   FP Size
      C OnDays   Count        TB  Serial           Device     Disk
 -----------------------------------------------------------------------
      -      -       -   0%    -  -                /dev/pd9   d100
     34    110       0   5%  6.0  WD-WXL1H644XT4T  /dev/pd4   d200
     33    253       0   5%  4.0  WD-WCC4E0900103  /dev/pd2   d300
      -      -       -   0%    -  -                /dev/pd7   d400
     35    260       0   5%  4.0  WD-WCC4E0876247  /dev/pd5   d500
     32    262       0   5%  4.0  WD-WCC4E0883103  /dev/pd3   d600
      -      -       -   0%    -  -                /dev/pd10  d700
      -      -       -   0%    -  -                /dev/pd8   d700
      -      -       -   0%    -  -                /dev/pd6   d800
     35    956      19  42%  2.0  S1UYJ1RZ515380   /dev/pd0   parity
      -      -       -   0%    -  -                /dev/pd11  parity
      -      -       -   0%    -  -                /dev/pd13  2-parity
      -      -       -   0%    -  -                /dev/pd12  2-parity
     42    216       0  SSD  0.5  S1DHNSAF405907K  /dev/pd1   -

The FP column is the estimated probability (in percentage) that the disk
is going to fail in the next year.

Probability that at least one disk is going to fail in the next year is 52%.

The other disks are connected to an LSI 9211-8i, which requires the smartctl parameter: -d sat

Any chance that you could allow passing that parameter? Or even try both alternatives, with and without the parameter, and only present the successful results in the table?
- Andrea Mazzoleni - 2015-04-06
  
  Hi Leifi,
  
  I think that I can add the possibility to specify a manual "-d" option that should be applied to some specific disks.
  
  But to better understand the issue, could you please try the following commands and report their output?
  
  smartctl --scan-open -d pd
  smartctl --scan-open -d ata,pd
  smartctl --scan-open -d scsi,pd
  smartctl --scan-open -d usb,pd
  
  Thanks,
  Andrea
  
  - Leifi Plomeros - 2015-04-06
    
    C:\Snapraid>smartctl --scan-open -d pd
    /dev/pd0 -d ata # /dev/pd0, ATA device
    /dev/pd1 -d ata # /dev/pd1, ATA device
    /dev/pd2 -d ata # /dev/pd2, ATA device
    /dev/pd3 -d ata # /dev/pd3, ATA device
    /dev/pd4 -d ata # /dev/pd4, ATA device
    /dev/pd5 -d ata # /dev/pd5, ATA device
    /dev/pd6 -d scsi # /dev/pd6, SCSI device
    /dev/pd7 -d scsi # /dev/pd7, SCSI device
    /dev/pd8 -d scsi # /dev/pd8, SCSI device
    /dev/pd9 -d scsi # /dev/pd9, SCSI device
    /dev/pd10 -d scsi # /dev/pd10, SCSI device
    /dev/pd11 -d scsi # /dev/pd11, SCSI device
    /dev/pd12 -d scsi # /dev/pd12, SCSI device
    /dev/pd13 -d scsi # /dev/pd13, SCSI device
    
    C:\Snapraid>smartctl --scan-open -d ata,pd
    /dev/pd0 -d ata # /dev/pd0, ATA device
    /dev/pd1 -d ata # /dev/pd1, ATA device
    /dev/pd2 -d ata # /dev/pd2, ATA device
    /dev/pd3 -d ata # /dev/pd3, ATA device
    /dev/pd4 -d ata # /dev/pd4, ATA device
    /dev/pd5 -d ata # /dev/pd5, ATA device
    
    C:\Snapraid>smartctl --scan-open -d scsi,pd
    /dev/pd6 -d scsi # /dev/pd6, SCSI device
    /dev/pd7 -d scsi # /dev/pd7, SCSI device
    /dev/pd8 -d scsi # /dev/pd8, SCSI device
    /dev/pd9 -d scsi # /dev/pd9, SCSI device
    /dev/pd10 -d scsi # /dev/pd10, SCSI device
    /dev/pd11 -d scsi # /dev/pd11, SCSI device
    /dev/pd12 -d scsi # /dev/pd12, SCSI device
    /dev/pd13 -d scsi # /dev/pd13, SCSI device
    C:\Snapraid>smartctl --scan-open -d usb,pd
    
    C:\Snapraid>smartctl -a -d sat /dev/pd6
    smartctl 6.3 2014-07-26 r3976 [i686-w64-mingw32-win7(64)-sp1] (sf-6.3-1)
    Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF INFORMATION SECTION ===
    Model Family: Western Digital Red (AF)
    Device Model: WDC WD40EFRX-68WT0N0
    ...
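
The scan output above is regular enough to parse mechanically. Here is a minimal sketch, assuming each device line has the form `<path> -d <type> # comment` as in the transcripts. The names `ScannedDevice` and `parse_scan_open` are made up for illustration and are not part of SnapRAID or smartmontools.

```python
import re
from typing import List, NamedTuple


class ScannedDevice(NamedTuple):
    path: str
    dev_type: str


# Matches lines such as "/dev/pd6 -d scsi # /dev/pd6, SCSI device".
_SCAN_LINE = re.compile(r"^(\S+)\s+-d\s+(\S+)")


def parse_scan_open(output: str) -> List[ScannedDevice]:
    """Extract (device path, device type) pairs from smartctl --scan-open output."""
    devices = []
    for line in output.splitlines():
        match = _SCAN_LINE.match(line.strip())
        if match:
            devices.append(ScannedDevice(match.group(1), match.group(2)))
    return devices


# Two lines copied from the output above.
sample = """\
/dev/pd5 -d ata # /dev/pd5, ATA device
/dev/pd6 -d scsi # /dev/pd6, SCSI device
"""
devices = parse_scan_open(sample)
scsi_only = [d.path for d in devices if d.dev_type == "scsi"]
print(scsi_only)
```

In this setup the devices reported as `scsi` are the ones behind the LSI controller, so they are the candidates for a `-d sat` retry.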
    
    - Andrea Mazzoleni - 2015-04-07
      
      Hi Leifi,
      
      Please run one more test and report the full output of these two commands. Note that the second one prints the exit code of the first one, so you need to run it immediately afterwards.
      
      smartctl -a /dev/pd6 -r ioctl
      echo %errorlevel%
      
      Anyway, I'm implementing an automatic retry with "-d sat" that should work most of the time.
      
      Thanks,
      Andrea
      
      - Leifi Plomeros - 2015-04-07
        
        Hi,
        
        That would have been too easy... :/
        
        C:\Snapraid>smartctl -a /dev/pd6 -r ioctl
        smartctl 6.3 2014-07-26 r3976 [i686-w64-mingw32-win7(64)-sp1] (sf-6.3-1)
        Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
        
        [inquiry: 12 01 00 00 fc 00 ]
        [inquiry: 12 00 00 00 24 00 ]
        
        Probable ATA device behind a SAT layer
        Try an additional '-d ata' or '-d sat' argument.
        
        C:\Snapraid>echo %errorlevel%
        0
        
        
        Andrea Mazzoleni - 2015-04-07
        
        Hi Leifi,
        
        Please redownload and retry now. It should work.
        
        Now, if smartctl exits with code 0 or 2 and no info at all is present, the command is automatically retried with "-d sat".
        
        Thanks,
        Andrea
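
The retry rule described above can be sketched roughly as follows. This is not SnapRAID's actual code: the check for "no info at all" is an assumed heuristic (looking for the information section header), and `read_smart`, `has_smart_info` and the `run` parameter are hypothetical names. The runner is injectable so the sketch can be tried without smartctl installed.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

Runner = Callable[[Sequence[str]], Tuple[int, str]]


def run_smartctl(args: Sequence[str]) -> Tuple[int, str]:
    """Run the real smartctl binary and return (exit code, stdout)."""
    proc = subprocess.run(["smartctl", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout


def has_smart_info(output: str) -> bool:
    # Assumption: a usable report contains the information section header.
    return "START OF INFORMATION SECTION" in output


def read_smart(device: str, run: Runner = run_smartctl) -> Tuple[int, str]:
    """Query a device, retrying once with '-d sat' when nothing useful came back."""
    code, out = run(["-a", device])
    if code in (0, 2) and not has_smart_info(out):
        code, out = run(["-a", "-d", "sat", device])
    return code, out


# Demo with a stand-in runner that mimics /dev/pd6 from this thread,
# so the sketch runs without smartctl installed.
calls: List[List[str]] = []


def fake_run(args: Sequence[str]) -> Tuple[int, str]:
    calls.append(list(args))
    if "sat" in args:
        return 0, "=== START OF INFORMATION SECTION ===\nDevice Model: WDC WD40EFRX-68WT0N0"
    return 0, "Probable ATA device behind a SAT layer"


code, report = read_smart("/dev/pd6", run=fake_run)
print(code, len(calls))
```

Checking the output as well as the exit code matters here, because the plain command on /dev/pd6 exited with 0 even though it printed no SMART data.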
        
        
        Leifi Plomeros - 2015-04-08
        
        It works!
        
           Temp  Power   Error   FP Size
              C OnDays   Count        TB  Serial           Device     Disk
         -----------------------------------------------------------------------
             38    321       0  58%  4.0  WD-WCC4E0266136  /dev/pd9   d100
             35    113       0   5%  6.0  WD-WXL1H644XT4T  /dev/pd2   d200
             37    256       0   5%  4.0  WD-WCC4E0900103  /dev/pd4   d300
             37    160       0   5%  4.0  WD-WCC4EE7P2ZC0  /dev/pd7   d400
             36    263       0   5%  4.0  WD-WCC4E0876247  /dev/pd3   d500
             34    266       0   5%  4.0  WD-WCC4E0883103  /dev/pd5   d600
             46    615       0   5%  2.0  MN1270FA0WSL1D   /dev/pd10  d700
             35    972      21   5%  2.0  S1UYJ1RZ515272   /dev/pd8   d700
             37    261       0   5%  4.0  WD-WCC4E0871186  /dev/pd6   d800
             35    959      19   5%  2.0  S1UYJ1RZ515380   /dev/pd0   parity
             36    357       0  n/k  2.0  WD-WCC1T0573778  /dev/pd11  parity
             42    941   12017   5%  2.0  ML2220F31351EE   /dev/pd13  2-parity
             39    944       0   5%  2.0  ML2220F30YL2SE   /dev/pd12  2-parity
             44    220       0  SSD  0.5  S1DHNSAF405907K  /dev/pd1   -
        
        The FP column is the estimated probability (in percentage) that the disk
        is going to fail in the next year.
        
        Probability that at least one disk is going to fail in the next year is 75%.
        
        Thank you!
        
        Looks like it may be fan filter cleaning time... :)
        
        
        Leifi Plomeros - 2015-04-08
        
        Is it possible to use similar logic for up and down?
        Failing to spin down with "-d sat" returns an error code.
        Failing to spin down without "-d sat" does not return an error code.
        Success always returns the text "Device placed in STANDBY mode".

        In the examples below, /dev/pd1 is connected to motherboard SATA and
        /dev/pd6 is connected to the LSI 9211-8i.
        
        C:\Snapraid>smartctl -d sat -s standby,now /dev/pd1
        Read Device Identity failed: IOCTL_SCSI_PASS_THROUGH_DIRECT failed, Error=1
        A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
        C:\Snapraid>echo %errorlevel%
        2
        
        C:\Snapraid>smartctl -s standby,now /dev/pd1
        Device placed in STANDBY mode
        C:\Snapraid>echo %errorlevel%
        0
        
        C:\Snapraid>smartctl -s standby,now /dev/pd6
        Probable ATA device behind a SAT layer
        Try an additional '-d ata' or '-d sat' argument.
        C:\Snapraid>echo %errorlevel%
        0
        
        C:\Snapraid>smartctl -d sat -s standby,now /dev/pd6
        Device placed in STANDBY mode
        C:\Snapraid>echo %errorlevel%
        0
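
Given the four transcripts above, one way to decide whether a spin-down worked is to look for the success text instead of trusting the exit code. The sketch below is an illustration only: `spin_down` and `fake_run` are hypothetical names, and the stand-in runner simply replays the observed behaviour of /dev/pd1 and /dev/pd6. In real use `run` would wrap a call to smartctl.

```python
from typing import Callable, List, Optional, Sequence, Tuple

STANDBY_MARKER = "Device placed in STANDBY mode"
Runner = Callable[[Sequence[str]], Tuple[int, str]]


def spin_down(device: str, run: Runner) -> Optional[List[str]]:
    """Return the extra options that worked, or None if every attempt failed."""
    for extra in ([], ["-d", "sat"]):
        _code, out = run([*extra, "-s", "standby,now", device])
        # The exit code is ignored on purpose: per the transcripts it is 0
        # even when the plain command fails on the LSI-attached disk.
        if STANDBY_MARKER in out:
            return extra
    return None


def fake_run(args: Sequence[str]) -> Tuple[int, str]:
    """Stand-in for smartctl that replays the four transcripts above."""
    device, uses_sat = args[-1], "sat" in args
    if device == "/dev/pd1":
        return (2, "Read Device Identity failed") if uses_sat else (0, STANDBY_MARKER)
    return (0, STANDBY_MARKER) if uses_sat else (0, "Probable ATA device behind a SAT layer")


print(spin_down("/dev/pd1", fake_run))  # motherboard SATA
print(spin_down("/dev/pd6", fake_run))  # LSI 9211-8i
```

Trying the plain form first keeps the motherboard disks on the path that is known to work, and only the LSI disks pay for the second attempt.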
        
        
        Andrea Mazzoleni - 2015-04-09
        
        Hi Leifi,
        
        Before, I was using "hdparm" to spin down. But yes, you are correct. Using smartctl is likely a better option.

        Just implemented it.

        I've also added a new "smartctl" option to the snapraid.conf file that allows you to configure special smartctl options for each disk.
        So, if you like, you can set -d sat for the disks that you know need it, without SnapRAID having to run the command twice.

        Note that you can see the exact commands used by generating a log with "-l test.log". This may be useful in testing.
        
        Thanks!
        Andrea
        

Leifi Plomeros - 2015-04-05

Snapraid up and down seem to be working for all disks on the motherboard SATA ports as well.

C:\Snapraid>snapraid down
Spindown...
Spundown device '/dev/pd11' for disk 'parity' in 32 ms.
Spundown device '/dev/pd9' for disk 'd100' in 47 ms.
Spundown device '/dev/pd13' for disk '2-parity' in 47 ms.
Spundown device '/dev/pd8' for disk 'd700' in 47 ms.
Spundown device '/dev/pd6' for disk 'd800' in 47 ms.
Spundown device '/dev/pd7' for disk 'd400' in 47 ms.
Spundown device '/dev/pd12' for disk '2-parity' in 47 ms.
Spundown device '/dev/pd10' for disk 'd700' in 47 ms.
Spundown device '/dev/pd5' for disk 'd500' in 453 ms.
Spundown device '/dev/pd3' for disk 'd600' in 453 ms.
Spundown device '/dev/pd2' for disk 'd300' in 453 ms.
Spundown device '/dev/pd4' for disk 'd200' in 640 ms.
Spundown device '/dev/pd0' for disk 'parity' in 1186 ms.

C:\Snapraid>snapraid up
Spinup...
Spunup device '/dev/volb4d94a7b-0a6d-45d9-a6fb-330825f1e449' for disk 'd300' in 15 ms.
Spunup device '/dev/volb52f9f52-c942-460d-a572-25bf397b1347' for disk 'd400' in 31 ms.
Spunup device '/dev/vol81c1c2fb-10d9-11e4-bccb-240a645537ee' for disk 'd700' in 31 ms.
Spunup device '/dev/volf67a1f0f-9f08-11e3-8825-50e549ef3a2e' for disk '2-parity' in 78 ms.
Spunup device '/dev/vol5f8a6c3a-9da8-45f3-88e6-b87ba7fba7e7' for disk 'd100' in 826 ms.
Spunup device '/dev/vol6b374911-77c6-4d9f-802c-45ee11b28c80' for disk 'd800' in 826 ms.
Spunup device '/dev/vol0adaea1e-57a5-4f98-a4c0-70ac2fd12fad' for disk 'd600' in 8346 ms.
Spunup device '/dev/vol3a0a63f8-4f47-4253-a26a-be4f9114101c' for disk 'd500' in 8845 ms.
Spunup device '/dev/vole34a0780-4043-4be3-a353-af5ac3b1f637' for disk 'd200' in 9750 ms.
Spunup device '/dev/vol61c5833b-2886-11e4-9855-240a645537ee' for disk 'parity' in 9859 ms.

I guess the device name on up command could be polished :)


rubylaser - 2015-04-05

Hello Andrea, this is working well, but how are the failure percentages calculated? /dev/sdk is at 100% to fail this year, and its SMART values (other than age) are all good.

~~~~~~
root@backups:~# snapraid smart
SnapRAID SMART report:

   Temp  Power   Error   FP Size
      C OnDays   Count        TB  Serial           Device     Disk
 -----------------------------------------------------------------------
     29    632       0   5%  3.0  MJ0351YNG7XKZZ   /dev/sdg   d1
     23    443       0  12%  4.0  Z30031ZZ         /dev/sdc   d2
     24    443       0  13%  4.0  Z3002CZZ         /dev/sdj   d5
     24    631       0   6%  3.0  Z3100AN3         /dev/sdh   d6
     26    437       0 100%  2.0  6YD1R3YF         /dev/sdk   d8
     27    281       2  56%  3.0  44LY9ENGS        /dev/sde   d10
     24    264       0   6%  3.0  Z7P0027C         /dev/sdf   d11
     24    755       1  42%  3.0  MJ1313YNG1LMJC   /dev/sdd   d12
     25    199       0   5%  4.0  PL2331LAG9056J   /dev/sdb   parity
     27    190       0   5%  4.0  PK1334PCGXGYYS   /dev/sdi   2-parity

/dev/sdk's SMART info....

root@fileserver:~# smartctl -a /dev/sdk
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.18.6-aufs] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda Green (AF)
Device Model: ST2000DL003-9VT166
Serial Number: 6YD1R3YF
LU WWN Device Id: 5 000c50 0465861eb
Firmware Version: CC3C
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5900 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Apr 5 17:49:25 2015 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 623) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 355) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x30b7) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   116   099   006    Pre-fail  Always       -       106740472
  3 Spin_Up_Time            0x0003   090   082   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   095   095   020    Old_age   Always       -       5636
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   067   060   030    Pre-fail  Always       -       43007647350
  9 Power_On_Hours          0x0032   089   011   000    Old_age   Always       -       10489
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       146
183 Runtime_Bad_Block       0x0032   099   099   000    Old_age   Always       -       1
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   093   000    Old_age   Always       -       8590065690
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   074   060   045    Old_age   Always       -       26 (Min/Max 21/28)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       108
193 Load_Cycle_Count        0x0032   097   097   000    Old_age   Always       -       6467
194 Temperature_Celsius     0x0022   026   040   000    Old_age   Always       -       26 (0 13 0 0 0)
195 Hardware_ECC_Recovered  0x001a   021   006   000    Old_age   Always       -       106740472
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       113266877544912
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       1243065368
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3276093544

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
  1  Extended offline    Completed without error        00%             9474  -
  2  Short offline       Completed without error        00%             9466  -
  3  Short offline       Completed without error        00%             9442  -
  4  Short offline       Completed without error        00%             9418  -
  5  Short offline       Completed without error        00%             9394  -
  6  Short offline       Completed without error        00%             9370  -
  7  Short offline       Completed without error        00%             9346  -
  8  Short offline       Completed without error        00%             9322  -
  9  Extended offline    Completed without error        00%             9306  -
 10  Short offline       Completed without error        00%             9298  -
 11  Short offline       Completed without error        00%             9274  -
 12  Short offline       Completed without error        00%             9250  -
 13  Short offline       Completed without error        00%             9209  -
 14  Short offline       Completed without error        00%             9175  -
 15  Short offline       Completed without error        00%             9151  -
 16  Short offline       Completed without error        00%             9127  -
 17  Extended offline    Completed without error        00%             9111  -
 18  Short offline       Completed without error        00%             9103  -
 19  Short offline       Completed without error        00%             9079  -
 20  Short offline       Completed without error        00%             9055  -
 21  Short offline       Completed without error        00%             9031  -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
~~~~~~

Also, up and down both work well. Finally, I also have an IBM M1015 (flashed to IT mode, so it's a 9211-8i) connected to an Intel SAS expander on this box, and snapraid smart had no problem getting values without the -d ata option.

Last edit: rubylaser 2015-04-05
- Andrea Mazzoleni - 2015-04-06
  
  Hi rubylaser,
  
  The problem is attribute 188. SnapRAID misread the value 8590065690; in truth, it should be masked to 16 bits, resulting in a value of 26.
  
  I've uploaded a new RC version that interprets the value in the correct way. Could you please retry?
  
  So the failure probability should be less than 100%, but still high, as these command timeout errors are serious ones.
  
  Ciao,
  Andrea
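
The masking mentioned above is easy to check by hand. The snippet below only illustrates the arithmetic on the raw value from this thread. Splitting the 48-bit raw value into three 16-bit words is an assumption about how the drive packs this attribute, not something stated here.

```python
raw = 8590065690  # raw value of attribute 188 as printed by smartctl above

# Keep only the lowest 16 bits, as described in the reply.
timeouts = raw & 0xFFFF

# For reference, the three 16-bit words of the raw value, lowest word first.
words = [(raw >> shift) & 0xFFFF for shift in (0, 16, 32)]

print(timeouts, words)
```

The unmasked number looks alarming only because the upper words are non-zero; the low word is the 26 quoted in the reply.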
  
  - rubylaser - 2015-04-06
    
    Thanks Andrea. I tried the new RC, and as you said, the results are still terrible.
    
    root@backups:~# snapraid smart
    SnapRAID SMART report:

       Temp  Power   Error   FP Size
          C OnDays   Count        TB  Serial           Device     Disk
     -----------------------------------------------------------------------
         28    633       0   5%  3.0  MJ0351YNG7XK9A   /dev/sdg   d1
         18    444       0  12%  4.0  Z30031NY         /dev/sdc   d2
         18    444       0  13%  4.0  Z3002C2P         /dev/sdj   d5
         17    632       0   6%  3.0  Z3100AN3         /dev/sdh   d6
         24    438       0  87%  2.0  6YD1R3YF         /dev/sdk   d8
         21    282       2  56%  3.0  44LY9ENGS        /dev/sde   d10
         20    265       0   6%  3.0  Z7P0027C         /dev/sdf   d11
         22    756       1  42%  3.0  MJ1313YNG1LMJC   /dev/sdd   d12
         21    200       0   5%  4.0  PL2331LAG9056J   /dev/sdb   parity
         20    190       0   5%  4.0  PK1334PCGXGYYS   /dev/sdi   2-parity

    The FP column is the estimated probability (in percentage) that the disk
    is going to fail in the next year.

    Probability that at least one disk is going to fail in the next year is 98%.
    
    Looks like it's time to replace the old 2TB drive with a new 4TB one. Thanks again for your great work!
    
    Last edit: rubylaser 2015-04-06
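
The thread does not say how the "at least one disk" figure is derived from the per-disk FP values. One plausible reading, assuming the disks fail independently, is one minus the product of the per-disk survival probabilities. The sketch below applies that assumption to the ten FP values in the report above and lands close to the printed 98%. Since the FP column is rounded, a difference of a point or so against SnapRAID's own figure would not be surprising.

```python
from typing import Iterable


def at_least_one_failure(probabilities: Iterable[float]) -> float:
    """P(at least one failure) for independent events: 1 - prod(1 - p)."""
    survive_all = 1.0
    for p in probabilities:
        survive_all *= 1.0 - p
    return 1.0 - survive_all


# FP column of the report above, in percent, in table order.
fp_percent = [5, 12, 13, 6, 87, 56, 6, 42, 5, 5]
combined = at_least_one_failure(p / 100 for p in fp_percent)
print(f"{combined:.0%}")
```

With one disk at 87%, the combined figure is dominated by that disk, which is why replacing it is the obvious first step.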
    
    - Kevin - 2015-04-10
      
      This is slightly off the main topic, but will your script run with SnapRAID 8.0 out of the gate, or will it need an update?
      

John - 2015-04-06

I've been using the 8 betas for quite a while, all fine.

Thank you again for the "negative wasted" in snapraid status, that's very useful for full disks.

test-devices could be improved by including the path from snapraid.conf.

I've never used up/down. I can only imagine what they do, since I leave the disks to time out and go to sleep on their own.

It is very good to include the SMART data, even if I don't know what triggers it precisely (I'm sure it is straightforward but I've no time to go through the source now). I do have one drive showing 100% failure next year :-)


Taishan Lin - 2015-04-06

8.0 RC also seems to have changed the fix command:

................................................
C:\snapraidXU>snapraid -e fix
Self test...
Loading state from C:/cab/m42/SnapRAID.content...
Scanning disk d0...
...................
Scanning disk d20...
Filtering...
Using 6766 MiB of memory.
Initializing...
Fixing...
100% completed, 12 MB processed in 0:08
Everything OK
.................................................

Great!!! Only 12 MB processed.
After fixing for 8 minutes, it shows:
100% completed, 12 MB processed in 0:08 ...
It would be nice if some progress indicator were shown during those 8 minutes.
I think the 8 minutes may have something to do with the size (45 GB) of that particular file, which has one bad 512 KB block.


  -      -       -    -    -  -                /dev/pd37  d0
  -      -       -    -    -  -                /dev/pd25  d1
  -      -       -    -    -  -                /dev/pd33  d2
  -      -       -    -    -  -                /dev/pd16  d3
  -      -       -    -    -  -                /dev/pd20  d4
  -      -       -    -    -  -                /dev/pd45  d5
  -      -       -    -    -  -                /dev/pd41  d6
  -      -       -    -    -  -                /dev/pd29  d7
  -      -       -    -    -  -                /dev/pd38  d8
  -      -       -    -    -  -                /dev/pd26  d9
  -      -       -    -    -  -                /dev/pd34  d10
  -      -       -    -    -  -                /dev/pd17  d11
  -      -       -    -    -  -                /dev/pd21  d12
  -      -       -    -    -  -                /dev/pd46  d13
  -      -       -    -    -  -                /dev/pd42  d14
  -      -       -    -    -  -                /dev/pd30  d15
 42     76       0   5%  4.0  WD-WCC4E1UVKZ58  /dev/pd4   d16
 43     39       0   5%  4.0  WD-WCC4E4ZANDTS  /dev/pd3   d17
 41     61       0 100%  4.0  Z3032VA6         /dev/pd0   d18
 40     63       0   6%  4.0  Z3032X8R         /dev/pd2   d19
 33    132       0   5%  4.0  WD-WCC4E3HPKNHU  /dev/pd5   d20
 30    362       -  42%    -  WD-WCC4E0084809  /dev/pd11  parity
 38    175       -   5%    -  WD-WCC4EFSNC4D2  /dev/pd12  2-parity
 36      4       -   6%    -  Z303802E         /dev/pd10  3-parity
 30    736       -  SSD  0.5  201210230052     /dev/pd1   -
  -      -       -   0% 28.0  -                /dev/pd6   -
  -      -       -   0% 28.0  -                /dev/pd7   -
  -      -       -   0% 21.0  -                /dev/pd8   -
  -      -       -    - 28.0  -                /dev/pd9   -
  -      -       -    -    -  -                /dev/pd13  -
 36    655       0  32%  3.0  W1F0P8A8         /dev/pd19  -
 41    919       0 100%  3.0  WD-WCAWZ1828327  /dev/pd24  -
 37     69       0   6%  3.0  W7300ZA2         /dev/pd32  -
 39    600       0  58%  3.0  W1F0CZM0         /dev/pd44  -
  -      -       -    -    -  -                /dev/pd47  -
  -      -       -    -    -  -                /dev/pd48  -
  -      -       -    -    -  -                /dev/pd49  -
  -      -       -    -    -  -                /dev/pd50  -
 40      3       - 100%    -  Z3037ZWK         /dev/pd51  -

The FP column is the estimated probability (in percentage) that the disk
is going to fail in the next year.

Probability that at least one disk is going to fail in the next year is 100%.

Some problems with smartctl:

Disks housed in USB cages are not shown.
Seagate NAS HDD 4T ST4000VN000-1H4168: pd51 is 3 days old with FP 100%, and its size (TB) is missing,
but another one that is 4 days old, /dev/pd10, has FP 6%.
Also an SG NAS HDD 4T: pd0 is 61 days old with FP 100%.

new 8.0 RC:
C:\snapraidXU>snapraid smart

  -      -       -  n/a    -  -                /dev/pd25  d0
  -      -       -  n/a    -  -                /dev/pd33  d1
  -      -       -  n/a    -  -                /dev/pd41  d2
  -      -       -  n/a    -  -                /dev/pd17  d3
  -      -       -  n/a    -  -                /dev/pd21  d4
  -      -       -  n/a    -  -                /dev/pd44  d5
  -      -       -  n/a    -  -                /dev/pd29  d6
  -      -       -  n/a    -  -                /dev/pd37  d7
  -      -       -  n/a    -  -                /dev/pd26  d8
  -      -       -  n/a    -  -                /dev/pd34  d9
  -      -       -  n/a    -  -                /dev/pd42  d10
  -      -       -  n/a    -  -                /dev/pd18  d11
  -      -       -  n/a    -  -                /dev/pd22  d12
  -      -       -  n/a    -  -                /dev/pd45  d13
  -      -       -  n/a    -  -                /dev/pd30  d14
  -      -       -  n/a    -  -                /dev/pd38  d15
 38     79       0   5%  4.0  WD-WCC4E1UVKZ58  /dev/pd4   d16
 39     41       0   5%  4.0  WD-WCC4E4ZANDTS  /dev/pd3   d17
 38     63       0  10%  4.0  Z3032VA6         /dev/pd0   d18
 37     65       0   6%  4.0  Z3032X8R         /dev/pd1   d19
 28    135       0   5%  4.0  WD-WCC4E3HPKNHU  /dev/pd5   d20
 24    364       -  n/k    -  WD-WCC4E0084809  /dev/pd46  parity
 38    178       -  n/k    -  WD-WCC4EFSNC4D2  /dev/pd13  2-parity
 35      6       -  n/k    -  Z303802E         /dev/pd10  3-parity
 30    738       -  SSD  0.5  201210230052     /dev/pd2   -
  -      -       -  n/k 28.0  -                /dev/pd6   -
  -      -       -  n/k 28.0  -                /dev/pd7   -
  -      -       -  n/k 21.0  -                /dev/pd8   -
  -      -       -  n/a 28.0  -                /dev/pd9   -
 30      5       -  n/k    -  Z3037ZWK         /dev/pd11  -
  -      -       -  n/a    -  -                /dev/pd14  -
 35    657       0  26%  3.0  W1F0P8A8         /dev/pd20  -
 33    922       0  n/k  3.0  WD-WCAWZ1828327  /dev/pd32  -
 34     72       0   6%  3.0  W7300ZA2         /dev/pd40  -
 37    603       0  24%  3.0  W1F0CZM0         /dev/pd43  -
  -      -       -  n/a    -  -                /dev/pd49  -
  -      -       -  n/a    -  -                /dev/pd50  -
  -      -       -  n/a    -  -                /dev/pd51  -
  -      -       -  n/a    -  -                /dev/pd52  -

The FP column is the estimated probability (in percentage) that the disk
is going to fail in the next year.

Probability that at least one disk is going to fail in the next year is 26%.
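
For anyone who wants to post-process these reports, the rows split cleanly on whitespace. The sketch below is a hypothetical helper, not part of SnapRAID: it assumes every data row has exactly eight fields with the device path in the seventh, and it treats non-numeric FP entries such as "n/a", "n/k", "SSD" and "-" as unknown.

```python
from typing import Dict, List, Optional


def parse_smart_rows(report: str) -> List[Dict[str, Optional[object]]]:
    """Parse the data rows of a 'snapraid smart' report into dictionaries."""
    rows = []
    for line in report.splitlines():
        fields = line.split()
        # Data rows have exactly 8 fields and a device path in the 7th.
        if len(fields) == 8 and fields[6].startswith("/dev/"):
            fp = fields[3]
            fp_value = int(fp[:-1]) if fp.endswith("%") else None
            rows.append({"serial": fields[5], "device": fields[6],
                         "disk": fields[7], "fp": fp_value})
    return rows


def risky(rows: List[Dict[str, Optional[object]]], threshold: int = 20) -> List[str]:
    """Devices whose failure probability is known and at or above the threshold."""
    return [r["device"] for r in rows if r["fp"] is not None and r["fp"] >= threshold]


# Four rows copied from the report above.
sample = """\
 35    657       0  26%  3.0  W1F0P8A8         /dev/pd20  -
 33    922       0  n/k  3.0  WD-WCAWZ1828327  /dev/pd32  -
 34     72       0   6%  3.0  W7300ZA2         /dev/pd40  -
 37    603       0  24%  3.0  W1F0CZM0         /dev/pd43  -
"""
rows = parse_smart_rows(sample)
print(risky(rows))
```

The threshold of 20 is arbitrary and only serves the example.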

Hi Taishan,

I just added a new "smartctl" option that allows you to pass special configuration options to smartctl.

You can first try to make smartctl work manually with the USB controller. See: https://www.smartmontools.org/wiki/Supported_USB-Devices

Something like:

smartctl -a -d usbjmicron /dev/pd8

(note that usbjmicron is only an example; I don't know which enclosure you have)

Then you can add the options in snapraid.conf. Like:

smartctl d1 -d usbjmicron %s

See the new manual about this new "smartctl" option.

Ciao,
Andrea
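
To make the shape of the new option concrete, here is a rough sketch of how lines like the one above could be read and turned into a command line. It assumes that `%s` stands for the device path, which is how the example reads. How SnapRAID itself expands the template is not shown in this thread, and `parse_smartctl_options` and `build_command` are made-up names.

```python
from typing import Dict, List


def parse_smartctl_options(conf_text: str) -> Dict[str, str]:
    """Map disk name -> option template for every 'smartctl' line."""
    options = {}
    for line in conf_text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "smartctl":
            options[parts[1]] = parts[2]
    return options


def build_command(template: str, device: str) -> List[str]:
    """Expand %s with the device path and prepend 'smartctl -a'."""
    return ["smartctl", "-a", *template.replace("%s", device).split()]


# Option lines in the form used in this thread.
conf = """\
smartctl d1 -d usbjmicron %s
smartctl parity -d sat %s
"""
options = parse_smartctl_options(conf)
command = build_command(options["d1"], "/dev/pd8")
print(command)
```

The expanded command matches the manual test suggested above, so the same options can be verified by hand before they go into snapraid.conf.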

RC0409 64bit:

configuration 1: added in conf file:
smartctl d0 -d usbjmicron,0 %s
smartctl d1 -d usbjmicron,0 %s
smartctl d2 -d usbjmicron,0 %s
smartctl d3 -d usbjmicron,0 %s
smartctl d4 -d usbjmicron,0 %s
smartctl d5 -d usbjmicron,0 %s
smartctl d6 -d usbjmicron,0 %s
smartctl d7 -d usbjmicron,0 %s
smartctl d8 -d usbjmicron,0 %s
smartctl d9 -d usbjmicron,0 %s
smartctl d10 -d usbjmicron,0 %s
smartctl d11 -d usbjmicron,0 %s
smartctl d12 -d usbjmicron,0 %s
smartctl parity -d usbjmicron,0 %s
smartctl 2-parity -d usbjmicron,0 %s

snapraid smart:

 34     81       0  10%  3.0  Z1F31CQD         /dev/pd24  d0
 33    924       0  26%  3.0  WD-WCAWZ1828327  /dev/pd32  d1
 36     28       0   6%  3.0  Z1F5PX1W         /dev/pd16  d2
 34     75       0   6%  3.0  W7300ZA2         /dev/pd40  d3
 35    660       0  26%  3.0  W1F0P8A8         /dev/pd20  d4
 36    605       0  24%  3.0  W1F0CZM0         /dev/pd44  d5
 33    537       0   5%  3.0  WD-WMC1T3957622  /dev/pd31  d6
 35    279       0   9%  3.0  Z1F55TF6         /dev/pd39  d7
 34    554       0   5%  3.0  WD-WMC1T4225948  /dev/pd19  d8
 36    555       0   5%  3.0  WD-WMC1T3955681  /dev/pd43  d9
 36    555       0   5%  3.0  WD-WMC1T4141933  /dev/pd27  d10

  -      -       -    -  3.0  WD-WMC1T3958547  /dev/pd35  d11
 36    553       0   5%  3.0  WD-WMC1T4318974  /dev/pd15  d12
 33    487       0  99%  3.0  WD-WMC1T4141687  /dev/pd23  parity
 37    553       0   5%  3.0  WD-WMC1T3958963  /dev/pd28  2-parity
 40     66       0  10%  4.0  Z3032VA6         /dev/pd0   -
 30    741       -  SSD  0.5  201210230052     /dev/pd1   -
 43     81       0   5%  4.0  WD-WCC4E1UVKZ58  /dev/pd2   -
 39     68       0   6%  4.0  Z3032X8R         /dev/pd3   -
 44     44       0   5%  4.0  WD-WCC4E4ZANDTS  /dev/pd4   -
 32    137       0   5%  4.0  WD-WCC4E3HPKNHU  /dev/pd5   -
  -      -       -    - 28.0  -                /dev/pd6   -
  -      -       -    - 28.0  -                /dev/pd7   -
  -      -       -    - 21.0  -                /dev/pd8   -
  -      -       -  n/a 28.0  -                /dev/pd9   -
 35      9       -   6%    -  Z303802E         /dev/pd10  -
 29    366       -  32%    -  WD-WCC4E0084809  /dev/pd11  -
 37    180       -   5%    -  WD-WCC4EFSNC4D2  /dev/pd13  -
  -      -       -  n/a    -  -                /dev/pd14  -
  -      -       -  n/a    -  -                /dev/pd49  -
  -      -       -  n/a    -  -                /dev/pd50  -
  -      -       -  n/a    -  -                /dev/pd51  -
  -      -       -  n/a    -  -                /dev/pd52  -

The FP column is the estimated probability (in percentage) that the disk
is going to fail in the next year.

Probability that at least one disk is going to fail in the next year is 100%.

The only strange thing is disk d11 (/dev/pd35): it should look the same as d0 through d12 and the 2 parity disks, but some data is missing.

(There are 4 HW RAID and 5 DrivePool virtual drives, shown with device name only.)

configuration 2: added in conf file:

smartctl d0 -d usbjmicron,0 %s
smartctl d1 -d usbjmicron,0 %s
smartctl d2 -d usbjmicron,0 %s
smartctl d3 -d usbjmicron,0 %s
smartctl d4 -d usbjmicron,0 %s
smartctl d5 -d usbjmicron,0 %s
smartctl d6 -d usbjmicron,0 %s
smartctl d7 -d usbjmicron,0 %s
smartctl d8 -d usbjmicron,0 %s
smartctl d9 -d usbjmicron,0 %s
smartctl d10 -d usbjmicron,0 %s
smartctl d11 -d usbjmicron,0 %s
smartctl d12 -d usbjmicron,0 %s
smartctl d13 -d usbjmicron,0 %s
smartctl d14 -d usbjmicron,0 %s
smartctl d15 -d usbjmicron,0 %s
smartctl d16 -d ata %s
smartctl d17 -d ata %s
smartctl d18 -d ata %s
smartctl d19 -d ata %s
smartctl d20 -d ata %s
smartctl parity -d sat %s
smartctl 2-parity -d sat %s
smartctl 3-parity -d sat %s
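
The block above follows a regular pattern, so it can be generated rather than typed by hand. This is only a sketch: the helper name `smartctl_lines` and the grouping are illustrative assumptions, and the `%s` placeholder is kept literal exactly as in the lines above.

```python
# Hypothetical helper: build the per-disk "smartctl" lines for the conf file.
def smartctl_lines(groups):
    """groups: list of (disk_names, smartctl_options) pairs."""
    lines = []
    for names, options in groups:
        for name in names:
            # "%s" is emitted literally, as in the hand-written lines above.
            lines.append(f"smartctl {name} {options} %s")
    return lines

# Grouping copied from the configuration above.
groups = [
    ([f"d{i}" for i in range(0, 16)], "-d usbjmicron,0"),
    ([f"d{i}" for i in range(16, 21)], "-d ata"),
    (["parity", "2-parity", "3-parity"], "-d sat"),
]

config = smartctl_lines(groups)
print("\n".join(config))
```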

snapraid smart
SnapRAID SMART report:

 32    542       0  26%  4.0  W300CGLD         /dev/pd25  d0
 32    337       0   8%  4.0  Z301FDFK         /dev/pd33  d1
 33    336       0   8%  4.0  Z301FE0W         /dev/pd41  d2
 35    337 PREFAIL   7%  4.0  Z301FDRY         /dev/pd17  d3
 32    337 PREFAIL   7%  4.0  Z301G1SA         /dev/pd21  d4
 32    336 PREFAIL   8%  4.0  Z301G0AT         /dev/pd45  d5
 33    336 PREFAIL  13%  4.0  Z301FDC6         /dev/pd29  d6
 34    479       0  27%  4.0  Z300RS4D         /dev/pd37  d7
 30    288 PREFAIL  55%  4.0  Z301FE0J         /dev/pd26  d8
 32    391 PREFAIL  10%  4.0  Z300RSQX         /dev/pd34  d9
 31    397       0  13%  4.0  Z300MJ8X         /dev/pd42  d10
 36    486       0  33%  4.0  W300D6V8         /dev/pd18  d11
 36    139       0   5%  4.0  WD-WCC4E0ZY6933  /dev/pd22  d12
 35    182       0   5%  4.0  WD-WCC4ERZFHLXP  /dev/pd46  d13
 33     81       0   5%  4.0  WD-WCC4E1UVKEF5  /dev/pd30  d14
 33     47       0   5%  4.0  WD-WCC4E0NVZL5U  /dev/pd38  d15
 43     81       0   5%  4.0  WD-WCC4E1UVKZ58  /dev/pd2   d16
 42     44       0   5%  4.0  WD-WCC4E4ZANDTS  /dev/pd4   d17
 39     66       0  10%  4.0  Z3032VA6         /dev/pd0   d18
 38     68       0   6%  4.0  Z3032X8R         /dev/pd3   d19
 32    137       0   5%  4.0  WD-WCC4E3HPKNHU  /dev/pd5   d20
 29    366       0  32%  4.0  WD-WCC4E0084809  /dev/pd11  parity
 37    180       0   5%  4.0  WD-WCC4EFSNC4D2  /dev/pd13  2-parity
 35      9       0   6%  4.0  Z303802E         /dev/pd10  3-parity
 30    741       -  SSD  0.5  201210230052     /dev/pd1   -

  -      -       -    - 28.0  -                /dev/pd6   -
  -      -       -    - 28.0  -                /dev/pd7   -
  -      -       -    - 21.0  -                /dev/pd8   -
  -      -       -  n/a 28.0  -                /dev/pd9   -
  -      -       -  n/a    -  -                /dev/pd14  -
 35    660       0  26%  3.0  W1F0P8A8         /dev/pd20  -
 35    924       0  26%  3.0  WD-WCAWZ1828327  /dev/pd32  -
 34     75       0   6%  3.0  W7300ZA2         /dev/pd40  -
 37    605       0  24%  3.0  W1F0CZM0         /dev/pd44  -
  -      -       -  n/a    -  -                /dev/pd49  -
  -      -       -  n/a    -  -                /dev/pd50  -
  -      -       -  n/a    -  -                /dev/pd51  -
  -      -       -  n/a    -  -                /dev/pd52  -

The FP column is the estimated probability (in percentage) that the disk
is going to fail in the next year.

Probability that at least one disk is going to fail in the next year is 97%.

DANGER! SMART is reporting that one or more disks are FAILING!
Please take immediate action!
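
The combined figure can be sanity-checked from the per-disk FP values. The sketch below assumes that disk failures are independent and that only the disks assigned to the array (d0 to d20 plus the three parities) are counted; the thread does not say how SnapRAID actually computes it, so treat this as an approximation. With the FP values from the report above it comes out at roughly 97%.

```python
from math import prod

# FP values (percent) copied from the d0..d20 and parity rows above.
fp_percent = [26, 8, 8, 7, 7, 8, 13, 27, 55, 10, 13, 33,  # d0..d11
              5, 5, 5, 5, 5, 5, 10, 6, 5,                 # d12..d20
              32, 5, 6]                                   # parity, 2-parity, 3-parity

def prob_at_least_one_failure(percentages):
    """1 minus the probability that every disk survives (independence assumed)."""
    survive_all = prod(1 - p / 100 for p in percentages)
    return 1 - survive_all

combined = prob_at_least_one_failure(fp_percent)
print(f"{combined:.1%}")
```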

Jessie Taylor - 2015-04-06

I am not aware of any SMART parameters that allow one to calculate or even guesstimate the probability that a drive will die in the next year. There must be some misunderstanding about what the SMART data represents.

The column about failure probability should be removed, since the numbers it reports are never going to be accurate.

Andrea Mazzoleni - 2015-04-06

Hi Jessie,

The failure probability is an estimate obtained by correlating the SMART attributes with the data from 40000 disks that Backblaze recently released.

These are the Backblaze data files:
https://www.backblaze.com/blog/hard-drive-data-feb2015/

And here are some easier-to-read graphs for each attribute:
https://www.backblaze.com/blog-smart-stats-2014-8.html

Obviously, it's only an estimate that may be more or less accurate, but it's a simple way to keep an eye on all the SMART attributes by checking a single value.

Ciao,
Andrea

MQMan - 2015-04-07

Not too sure I believe this:

SnapRAID SMART report:

Temp Power   Error   FP Size
  C OnDays   Count        TB  Serial           Device     Disk

 26   2042       0 100%  2.0  5YD282NZ         /dev/sdh   disk1
 28   2478       0 100%  2.0  5YD26ZTT         /dev/sdc   disk2
 27   2267       0 100%  2.0  5YD1Y35Y         /dev/sdd   disk3
 25   2580       0 100%  2.0  5YD2468B         /dev/sde   disk4
 24     37       0  99%  2.0  W1H2Z5X6         /dev/sdf   disk5
 25     36       0  99%  2.0  W1H2YHVH         /dev/sdb   disk6
 26   2377       0 100%  2.0  5YD24HZP         /dev/sdg   parity
 25   1247   ERROR   5%  0.5  WD-WCAWF0473556  /dev/sda   -

The FP column is the estimated probability (in percentage) that the disk
is going to fail in the next year.

Probability that at least one disk is going to fail in the next year is 100%.

disk5 and disk6 are both brand new, only added in the last month.

Pasting the full SMART output from these would be way TMI. What snapshots of data can I provide here?

Cheers.
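
A small script can pick out the rows that look implausible, which may be easier to share than full SMART dumps. This is a sketch only: the field names and the thresholds (under 90 power-on days, FP of 50% or more) are assumptions chosen for illustration, not anything SnapRAID defines.

```python
def parse_row(line):
    """Split one report row into named fields (all kept as strings)."""
    temp, days, errors, fp, size, serial, device, disk = line.split()
    return {"temp": temp, "days": days, "errors": errors, "fp": fp,
            "size": size, "serial": serial, "device": device, "disk": disk}

# Rows copied from the report above.
rows = """
26 2042 0 100% 2.0 5YD282NZ /dev/sdh disk1
28 2478 0 100% 2.0 5YD26ZTT /dev/sdc disk2
27 2267 0 100% 2.0 5YD1Y35Y /dev/sdd disk3
25 2580 0 100% 2.0 5YD2468B /dev/sde disk4
24 37 0 99% 2.0 W1H2Z5X6 /dev/sdf disk5
25 36 0 99% 2.0 W1H2YHVH /dev/sdb disk6
26 2377 0 100% 2.0 5YD24HZP /dev/sdg parity
25 1247 ERROR 5% 0.5 WD-WCAWF0473556 /dev/sda -
""".strip().splitlines()

disks = [parse_row(r) for r in rows]

# Hypothetical rule of thumb: a nearly new disk with a very high FP is suspect.
suspicious = [d["disk"] for d in disks
              if d["days"].isdigit() and d["fp"].endswith("%")
              and int(d["days"]) < 90 and int(d["fp"].rstrip("%")) >= 50]
print(suspicious)
```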

 1  Extended offline  Completed without error  00%  9474  -
 2  Short offline     Completed without error  00%  9466  -
 3  Short offline     Completed without error  00%  9442  -
 4  Short offline     Completed without error  00%  9418  -
 5  Short offline     Completed without error  00%  9394  -
 6  Short offline     Completed without error  00%  9370  -
 7  Short offline     Completed without error  00%  9346  -
 8  Short offline     Completed without error  00%  9322  -
 9  Extended offline  Completed without error  00%  9306  -
10  Short offline     Completed without error  00%  9298  -
11  Short offline     Completed without error  00%  9274  -
12  Short offline     Completed without error  00%  9250  -
13  Short offline     Completed without error  00%  9209  -
14  Short offline     Completed without error  00%  9175  -
15  Short offline     Completed without error  00%  9151  -
16  Short offline     Completed without error  00%  9127  -
17  Extended offline  Completed without error  00%  9111  -
18  Short offline     Completed without error  00%  9103  -
19  Short offline     Completed without error  00%  9079  -
20  Short offline     Completed without error  00%  9055  -
21  Short offline     Completed without error  00%  9031  -
