#5120 SMART Drive Information Not Available for NVMe drives, Ubuntu 18.04

Milestone: 1.880
Status: open
Owner: nobody
Priority: 5
Updated: 2020-02-12
Created: 2018-05-03
Private: No

It looks like no SMART information is returned when there are only NVMe drives in a system. Smartctl works after telling it to check for NVMe drives, but nothing is returned in Webmin.

Related

Bugs: #5120

Discussion

<< < 1 2 (Page 2 of 2)
  • Jamie Cameron

    Jamie Cameron - 2018-11-25

    That makes me wonder if SMART really is supported for NVME drives? Since they are always SSDs, how much sense does drive health monitoring make anyway?

    What does smartctl -H /dev/nvme0 output on your system?

     
  • Dmitry Ogurtsov

    Dmitry Ogurtsov - 2018-11-26

    Well, it surely does make sense: health monitoring for NVMe provides a lot of data:

    SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
    Critical Warning:                   0x00
    Temperature:                        48 Celsius
    Available Spare:                    100%
    Available Spare Threshold:          10%
    Percentage Used:                    0%
    Data Units Read:                    3 623 [1,85 GB]
    Data Units Written:                 24 595 [12,5 GB]
    Host Read Commands:                 93 171
    Host Write Commands:                170 034
    Controller Busy Time:               1
    Power Cycles:                       11
    Power On Hours:                     7
    Unsafe Shutdowns:                   10
    Media and Data Integrity Errors:    0
    Error Information Log Entries:      293
    Warning  Comp. Temperature Time:    0
    Critical Comp. Temperature Time:    0
    Temperature Sensor 1:               48 Celsius
    

    The reason I need it is to receive a warning by email when there is a problem with my drive. Webmin has a good notification system, but it needs to see the drive first.
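
As an illustration of how the health section above could feed a notification, here is a minimal Python sketch. It is not how Webmin does it. The function names (`parse_smartctl_health`, `leading_int`, `health_warnings`) and the choice of which fields trigger a warning are assumptions made for this example; only the field names come from the output shown.

```python
import re


def parse_smartctl_health(text):
    """Collect 'Name: value' lines from smartctl's NVMe SMART/Health section."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields[name.strip()] = value.strip()
    return fields


def leading_int(value):
    """Return the leading number of values like '100%', '0x00' or '24 595 [12,5 GB]'."""
    head = value.split("[")[0].strip()  # drop the bracketed human-readable size
    if head.lower().startswith("0x"):
        return int(head.split()[0], 16)
    digits = re.sub(r"\D", "", head)  # also removes locale digit grouping
    return int(digits) if digits else None


def health_warnings(fields):
    """Return a list of warning strings (the checks chosen here are assumptions)."""
    warnings = []
    critical = leading_int(fields.get("Critical Warning", ""))
    if critical:
        warnings.append("critical warning flags set: 0x%02x" % critical)
    spare = leading_int(fields.get("Available Spare", ""))
    threshold = leading_int(fields.get("Available Spare Threshold", ""))
    if spare is not None and threshold is not None and spare < threshold:
        warnings.append("available spare %d%% is below threshold %d%%" % (spare, threshold))
    media = leading_int(fields.get("Media and Data Integrity Errors", ""))
    if media:
        warnings.append("%d media and data integrity errors" % media)
    return warnings
```

An empty list from `health_warnings` would mean nothing to report; any other result could be handed to whatever mail mechanism is in use.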
    Here is the requested output. I added two more outputs for you.
    smartctl -H /dev/nvme0

    [sider@nas ~]# sudo smartctl -H /dev/nvme0
    smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-39-generic] (local build)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    

    smartctl -H /dev/nvme0n1

    [sider@nas ~]# sudo smartctl -H /dev/nvme0n1
    smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-39-generic] (local build)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF SMART DATA SECTION ===
    Read NVMe SMART/Health Information failed: NVMe Status 0x6002
    

    smartctl -d nvme,0xffffffff -H /dev/nvme0n1 - working!

    [sider@nas ~]# sudo smartctl -d nvme,0xffffffff -H /dev/nvme0n1
    smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-39-generic] (local build)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
     
  • ochbob

    ochbob - 2018-11-26

    Same for me.

    Debian 9.6
    Webmin 1.90

    nvme-cli and smartctl both return the information fine, but the Webmin module shows no data.

    edit:

    I tried with the latest 1.901.1126.0302; same issue, the module does not show any data.

     

    Last edit: ochbob 2018-11-26
  • ochbob

    ochbob - 2018-11-27

    Thank you for this.

    I've updated to the latest GitHub version (update_from_repo.sh), but it does not change anything =/

    What exactly should the fix do?
    (I don't properly understand the code you have changed, sorry.)

     

    Last edit: ochbob 2018-11-27
  • Jamie Cameron

    Jamie Cameron - 2018-11-28

    It should identify all NVME drives as supporting SMART.

     
  • ochbob

    ochbob - 2018-11-28

    That was the conclusion I had arrived at.

    But it seems it's not working =/
    Maybe because my disk name is "nvme0n1"?
    If you want me to run some tests on my side, let me know.

     
  • ochbob

    ochbob - 2018-11-29

    I tried (updated to the latest GitHub version), but unfortunately it does not work =/

     
  • Dmitry Ogurtsov

    Dmitry Ogurtsov - 2018-11-30

    The first fix worked for me (this one: https://github.com/webmin/webmin/commit/8f5e734c9ca5704979d83bfdacf4f695334881b5).
    The drive is now identified by Webmin and some info is shown, but when it comes to SMART data it gives an error: Read NVMe SMART/Health Information failed: NVMe Status 0x6002. It's the same error I get if I simply run sudo smartctl -x /dev/nvme0n1.

    According to this ticket, I've updated smartmontools to the latest version, 6.7, and after the update sudo smartctl -x /dev/nvme0n1 works okay, without any errors.
    However, Webmin still shows the error, as if it were using the previous version; see the attached screenshots.

    The update was made according to this guide: https://www.smartmontools.org/wiki/Download#Installfromthesourcetarball
    My concern is this phrase:

    If you don't pass any arguments to ./configure all files will reside under /usr/local to not interfere with files from your distribution.

    Does it have something to do with webmin using the smartmontools from the wrong path?

     
  • Ilia

    Ilia - 2018-11-30

    This is odd, as applying the patch makes things work smoothly for me with my NVME SSD. Health data is also displayed, using smartctl 6.6 2017-11-05 r4594.

     
  • Dmitry Ogurtsov

    Dmitry Ogurtsov - 2018-11-30

    I use Ubuntu Server 18.04 LTS, so the latest version from the Ubuntu repository is smartctl 6.6 2016-05-31 r4324. I never tried r4594; maybe the issue is already solved there.

    According to Christian Franke from smartmontools (link):

    the problem is that this drive requires that the broadcast namespace is specified if SMART/Health and Error Information logs are requested. This issue was unspecified in early revisions of the NVMe standard.
    Option -d nvme,0xffffffff should no longer be necessary with smartctl >= r4671.
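
To make the quoted revision rule concrete, here is a small Python sketch that reads the revision from the smartctl banner line and only adds `-d nvme,0xffffffff` for builds older than r4671. The helper names (`smartctl_revision`, `health_command`) are hypothetical, and leaving the option out when the revision cannot be determined is an assumption of this example, not something stated in the thread.

```python
import re

# Revision from which the broadcast-namespace option should no longer be
# needed, according to the quote above.
FIXED_REVISION = 4671


def smartctl_revision(banner):
    """Extract the rNNNN revision from a banner such as
    'smartctl 6.6 2016-05-31 r4324 [x86_64-linux-...] (local build)'."""
    match = re.search(r"^smartctl \S+ \d{4}-\d{2}-\d{2} r(\d+)", banner, re.MULTILINE)
    return int(match.group(1)) if match else None


def health_command(device, revision):
    """Build a 'smartctl -H' argument list, adding the workaround for old builds."""
    args = ["smartctl"]
    if revision is not None and revision < FIXED_REVISION:
        # Explicitly request the broadcast namespace (assumed safe on old builds).
        args += ["-d", "nvme,0xffffffff"]
    args += ["-H", device]
    return args
```

With the banners posted in this thread, r4324 would get the extra option and r4840 would not.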

    So I installed the latest version available, r4840, which actually works but is not picked up by Webmin.
    I also noticed that sudo smartctl runs r4840, whereas smartctl without sudo runs the old r4324.

    [user@nas ~]# sudo smartctl
    smartctl 6.7 2018-11-27 r4840 [x86_64-linux-4.15.0-39-generic] (local build)
    Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
    
    ERROR: smartctl requires a device name as the final command-line argument.
    
    Use smartctl -h to get a usage summary
    
    [user@nas ~]# smartctl
    smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.15.0-39-generic] (local build)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
    
    ERROR: smartctl requires a device name as the final command-line argument.
    
    Use smartctl -h to get a usage summary
    

    I honestly cannot explain that. Anyway, can you suggest how I handle this issue?

     
  • Dmitry Ogurtsov

    Dmitry Ogurtsov - 2018-12-01

    Okay, I've found the problem: smartctl was just installed in a different path.
    For those with similar problems, just change the install path from /usr/local to /usr.
    Anyway, now everything works, thanks for the help.
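
For anyone chasing a similar mismatch between two installed copies, the following Python sketch lists every executable with a given name along a search path, in lookup order. The function name `find_all` is made up for this example; it only inspects the path string it is given and makes no claim about how Webmin or sudo locate smartctl.

```python
import os


def find_all(name, search_path):
    """Return every executable called `name` in `search_path`, in lookup order."""
    found = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            found.append(candidate)
    return found


if __name__ == "__main__":
    # The first line printed is the copy that a plain 'smartctl' would run
    # for the current user's PATH.
    for path in find_all("smartctl", os.environ.get("PATH", "")):
        print(path)
```

Running it once as a normal user and once under sudo should show whether the two environments resolve to different copies.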

     
  • Jamie Cameron

    Jamie Cameron - 2018-12-01

    Odd that Webmin didn't complain about not being able to run the smartctl command when you opened the SMART module if it was in the wrong location?

     
  • ochbob

    ochbob - 2018-12-01

    Thank you for the feedback.
    With it I was able to find the fault.

    The first patch works great.
    Sorry for saying it did not.

    I had extra command-line parameters for smartctl set in the Webmin module ("-d sat") because I had an external USB HDD.

    I no longer have this HDD, and without the extra argument it works fine :)

    Thank you.

     

    Last edit: ochbob 2018-12-01
  • luison

    luison - 2020-02-12

    Not sure if this has been discussed elsewhere, but as a suggestion, the tool for monitoring NVMe disks is:

    nvme-cli

    nvme smart-log /dev/nvme0n1

    Sample output:
    Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
    critical_warning                    : 0
    temperature                         : 43 C
    available_spare                     : 100%
    available_spare_threshold           : 5%
    percentage_used                     : 0%
    data_units_read                     : 22.102.637
    data_units_written                  : 1.822.001
    host_read_commands                  : 129.106.207
    host_write_commands                 : 13.874.365
    controller_busy_time                : 223
    power_cycles                        : 57
    power_on_hours                      : 851
    unsafe_shutdowns                    : 41
    media_errors                        : 0
    num_err_log_entries                 : 12.963.329
    Warning Temperature Time            : 0
    Critical Composite Temperature Time : 0
    Thermal Management T1 Trans Count   : 0
    Thermal Management T2 Trans Count   : 0
    Thermal Management T1 Total Time    : 0
    Thermal Management T2 Total Time    : 0
    

    It would be great to see it supported in Webmin
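
As a rough idea of what consuming that output could look like, here is a Python sketch that turns the text format shown above into a dictionary. The function name `parse_nvme_smart_log` is hypothetical, and treating dots and commas inside numbers as digit grouping (as in the sample) is an assumption that would be wrong for a locale that prints decimals here.

```python
import re


def parse_nvme_smart_log(text):
    """Parse 'name : value' lines from 'nvme smart-log' text output."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" : ")
        if not sep:
            continue  # header or blank line
        name, value = name.strip(), value.strip()
        match = re.match(r"(\d[\d.,]*)", value)
        if match:
            # Assumption: '.' and ',' are thousands separators, e.g. '22.102.637'.
            fields[name] = int(re.sub(r"[.,]", "", match.group(1)))
        else:
            fields[name] = value
    return fields
```

Units such as "C" or "%" are dropped, so a caller would need to know them from the field name.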

     

    Last edit: luison 2020-02-12