
#5 top disk sort (5) doesn't work

Milestone: v1.0 (example)
Status: closed
Owner: nobody
Labels: None
Priority: 5
Updated: 2016-01-07
Created: 2015-07-10
Creator: David Braun
Private: No

The "5" option (top with disk sort) doesn't work in versions 15d, 15e, or 15f. This is because the "io" member in the topper structure is never initialized.

Discussion

  • Nigel Griffiths

    Nigel Griffiths - 2015-07-10

    Well spotted.
    This feature is supported in the AIX version, where it shows which processes are doing the I/O, and it is useful.
    But I never got round to coding it up - oops!
    I think the only place to get the stats is the /proc/PID/io files, BUT they are readable only by root or the process owner, which means a non-root user can only get the info for their own processes = nuts! For example:
    $ cat /proc/23972/io
    rchar: 51067602
    wchar: 13251067
    syscr: 84770
    syscw: 8390
    read_bytes: 950272
    write_bytes: 12800000
    cancelled_write_bytes: 1556480

    read_bytes and write_bytes are actual I/O to storage.
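As a rough illustration of reading those two counters, here is a minimal sketch in C. It is not the nmon 15g code: the `proc_io` struct and the `parse_io_line` and `read_proc_io` names are hypothetical, and the only things taken from this thread are the /proc/PID/io path and the `name: value` layout shown above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for the two storage counters; not nmon's own struct. */
struct proc_io {
    unsigned long long read_bytes;
    unsigned long long write_bytes;
};

/* Parse one "name: value" line.
 * Returns 1 if it set read_bytes, 2 if it set write_bytes, 0 otherwise. */
int parse_io_line(const char *line, struct proc_io *out)
{
    unsigned long long v;

    assert(line != NULL && out != NULL);
    /* sscanf matches from the start of the line, so
     * "cancelled_write_bytes" does not match "write_bytes". */
    if (sscanf(line, "read_bytes: %llu", &v) == 1) {
        out->read_bytes = v;
        return 1;
    }
    if (sscanf(line, "write_bytes: %llu", &v) == 1) {
        out->write_bytes = v;
        return 2;
    }
    return 0;
}

/* Read /proc/<pid>/io. Returns 0 on success, -1 if the file cannot be
 * opened (for example, a non-root user asking about another user's
 * process) or if either counter is missing. */
int read_proc_io(int pid, struct proc_io *out)
{
    char path[64];
    char line[128];
    int found = 0;
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%d/io", pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL)
        found |= parse_io_line(line, out);
    fclose(fp);
    return found == 3 ? 0 : -1;
}
```

The -1 return is where a caller would decide how to mark the process as having no I/O data.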

    I have added this to the code for nmon 15g and am testing it.
    If you are not the root user, it says so on the screen.
    If you are root, then on selecting top processes and 5, it puts each process's read and write KB in the Faults columns and re-orders the top process list.
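The re-ordering step could look something like the sketch below. This is an assumption about the general shape, not the actual nmon change: `top_row`, `interval_kb`, `by_io_desc` and `sort_by_io` are made-up names, and nmon's real topper structure differs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-process row; nmon's real topper structure differs. */
struct top_row {
    int pid;
    double io_kb;   /* read + write KB over the last interval */
};

/* KB transferred between two samples of a byte counter. */
double interval_kb(unsigned long long prev, unsigned long long now)
{
    /* Counters normally only grow; treat a smaller value
     * (for example, a reused PID) as no I/O. */
    if (now < prev)
        return 0.0;
    return (double)(now - prev) / 1024.0;
}

/* qsort comparator: largest io_kb first. */
int by_io_desc(const void *a, const void *b)
{
    const struct top_row *x = a;
    const struct top_row *y = b;

    if (x->io_kb < y->io_kb) return 1;
    if (x->io_kb > y->io_kb) return -1;
    return 0;
}

/* Re-order the list so the busiest I/O processes come first. */
void sort_by_io(struct top_row *rows, size_t n)
{
    assert(rows != NULL);
    qsort(rows, n, sizeof rows[0], by_io_desc);
}
```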

     
  • David Braun

    David Braun - 2015-07-15

    I mixed up my tickets. See https://sourceforge.net/p/nmon/bugs/6/#4e37/35f6 for the original responses.

    a) Top Process Mode 5 is to order processes based on I/O, but if not root most of the processes have no I/O stats so the ordering fails - I don't see the point of having "---" when the order is random. For memory, I think it stays in the previous order. This is confusing misinformation. Out of the popular distros, only Debian/Ubuntu encourages non-root users. Most servers don't have regular users, only system admin people as root.

    I don't have a lot of experience with other distros so I'll leave that to you. I had the choice of setting the I/O stats to 0 or -1 if /proc/xxx/io was unreadable. Either one yields a value of 0 for the interval and sorts to the bottom of the list, so all the interesting numbers are at the top and presumably displayed. Displaying either "---" or a nonsense value (e.g. -1) lets the user know something is fishy about the statistic. Another choice is to skip the process, but the other numbers are still useful so .... Maybe a disclaimer in the banner ("-1 indicates unavailable data").
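To make the "either one yields 0 for the interval" point concrete, here is a small sketch. The `IO_UNAVAILABLE` sentinel and the three helper names are hypothetical and are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical sentinel stored when /proc/PID/io cannot be read. */
#define IO_UNAVAILABLE (-1LL)

/* Value to store for one sample: the real counter, or the sentinel. */
long long io_sample(int readable, long long bytes)
{
    assert(!readable || bytes >= 0);   /* real counters are never negative */
    return readable ? bytes : IO_UNAVAILABLE;
}

/* Plain subtraction: two sentinel samples cancel to 0, as do two zeros. */
long long io_delta(long long prev, long long now)
{
    return now - prev;
}

/* Lets the display code tell "no data" apart from a genuine 0. */
int io_known(long long sample)
{
    return sample != IO_UNAVAILABLE;
}
```

With -1 stored, the interval value still sorts with the idle processes, but the raw sample keeps the information that the data was never available.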

    c) Reporting zero process I/O if not root user - argh! This is definitely misinformation.
    This sort of thing generates loads of complaints and confusion. In the past I have used -1, which gets complaints but at least it can be explained. Changing all the code to output stats as either numbers or, on error, strings would triple the code size and soak up CPU cycles, so there is no easy answer.

    I get your point and mostly agree. It escapes me why the file has such restrictive permissions.

    IMHO displaying "0" is worse than displaying "-1". Displaying "---" or "N/A" is more informative but requires fooling with print formats. The patch I sent changed the number/string mixed fields to '%s' in the print formats and converted the numbers to strings with snprintf(...). Using more format strings would be faster at run time but not by a whole lot. Maintaining yet another set of print formats seemed a bit annoying. I guess it's a matter of picking your poison.
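For readers who have not seen the patch, the string-conversion approach described above might look roughly like this. It is a sketch under assumptions: `io_field` and `format_row` are illustrative names, the column widths are arbitrary, and a negative value is assumed to mean "unavailable".

```c
#include <assert.h>
#include <stdio.h>

/* Render a KB value into buf, or "---" when the value is negative
 * (assumed here to mean the data was unavailable). */
const char *io_field(char *buf, size_t len, double kb)
{
    assert(buf != NULL && len > 0);
    if (kb < 0)
        snprintf(buf, len, "---");
    else
        snprintf(buf, len, "%.0f", kb);
    return buf;
}

/* One '%s'-based format string covers both the numeric and "---" cases.
 * Returns what snprintf returns: the length of the formatted row. */
int format_row(char *out, size_t len, int pid, double read_kb, double write_kb)
{
    char r[16], w[16];

    return snprintf(out, len, "%7d %8s %8s", pid,
                    io_field(r, sizeof r, read_kb),
                    io_field(w, sizeof w, write_kb));
}
```

The cost is one extra snprintf per mixed field per row, which is the run-time trade-off mentioned above against maintaining a second set of print formats.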

    Again - thanks for your tool. Nice job.

     
  • Nigel Griffiths

    Nigel Griffiths - 2016-01-07
    • status: open --> closed
     
