#57 Need looping process monitor

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2004-04-25
Created: 2004-04-25
Creator: Roland Pope
Private: No

In the 'System and server status' module, it would be
good to be able to monitor processes that have
continuously used more than a specified
percentage of the CPU for a defined period of time.

For example, if pid 9999 reports more than 90% processor
utilisation over a 20-minute period, then an alert
is triggered.
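
The check described above could be sketched roughly as follows. This is only an illustration of the idea, not the attached monitor: the class name `LoopDetector`, its fields, and the `samples` mapping (pid to current CPU percent) are all hypothetical, and the caller is assumed to gather CPU figures itself at each polling interval.

```python
from dataclasses import dataclass, field


@dataclass
class LoopDetector:
    """Tracks how long each pid has stayed above a CPU threshold (hypothetical helper)."""
    cpu_threshold: float   # percent, e.g. 90.0
    time_threshold: float  # seconds, e.g. 20 * 60
    first_seen: dict = field(default_factory=dict)  # pid -> time it first exceeded the threshold

    def update(self, now, samples):
        """samples maps pid -> current CPU percent; returns the pids that should alert."""
        # Forget pids that have exited or dropped back to or below the threshold.
        for pid in list(self.first_seen):
            if samples.get(pid, 0.0) <= self.cpu_threshold:
                del self.first_seen[pid]

        alerts = []
        for pid, cpu in samples.items():
            if cpu > self.cpu_threshold:
                # Remember when this pid was first seen over the threshold.
                start = self.first_seen.setdefault(pid, now)
                if now - start >= self.time_threshold:
                    alerts.append(pid)
        return sorted(alerts)
```

A process only alerts once it has been over the threshold at every poll for the whole period; a single poll under the threshold resets its timer.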

Discussion

  • Roland Pope

    Roland Pope - 2004-04-25

    Logged In: YES
    user_id=635054

    In addition, it may be handy to be able to narrow the check
    down to a particular process name, or to all processes EXCEPT a
    particular name (e.g. only look for looping sendmail processes,
    OR don't count ldap processes).
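
A minimal sketch of that match / don't-match filter, assuming a simple substring test against the command line. The function name `select_processes` and the `(pid, command)` pair format are hypothetical, chosen only for illustration.

```python
def select_processes(processes, name=None, exclude=False):
    """Filter (pid, command) pairs by process name.

    With no name, every process is kept. With a name, keep the processes
    whose command contains it, or those that do not when exclude is True.
    """
    if not name:
        return list(processes)
    return [(pid, cmd) for pid, cmd in processes if (name in cmd) != exclude]
```

For example, `select_processes(procs, "sendmail")` would keep only sendmail processes, while `select_processes(procs, "ldap", exclude=True)` would keep everything except ldap processes.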

     
  • Roland Pope

    Roland Pope - 2004-04-26

    Logged In: YES
    user_id=635054

    I have attached a monitor which will hopefully achieve the
    desired result. Perhaps it could be included in the status
    module?
    The relevant text values from the lang/en file are included here:

    <snip lang/en>
    proccpu_pid=These process(s) have used more than $1% cpu
    for more than $2 seconds $3
    proccpu_cmd=Optional process name
    proccpu_not=Include process which
    proccpu_not0=Match
    proccpu_not1=Don't match
    proccpu_cputhresh=Processor percentage threshold
    proccpu_timethresh=Time period threshold in seconds
    proccpu_ecputhresh=Missing or invalid threshold percentage
    proccpu_etimethresh=Missing or invalid time threshold
    </snip>
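
To illustrate how a string such as proccpu_pid might be filled in, here is a small sketch that assumes $1, $2 and $3 are positional placeholders (CPU threshold, time threshold, and presumably the list of offending processes). The helper `fill_message` is hypothetical and is not the module's own substitution routine.

```python
import re


def fill_message(template, *args):
    """Replace $1, $2, ... in template with the corresponding positional argument."""
    def repl(match):
        idx = int(match.group(1)) - 1
        # Leave the placeholder untouched if no argument was supplied for it.
        if 0 <= idx < len(args):
            return str(args[idx])
        return match.group(0)

    return re.sub(r"\$(\d+)", repl, template)
```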

     
  • Roland Pope

    Roland Pope - 2004-04-26

    Logged In: YES
    user_id=635054

    Mmm, this monitor might need to be modified to deal with multiple
    instances of the monitor with different parameters,
    i.e. one instance looking for 'ldap' processes using more than
    50% cpu for 10 minutes and another looking for any process
    using more than 80% cpu for 30 minutes.
    Perhaps creating a unique file for each monitor would solve
    this problem, but we would probably want to clean up this file
    if the monitor instance was deleted. Maybe a hook to call an
    optional entry point in a monitor script when it is deleted
    would help with this?
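
The per-instance file and delete hook could look something like the sketch below. Everything here is an assumption made for illustration: the file naming scheme, the JSON format, and the function names `state_path`, `load_state`, `save_state` and `on_delete` are hypothetical and do not come from the attached script.

```python
import json
import os


def state_path(state_dir, monitor_id):
    """One state file per monitor instance, so instances never share data."""
    return os.path.join(state_dir, f"proccpu-{monitor_id}.json")


def load_state(state_dir, monitor_id):
    """Return the saved state for this instance, or an empty dict if none exists."""
    try:
        with open(state_path(state_dir, monitor_id)) as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def save_state(state_dir, monitor_id, state):
    """Write this instance's state (keys should be strings, since JSON requires it)."""
    with open(state_path(state_dir, monitor_id), "w") as fh:
        json.dump(state, fh)


def on_delete(state_dir, monitor_id):
    """Hook to call when a monitor instance is deleted: remove its state file."""
    try:
        os.remove(state_path(state_dir, monitor_id))
    except FileNotFoundError:
        pass  # Nothing to clean up.
```

Keying the file name on the monitor's identifier keeps the 'ldap' instance and the 'any process' instance from overwriting each other's timers, and the delete hook stops stale files from accumulating.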

     
  • Roland Pope

    Roland Pope - 2004-05-20

    Logged In: YES
    user_id=635054

    I have attached a new version that allows you to select
    processes which do or don't match a given name, makes
    the monitor's process list file unique so that you can have
    multiple instances of the monitor, and allows you to terminate
    processes which may be looping according to the thresholds.
    I also modified save_mon.cgi to call a delete and a save
    subroutine in the script (see the last two subroutines in the attached
    file), to provide a way of cleaning up these files when you
    change or delete a monitor.
    I have also added two new English strings.
    The new lang/en section is as follows:
    <snip>
    proccpu_pid=These process(s) have used more than $1% cpu
    for more than $2 seconds $3
    proccpu_cmd=Optional process name
    proccpu_not=Include processes which
    proccpu_not0=Match
    proccpu_not1=Don't match
    proccpu_cputhresh=Processor percentage threshold
    proccpu_timethresh=Time period threshold in seconds
    proccpu_ecputhresh=Missing or invalid threshold percentage
    proccpu_etimethresh=Missing or invalid time threshold
    proccpu_kill=Kill processes
    proccpu_kill0=no
    proccpu_kill1=yes
    </snip>
    Any chance of getting this monitor integrated into the system
    status module?
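
The optional kill step mentioned in this comment might be sketched as below. This is an assumption-laden illustration rather than the attached script's behaviour: the function name `terminate_processes` is hypothetical, the choice of SIGTERM is a guess, and the `kill` parameter exists only so the function can be exercised without signalling real processes.

```python
import os
import signal


def terminate_processes(pids, kill=os.kill, sig=signal.SIGTERM):
    """Send sig to each pid and return the pids that were signalled successfully."""
    killed = []
    for pid in pids:
        try:
            kill(pid, sig)
            killed.append(pid)
        except (ProcessLookupError, PermissionError):
            # The process already exited, or we are not allowed to signal it.
            continue
    return killed
```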

     
  • Roland Pope

    Roland Pope - 2004-05-20

    Processor hog monitor script latest version

     
  • Roland Pope

    Roland Pope - 2005-02-09

    Logged In: YES
    user_id=635054

    Jamie, do you have any comments about this? Are you open
    to code submissions into Webmin?