[Mon-commit] mon/doc CHANGES.mon.cgi,1.1,1.2 README.mon.cgi,1.1,1.2 README.snmpvar.monitor,1.1,1.2 R
From: David N. <vi...@us...> - 2004-11-15 14:45:30
Update of /cvsroot/mon/mon/doc
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv9218/doc

Modified Files:
    README.cgi-bin mon.8
Added Files:
    CHANGES.mon.cgi README.mon.cgi README.snmpvar.monitor
    README.syslog.monitor
Log Message:
Pulling lots of changes from the 1.0.0pre* branch into the HEAD, to prepare to tag mon-1.1pre1

--- NEW FILE: README.snmpvar.monitor ---
snmpvar.monitor by P.Holzleitner

What does it do?

snmpvar.monitor is a plug-in for the "mon" systems monitoring package written by Jim Trocki (http://www.kernel.org/software/mon). Called by mon, it queries freely configurable values using SNMP, compares them against specified limits and reports any violation.

Some parameters that can be monitored (just to give you an idea):

    Equipment operational status (temperature, fan rotation)
    UPS Status (line power / battery, minimum line voltage, load % ...)
    Switch/Router status (interface up, BGP session up, ...)
    Server status (redundant power supply OK, disk array OK, ...)
    Status of services (process running, mail queue length, ...)

License

GNU GPLv2 (http://www.fsf.org/licenses/gpl.txt) - See file COPYING

Quick Start:

* Make sure you have UCD SNMP 3.6.2+ (libraries) and the Perl SNMP module installed (http://www.cpan.org/misc/cpan-faq.html)
* Copy snmpvar.monitor to your mon.d directory
* Copy snmpvar.def to /etc/mon, add your own variables
* Copy snmpvar.cf to /etc/mon and edit to match your needs
* Test from mon.d directory with ./snmpvar.monitor -l host1 host2 ...
* Test again from mon.d directory with ./snmpvar.monitor host1 host2 ...
* Add watch/service to mon.cf, using snmpvar.monitor

Commandline options:

    --varconf=/path/to/snmpvar.def         if neither /etc/mon nor /usr/lib/mon/etc
    --config=/path/to/snmpvar.cf           if neither /etc/mon nor /usr/lib/mon/etc
    --community=your_SNMP_read_community   if not 'public'
    --groups=Power,Disks                   test only a subset of variables for a host group
    --timeout=n                            SNMP GET timeout in seconds
    --retries=n                            number of times to retry the SNMP GET
    --debug                                tell what config is being used
    --mibs='mib1:mib2:mibn'                load specified MIBs
    --list[=linesperpage]                  produce human-readable listing, not alarms

For every host name passed on the command line, snmpvar.monitor looks up the list of variables and corresponding limits in the configuration file (snmpvar.cf).

If a --groups option is present, only those variables are checked which are in one of the specified groups. To specify more than one group, separate group names with commas. You can also exclude groups by prefixing the group name(s) with '-'. Don't mix inclusion and exclusion.

Examples:

    --groups=Power          only vars in the Power group
    --groups=Power,Env      vars in the Power or Env group
    --groups=-Power,-Env    all vars except those in Power or Env groups
    --groups=Power,-Env     won't work (only the exclusions)

For every such variable, it looks up the OID, description etc. from the variable definition file (snmpvar.def).

This monitor looks for configuration files in the current directory, in /etc/mon and /usr/lib/mon/etc. Command line option --varconf overrides the location of the variable definition file, option --config sets the configuration file name.

When invoked with the --list option, the output format is changed into a more human-readable form used to check and troubleshoot the configuration. This option must not be used from within MON.

Exit values:

    0 if everything is OK
    1 if any observed value is outside the specified interval
    2 in case of an SNMP error (e.g.
no response from host)

Basic Troubleshooting:

    use snmpvar.monitor --list option to see variable values
    use snmpwalk your_hostname public .1 | less to verify SNMP agent

The snmpvar.def File:

In this file we define variables that can be retrieved via SNMP. In a way, the .def file is snmpvar.monitor's idea of a MIB.

Entries consist of a "Variable variable-name" declaration

    Variable PE4300_TEMP_MB

[NOTE: The variable name cannot be "Host" or "FriendlyName"]

followed by the mandatory specification of Object ID and Description:

    OID .1.3.6.1.4.1.674.10891.300.1.5.2.2.1.3
    Description Motherboard Temperature

It is suggested that OIDs be entered numerically as shown above in order to eliminate the need for having the SNMP libraries compile the relevant MIB files on every invocation of the monitor. By default, this monitor loads no MIBs. If you want to use symbolic OIDs, use the --mibs commandline option to specify which MIBs you need.

By the author's convention, an OID describing an array of values, like ifOperStat which takes the interface number as an index, is written with a trailing dot, while OIDs of scalars end in a number. As of version 1.1.1, the monitor will insert the dot before the index if you forgot it in the .def file.

Optional Elements of a Variable definition:

    DefaultIndex 3 4 5

A list of indices to test by default. Let's say the OID is .1.2.3. and DefaultIndex is "18 22 36", then the monitor will retrieve the values of .1.2.3.18, .1.2.3.22 and .1.2.3.36 when testing this variable, and will compare them all against the limits. Where necessary, the DefaultIndex can be overridden for one host/variable combination, using the Index statement in the .cf file.

    FriendlyName 3 Disk Fan 1

This lets you replace the standard display of "Variable [Index]", e.g. "Fan Speed [5]", with individual labels for each index. The FriendlyName option is typically specified in the .def file for items that have the same name for every use, e.g.
component names like in the case of fans, power supplies etc. The same option exists in the .cf file to name a particular variable on a particular host, e.g. to display a line name instead of an interface number on a router. If the FriendlyName string begins with "@", the Description is substituted for the "@".

    Scale / 10.0

A formula to re-scale the value returned from the host. The expression is appended to the raw value and the resulting expression is evaluated by Perl. The raw value is available as $rawval if necessary.

    Unit C

Used in value display / messages.

    Decode 1 unknown
    Decode 2 OK
    Decode 3 FAILURE

Values retrieved through SNMP are often enumerations of status codes. The Decode statement lets you put text labels on these values.

    DefaultGroup Environment

Defines that, by default, all instances of this variable go into the specified group. Individual overrides are possible in the .cf file.

    DefaultMin 300
    DefaultMax 2000
    DefaultEQ 1000
    DefaultNEQ 1000

Default alarm limits. See description of Min/Max/EQ/NEQ below.

The snmpvar.cf File:

In here, you "call up" the variables to be retrieved for a particular host. Entries consist of a "Host host-name" declaration followed by at least one "variable-name [options ...]" line.

    Host ntserv1

This hostname corresponds to the hostname on the command line, i.e. the hostname you used in MON's hostgroup statement.

    FOO_FAN_RPM Min 1000 Max 5000 MaxValid 10000 Index 1 2 3 4

This example uses almost all options. It instructs the monitor to retrieve the OID specified under "FOO_FAN_RPM" in the .def file.

    Min 300     specifies a minimum value, measured >= minimum
    Max 2000    specifies a maximum value, measured <= maximum
    EQ 1000     specifies an exact value, measured == value
    NEQ 1000    specifies an excluded value, measured != value

If the measured value is outside of these limits, a failure is reported. To test for "Value = X", use "Min X Max X".

    MinValid -1 MaxValid 10000

Some monitoring hardware occasionally measures garbage.
To avoid triggering an alarm when this happens, you can use MinValid/MaxValid to specify the range (inclusive) of plausible values for this variable. If the measured value falls outside these limits, only a warning will be generated, but no failure will be reported to MON.

    Group Environment

Puts this particular variable into the specified group. Groups are used to test a partial set of the variables specified for a host, by using the --groups= command line option.

    Index 1 2 3

This tells the monitor which object instances (array elements) to test in case of a non-scalar object. Since the list of indices can be as long as necessary, the Index option must be the last one on the line (after Min X, Max Y etc.) The list specified as DefaultIndex in the .def file entry for this variable is used unless Index is specified here.

When retrieving a non-scalar value, the snmpvar.monitor will normally display the instances (array elements) by appending their index to the description, as in "Line Status [3]". Often, it is desirable to label individual instances in a more mnemonic way. To do this, you can add a number of FriendlyName directives after a variable request, like this:

    Host firewall
    IF_OPERSTAT Index 1 2 3
    FriendlyName 1 1: Leased Line
    FriendlyName 2 2: DMZ
    FriendlyName 3 3: Internal Router

In this case, the monitor checks the ifOperStat for interfaces 1, 2, and 3 on host "firewall". If interface 3 were not "up", the monitor would signal a failure of "Internal Router" instead of "ifOperStat [3]". If the FriendlyName string begins with "@", the Description is substituted for the "@".

If all instances of this variable having the same index have the same meaning regardless of what host they are on, you can put the FriendlyName statement into the respective variable definition in the .def file instead.

The snmpopt.cf File:

This optional file is used to pass parameters to the SNMP library.
For SNMPv1, this is generally not necessary unless the target's SNMP port differs from the default (161). Note that SNMPv1 community string, timeout and retries can also be specified on the snmpvar.monitor command line, overriding any default or configuration file setting. You will need to edit this file in order to use SNMPv3.

Index: README.cgi-bin
===================================================================
RCS file: /cvsroot/mon/mon/doc/README.cgi-bin,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -C2 -d -r1.1.1.1 -r1.2
*** README.cgi-bin 9 Jun 2004 05:18:06 -0000 1.1.1.1
--- README.cgi-bin 15 Nov 2004 14:45:18 -0000 1.2
***************
*** 3,15 ****
  mon.cgi
  -------
! mon.cgi used to be a part of the mon distribution, but it is now
! maintained by Andrew Ryan <an...@na...>. The latest
! release can be found at
!
! http://www.nam-shub.com/files/
!
! or
!
! ftp://ftp.kernel.org/pub/software/admin/mon/contrib/

  minotaur
--- 3,8 ----
  mon.cgi
  -------
! mon.cgi is the more advanced web interface to mon, maintained by
! Andrew Ryan <an...@na...>.

  minotaur

--- NEW FILE: README.syslog.monitor ---
Readme file for syslog.monitor
$Id: README.syslog.monitor,v 1.2 2004/11/15 14:45:18 vitroth Exp $

(Note: This Readme file is an insult to the reader. Better documentation will come as soon as I find more time and fix some more bugs)

INTRODUCTION

This is a syslog monitor for mon (http://www.kernel.org/software/mon/) by Jim Trocki. It is different from the other monitors, because it is constantly running and communicates with the mon server via Mon::Client over the network, instead of running under mon's supervision.

It listens for syslog packets coming in from the network, parses them, checks them against a rule set and reports to the mon server if necessary.

REQUIREMENTS

You need to have the following non-std Perl modules installed:

    Time::HiRes
    Mon::Client

DETAILS

syslog.monitor accepts a single command line parameter, the name of the configuration file.
All options are explained inside the configuration file, see syslog.conf as an example.

At startup, the daemon retrieves a list of all watches from the mon server for which a service "syslog" is defined. We also read the hostgroup definition for this watch from the mon server. (The hostnames are resolved and the result is used to check if the incoming syslog packet is accepted and which host it came from, so you should make sure your hostnames resolve to all IPs from which your systems might send a syslog packet - on a Cisco, you might want to consider "logging source-interface")

This basically amounts to:

For every hostgroup for which you want syslog.monitor to accept and monitor syslog packets, define a syslog service. This watch/service is where we later send our traps.

For those hosts, add a line like

    *.* @syslog.monitor.host.name

to /etc/syslog.conf.

Configure syslog.monitor by editing syslog.conf and following the comments therein.

Start syslog.monitor.

Restart mon.

killall -HUP syslogd on the hosts you want to monitor.

Read the logfiles and fix the problems. ;-)

AUTHOR

Please don't bother Jim with questions relating to this. If this should lead to global warming, code freeze or Elvis's revival, I accept absolutely no responsibility. However, I will gladly receive and incorporate bugfixes and sensible bug reports.

Lars Marowsky-Brée <la...@ma...>

URL

It appears we have made our way to ftp://ftp.kernel.org/pub/software/mon/contrib/ - please use a mirror, as described on http://www.kernel.org/.

--- NEW FILE: CHANGES.mon.cgi ---
mon.cgi v1.52 21-May-2001
-------------------------
+ added check for sufficient Mon::Client version
+ added optional "watch" keyword to config file that allows users to see only the groups they are configured to be allowed to see, by regex.
+ added optional keyword "show_watch_strict" that, when set to "yes", will enforce watch keywords strictly, and not allow the mon.cgi user to see any detail about any other hostgroup.
+ query_groups added summary/ack information to failed services
+ query_groups: now prints red or yellow as appropriate, instead of just red, for failed services.
+ added "log in" link to mon.cgi base page
+ moncgi_get_params: Fixed bug with null values of $monhost and $monport getting through.
+ fixed moncgi_reset bug - keepstate & no-keepstate are reversed
+ moncgi_authform: passwd dialog is cleared after unsuccessful password entry.
+ new function: moncgi_login - allow user to log in prior to having to execute a privileged action.
+ new config parameter: logo_link. logo_link is a URI that will be linked to the logo picture, if logo is defined.
+ New function: can_show_group(groupname), to test if a group can be shown according to the "watch" directives.
+ The following functions were updated to reflect the new watch keyword access control routines : list_alerthist, list_dtlog, query_group, list_disabled, svc_details, mon_test_service, moncgi_test_all, mon_enable, mon_disable, mon_ack
+ fixed numerous warnings, did some code cleanup and improved comments.
+ Fixed another mod_perl bug in monhost/monport parsing
+ Updated moncgi-appsecret.pl, in the util directory, to reflect new code.

mon.cgi v1.51 22-Mar-2001
-------------------------
+ Fixed taint-checking problem with monhost and monport args (Mon::Client was complaining under TaintMode/-T).

mon.cgi v1.50 15-Mar-2001
-------------------------
+ Config file parsing support was not working properly. This has been fixed, and a new subroutine was introduced: initialize_config_globals.

mon.cgi v1.49 14-Mar-2001
-------------------------
+ Add test_config option on main menu bar (new 0.38.21 command)
+ change reset to single button, with follow-up page, giving two choices -- reset keepstate and reset.
+ new function - moncgi_reset to allow users to choose which type of reset they would like to execute.
+ Patch from Ed Ravin (er...@pa...)
to accommodate a site-specific custom toolbar row and site-specific menu commands.
+ added an optional config file that lets users specify their own mon.cgi parameters.
+ added TVA color scheme to the distro (from tb...@tv...)
+ Use HTML::Entities to escape HTML submitted as ack messages, avoiding cross-site scripting attacks/javascript and ensure proper encoding of characters entered as ack messages. HTML scrubbing can be skipped by setting the variable untaint_ack_msgs to "no".
+ remove all <pre>'s and replace with <font face="$fixed_font_face">. Important messages were often getting cut off the screen by the use of <pre>.
+ make $monhost and $monport optional CGI params as 'h' and 'p' respectively
+ added "test service" and "test-all" to query_group page

mon.cgi v1.48 01-Dec-2000
-------------------------
+ Have ability to do mass disabling/enabling of hosts and services in hostgroup.
+ query_group: have radio button for enabled/disabled status (facilitates mass en/disabling)
+ query_group: added a table to show services for that group, enabled/disabled with radio button.
+ query_group: now includes service status on this page
+ query_group: mass dis/enabling of svcs requires a new function, mon_state_change
+ svc_details: widened the table
+ main: Command matching changed to use exact matches instead of regex matches (duh).
+ main: fix bug with Revision tag in $VERSION
+ list_disabled: Also added mass disabling
+ mon_state_change_enable_only: new function to support list_disabled mass re-enabling.
+ list_pids: cleaned up function and formatting
+ added mon_state_change function for mass state changing
+ added mon_list_opstatus function
+ query_opstatus: moved legend to below main table
+ query_opstatus: changed legend to use bgcolor instead of font color
+ query_opstatus: ack message is now included in summary
+ query_opstatus: increased main table width to 100%
+ query_opstatus: can now test svcs from this page
+ ability to do multiple tests at the same time for a single hostgroup
+ moncgi_test_all: new function to test all svcs in group
+ Ran mon.cgi through 'tidy' (http://www.w3.org/People/Raggett/tidy/) for improved HTML compliance. Most common pages are OK now (I think) except for table summary attributes. I'll get to them eventually.
+ added last_ok time for failed services in "Last Check" column
+ color of UNCHECKED services is now midnight blue by default, unchecked services are now readable in the default color scheme!

mon.cgi v1.46 20-Aug 2000
-------------------------
+ Fixed bug in list_dtlog that would show min and max failure time as "-1" seconds if no failures had been seen on that service. Also the table is now not printed at all instead of being a 0-row table.
+ Made it easier for users to get themselves out of the situation where they enter in a valid username and an invalid password.
+ Made the summary info MUCH easier to see when a service is in the failure state.
+ alert_details is now "svc_details", a much more descriptive name, since it shows success as well as failure details.
+ svc_details [nee alert_details] got a little bit of a cleanup (not much).
+ list_dtlog now has a configurable maximum number of entries per page that it will display, defaults at 100. Large downtime logs would not render well in most browsers, and would not render at all with Netscape's table drawing algorithm.
+ Added optional $monport argument, in case you don't run mon on port 2583.
+ Trap watches are now correctly handled and printed (thanks to Ed Ravin <er...@pa...> for the bug report and fix).
+ Fixed bug in pp_sec that would cause "1 days" to be printed out instead of "1 day".

mon.cgi v1.45 05-Jun 2000
-------------------------
+ query_opstatus: Built an "amber level" alert for services that have failed but never issued an alert
+ query_opstatus: Changed "Last Checked" and "Est. Next Check" times to be deltas instead of absolute times, both relative to servertime and not localtime.
+ Added ACK (and re-ack) feature
+ query_opstatus: Added additional visual warnings if scheduler is not running or cannot be contacted.
+ Changed default app secret
+ Button bar at top of each page is cleaner
+ Fixed bug with scheduler falsely claiming to be stopped if you try to stop the scheduler and aren't authenticated, or if the server is not running.
+ Fixed bug where multiple auth failures are displayed if a user is not authenticated (should only notify once)
+ Made it easier to not hit "reset server" button accidentally
+ Made font on ONDS check times size -1
+ Show the downtime log as an option on query_group
+ Fixed "test immediately" stuff so it tests and then shows right status
+ list_opstatus: hostgroup column no longer goes white if svc is unchecked
+ alert_details is MUCH spiffier
+ alert_details now checks to see if a monitor for that service/group is currently running, and as such, the status reported is subject to change very soon.
+ Added more descriptive text to service status table in alert_details.
+ Changed default return screen on enable_service to be alert_details if that's where the user last came from.
+ Added new 0.38-18 data types for alert_details
+ list_dtlog: Display median in addition to mean failure time to lessen effects of downtime outliers.
+ Added a Refresh button on alert_details page
+ Cleaned up the list_disabled function
+ Got rid of backwards() function, unused relic from old mon.cgi
+ Fixed the META REFRESH tags so that it works on all browsers (put it in the header where it belongs) and handles more cases (alert_details, test_service)
+ Started using servertime in places instead of time on local web server
+ Visual enhancements for this version submitted by Brian Doherty <bdo...@ma...>
+ Fixed a bug in the "failure-free operation %" calculation if you had an extremely large number of failures in a time period, % could show up as negative.

mon.cgi v1.38 18-Feb 2000
-------------------------
+ MAJOR speedup, only use one Mon connection per page view. Pages typically load 2-3x faster.
+ list_opstatus in Summary mode is now more brief. All "OK, Non-Disabled Services" (ONDS) for any given hostgroup are now aggregated in a single line. If you monitor a lot of services on each of your host groups, this will save you a lot of screen real estate. Services which are disabled and/or failing are still broken out individually.
+ added FAILED flag to Status box, moved DISABLED flag, so mon.cgi works with Lynx & w3m or any other text browser that supports tables (only Lynx and w3m tested, looks great with w3m by the way).
+ changed default path of cookie to "/" to avoid lynx complaining about "invalid cookie path".
+ changed alert_details to use a table, include "view downtime log"
+ on query_group page, turn box gray if host is disabled.
+ fixed a div0 bug if you have no entries in your dtlog and ask to view it
+ changed disabled host in query_group to sort alpha even when hosts are disabled.
+ alert_details function now auto-detects failure/success, doesn't need to be told which one to look for ("test service immediately" would show inconsistent results from this behavior, since it is impossible to know the results of a test before you run it!)
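
As a rough illustration of the v1.38 summary mode described above, the following Python sketch collapses all "OK, Non-Disabled Services" of a hostgroup into one row and keeps failed or disabled services as individual rows. It is not taken from mon.cgi (which is written in Perl); the function name summarize and the dictionary keys group, service, failed and disabled are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def summarize(services):
    """Build summary rows: one row per failed/disabled service,
    plus one aggregated row per hostgroup for everything that is
    OK and not disabled (the "ONDS" services).

    `services` is an iterable of dicts with the hypothetical keys
    group, service, failed and disabled.
    """
    onds = defaultdict(list)  # hostgroup -> names of OK, non-disabled services
    rows = []
    for svc in services:
        if svc["failed"] or svc["disabled"]:
            flags = []
            if svc["failed"]:
                flags.append("FAILED")
            if svc["disabled"]:
                flags.append("DISABLED")
            rows.append((svc["group"], svc["service"], " ".join(flags)))
        else:
            onds[svc["group"]].append(svc["service"])
    # One aggregated line per hostgroup for the healthy services.
    for group, names in onds.items():
        rows.append((group, ", ".join(sorted(names)), "OK"))
    return sorted(rows)


if __name__ == "__main__":
    demo = [
        {"group": "routers", "service": "ping", "failed": False, "disabled": False},
        {"group": "routers", "service": "snmp", "failed": False, "disabled": False},
        {"group": "routers", "service": "bgp", "failed": True, "disabled": False},
    ]
    for row in summarize(demo):
        print("%-10s %-20s %s" % row)
```

With many healthy services per hostgroup, the aggregated row is what saves the screen space mentioned in the changelog entry.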
mon.cgi v.1.35
--------------
+ Downtime log viewing/querying support.
+ Disabled services/hosts/watches now appear as gray-colored boxes on the main display screen. This makes it easier to see what is disabled.
+ Fixed loadstate and savestate bugs again. These commands now work.
+ I finally have sort of a release process, so hopefully my releases will not be littered with formatting code that is specific to my environment, and they will run fine out of the box when you get them.
+ Fixed a few routines to work with changing ways Mon::Client asks you to do things.
+ Also, if you are logged in as an authenticated user (not the "default user", if one is defined), your username will appear on each page, so you always know who you are authenticated as.
+ Added a logout button.
+ Added ability to do "reset keepstate" as well as "reset" from the web interface.
+ The command bar is now 2 lines instead of one. Even on my 21" monitor, 13 buttons was too much to have on 1 line (let alone my poor 800x600 laptop LCD!).
+ Mon::Client::test is broken in v0.7. To make it work in the way that mon.cgi expects it to, change line 1470 in Client.pm v0.7 from:
  > if ($what !~ /^alert|startupalert|upalert$/) {
  to
  < if ($what !~ /^monitor|alert|startupalert|upalert$/) {

mon.cgi 1.32.1.2 01-Feb 2000
----------------------------
+ Fixed loadstate and savestate to not be NOOPs.
+ Established a "default" user for when authentication was required but you don't want to make users log in just to list status.
+ Along with the default user, there is also now a "switch user" feature that offers the user the chance to re-authenticate to a user of higher privilege if they are denied the running of a command due to a lack of authorization.
+ Fixed HTML bugs with hardcoded colors in font and table tags scattered throughout code (patch courtesy of Martha H Greenberg <ma...@MI...>, thanks!). This makes it possible to run mon.cgi in colors other than the default scheme.
mon.cgi users take note however, testing color schemes is not part of my QA process (such as it is) and so if you find something broken, let me know and I'll fix it. Index: mon.8 =================================================================== RCS file: /cvsroot/mon/mon/doc/mon.8,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** mon.8 14 Jun 2004 11:08:21 -0000 1.2 --- mon.8 15 Nov 2004 14:45:18 -0000 1.3 *************** *** 329,332 **** --- 329,339 ---- global configuration variable. + .TP + .B MON_CFBASEDIR + The directory where configuration files should be kept, + as indicated by the + .I cfbasedir + global configuration variable. + .P "fping.monitor" should return an exit status of 0 if it *************** *** 371,375 **** unless .B no_comp_alerts ! is defined in the period section. If an alert was already sent within the last .B alertevery --- 378,383 ---- unless .B no_comp_alerts ! is defined in the period section. An upalert will only be sent ! if the previous state is a failure. If an alert was already sent within the last .B alertevery *************** *** 377,381 **** .I unless the summary output from the current monitor program differs from the last ! monitor process. Otherwise, send an alert using each alert program listed for that period. The .B "observe_detail" --- 385,390 ---- .I unless the summary output from the current monitor program differs from the last ! monitor process. ! Otherwise, send an alert using each alert program listed for that period. The .B "observe_detail" *************** *** 390,393 **** --- 399,409 ---- The reasoning is that if the summary output changes, then a significant event occurred and the user should be alerted. + The "strict" argument to alertevery will suppress both + comparing the output from the previous monitor run to the current + and prevent a successful return value of the monitor from + resetting the alertevery timer. 
For example, "alertevery 24h strict" + will only send out an alert once every 24 hours, regardless of + whether the monitor output changes, or if the service stops and then + starts failing. .SH ALERT\ PROGRAMS *************** *** 511,514 **** --- 527,537 ---- global configuration variable. + .TP + .B MON_CFBASEDIR + The directory where configuration files should be kept, + as indicated by the + .I cfbasedir + global configuration variable. + .P The first line from standard input must be used as a brief summary *************** *** 684,691 **** .TP - .BI "snmpport = " portnum - Set the SNMP port that the server binds to. - - .TP .BI "serverbind = " addr --- 707,710 ---- *************** *** 702,709 **** .TP - .BI "snmp =" {yes|no} - Turn on/off SNMP support (currently unimplemented). - - .TP .BI "dtlogfile = " file .I file --- 721,724 ---- *************** *** 949,952 **** --- 964,969 ---- .B service followed by a word which is the tag for this service. + This word must be unique among all services defined for the + same watch group. The components of a service are an interval, monitor, and *************** *** 1187,1191 **** The .B period ! keyword has two forms. The first takes an argument which is a period specification from Patrick Ryan's --- 1204,1208 ---- The .B period ! definition has two forms. The first takes an argument which is a period specification from Patrick Ryan's *************** *** 1207,1212 **** parameters. .TP ! .BI alertevery " timeval [observe_detail]" The .B alertevery --- 1224,1235 ---- parameters. + Period definitions, in either the first or second form, must be unique within + each service definition. For example, if you need to define two + periods both for "wd {Sun-Sat}", then one or both of the period definitions + must specify a label such as "period t1: wd {Sun-Sat}" and + "period t2: wd {Sun-Sat}". + .TP ! 
.BI alertevery " timeval [observe_detail | strict]" The .B alertevery *************** *** 1230,1234 **** "observe_detail" is the last argument, then both the summary and detail output lines will be considered when comparing the ! output of successive failures. Please refer to the .B "ALERT DECISION LOGIC" section for a detailed explanation of how alerts are suppressed. --- 1253,1262 ---- "observe_detail" is the last argument, then both the summary and detail output lines will be considered when comparing the ! output of successive failures. ! If the string "strict" is the last argument, then the output ! of the monitor or the state change of the service will have ! no effect on when alerts are sent. That is, "alertevery 24h strict" ! will send only one alert every 24 hours, no matter what. ! Please refer to the .B "ALERT DECISION LOGIC" section for a detailed explanation of how alerts are suppressed. --- NEW FILE: README.mon.cgi --- Introduction to mon.cgi -------------------------------------------------------- This interface, along with mon itself, is available from ftp://ftp.kernel.org/pub/software/admin/mon/ Development versions of mon.cgi can be found at http://www.nam-shub.com/files/ -------------------------------------------------------- mon.cgi is a web-based GUI for mon. Its purpose is twofold: 1) To provide an easy-to-read visual display of all the status items that mon keeps track of, and 2) To provide an easy-to-use web administration interface to allow users to perform all mon administration tasks from any web browser. This package and the documentation assumes that you have at least a basic familiarity with mon. 
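
As a rough illustration of the alertevery behaviour described in the mon.8 changes earlier in this message, the following Python sketch models how the summary comparison and the "strict" argument interact. It is only a sketch under stated assumptions: the names AlertState, should_alert and on_success are hypothetical and do not come from mon's source, and the real scheduler handles more cases (periods, upalerts, observe_detail) than are shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    """Hypothetical per-period bookkeeping for alert suppression."""
    last_alert_time: Optional[float] = None
    last_summary: Optional[str] = None


def should_alert(state, now, summary, alertevery, strict=False):
    """Decide whether a failure at time `now` should send an alert.

    Without strict, a changed summary line overrides the alertevery
    interval. With strict, the summary is ignored and only the
    interval counts.
    """
    within_interval = (
        state.last_alert_time is not None
        and now - state.last_alert_time < alertevery
    )
    summary_changed = summary != state.last_summary
    state.last_summary = summary
    if within_interval and (strict or not summary_changed):
        return False
    state.last_alert_time = now
    return True


def on_success(state, strict=False):
    """A successful monitor run resets the timer, unless strict is set."""
    if not strict:
        state.last_alert_time = None
        state.last_summary = None


if __name__ == "__main__":
    day = 24 * 3600
    s = AlertState()
    print(should_alert(s, 0, "host1", day, strict=True))         # first failure
    print(should_alert(s, 60, "host1 host2", day, strict=True))  # output changed
    on_success(s, strict=True)                                   # service recovers
    print(should_alert(s, 120, "host1", day, strict=True))       # fails again
```

In this model "alertevery 24h strict" sends at most one alert per 24 hours regardless of output changes or an intervening recovery, which is the behaviour the man page text describes.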
-----------------------------------------------------------------
mon.cgi v.1.52 21-May-2001
by Andrew Ryan <an...@na...>

This interface, along with mon itself, is available from ftp://ftp.kernel.org/pub/software/admin/mon/

Development versions of mon.cgi can be found at http://www.nam-shub.com/files/
-----------------------------------------------------------------

This is the latest stable version of mon.cgi, meant to be used only with mon 0.38-21 and above, and a version of Mon::Client that is 0.11 or higher. The chief reason that you will need the new version is for the "test config" functionality.

This release has 4 new features of note:

1) Access control. Using the 'watch' keyword in the config file, you can restrict access to a particular configuration on a per-hostgroup basis. 'watch' keywords can be regular expressions. Original idea and keyword name stolen from monshow :)

2) 'watch' keywords can either be implemented "softly" -- by default only certain hostgroups are shown, but all can be accessed -- or "strictly" -- only the hostgroups explicitly allowed by 'watch' keywords can be accessed in any way. Using strict access control, an organization using mon to watch systems belonging to multiple customers can segregate those different customers' monitoring completely.

3) There's now a login button. The people have spoken!

4) mon.cgi now checks for the proper version of Mon::Client before it starts. This was a major support problem.

Plus many other bug fixes and small improvements, as usual. This release should be considered stable until proven otherwise :)

Please see the CHANGES file for more information about this release.

Thanks to all who report bugs, submit patches, and give feedback.

Andrew Ryan <an...@na...>

Installing mon.cgi
------------------
Instructions for installing mon.cgi are located in the header of the mon.cgi file itself.
Roughly speaking, the order of events is:

1) Install mon and get it working, set up monpasswd and auth.cf files and get them verifiably working if you're using mon.cgi authentication (hint: you should be!).

2) Install a web server, preferably Apache, and preferably with mod_perl built in. Start the web server and verify that it works.

3) Put mon.cgi in your cgi-bin directory and make sure it is executable by the apache user (make it 0755 or 0555).

4) Edit your mon.cgi file to change default values to match your environment (e.g. contact email, your company logo, your company name, etc.).

5) If you're requiring users to log in (highly recommended), you must change the default app secret variable $app_secret in your copy of mon.cgi, and install the Crypt::TripleDES module from CPAN on the machine which will be running mon.cgi.

6) If you want to easily customize the look and feel of mon.cgi, as well as various other configuration options, copy the sample mon.cgi.cf file (in the /config directory of this distribution) into a location where your webserver can read it, and edit the line beginning '$moncgi_config_file = ""' to reflect the path to your config file. You can then change the look and feel of mon.cgi, as well as implement access controls, directly from this file.

mon.cgi Design Goals
--------------------
1) Provide 100% of the functionality of mon in a graphical user interface. Ideally, there will be some things that the GUI is better for, and inevitably, some things that the command line will always win out for.

2) Maintain 100% compatibility with mon and Mon::Client. If a patch to mon or Mon::Client is required to get a piece of mon.cgi functionality working, we write it, submit it, and get it folded in to the main distribution before making it official in mon.cgi.

3) Expose mon to the largest number of people possible in the most useful way.
It is the author's belief that mon is a very useful piece of
monitoring software, and it is also my belief that the best way to
ensure the growth and support of this software is to expose it to a
large number of people in your organization in a way that will cause
them to reach the same conclusion. A web client is the most universal
way to achieve this goal at the present time, as a web client can be
run on any network that mon would be.

4) Simplicity and lightness. In other words:

   + Compatibility with a large number of client browser sizes,
     versions, and resolutions
   + No frames!
   + Adhering to as many of the standard good usability conventions
     as possible
   + Keeping mon.cgi all one file, with a very short setup time
   + No special modules required past those needed to run mon, and
     optional additional modules kept to a minimum
   + 100% text browser compatibility
   + Performance and speed
   + Low resource utilization

Sometimes these design goals work against one another, but hopefully
we come out ahead when tradeoffs are made.


Alternatives to mon.cgi
-----------------------
If you don't like mon.cgi but you would still like a web GUI, you
have 2 alternatives. Your first alternative is Jim's monshow, which
ships with mon in the clients/ subdirectory of the mon distribution.
The second alternative is Gilles Lamiral's Minotaure, which can be
found at ftp://ftp.kernel.org/pub/software/admin/mon/contrib/.

Both of these are fully functional and may suit your needs better
than mon.cgi. You are encouraged to take a look at them both and
decide which is best for you.


SITE CUSTOMIZATION
------------------
mon.cgi has always been "customizable," in that the source was
available and you were encouraged to substitute your own parameters
(e.g., mon host, mon port, company logo, etc.). But this meant that
with each new version, you had to go back and re-edit the source
code. Not a big deal, but still something of a pain.
As of v1.49, mon.cgi includes some features which are meant to
facilitate these changes and make site-specific customizations easier
to perform, especially as mon and mon.cgi continue to evolve.


Creating Your Own Config File
-----------------------------
Prior to v1.49 of mon.cgi, you could customize the look of the page,
but all customizations had to be done in the source itself. This has
numerous disadvantages, so 1.49 introduces an *optional* config file
which will be read only as necessary and will allow you to specify
custom values for parameters without having to touch the source code
each time.

You can still edit the source each time if you want, but if you want
to set up a config file, follow these steps:

1) Copy the config file (included with the mon.cgi distribution)
config/mon.cgi.cf to a location of your choice. It's best to start
with a sample config file, because the config file format is very
simple, and it will give you a chance to see how it works and
experiment with parameters.

2) Edit the mon.cgi source code to find the line that specifies the
variable "$moncgi_config_file". Change the value to the filesystem
path of your copy of the mon.cgi config file.

3) Now you can edit the config file and make changes at will. Every
time you change the mtime of the file (e.g., by saving it in a text
editor, or touch'ing the file), mon.cgi will re-read the config file
and the changes will take effect. If there are errors in parsing the
config file, they will go to STDERR, which in most setups will end up
in your web server's error log. Look in the error log if your config
isn't working as you expect it to.


Adding A New Row And Custom Commands To The Command Button Bar
--------------------------------------------------------------
Adding a new row to the command button bar, with corresponding custom
commands, is quite a bit more involved than the relatively simple
matter of changing a config file.
If you've developed, or are interested in developing, your own custom
commands, however, this functionality might be just what you need.

In the following example, we add a command called "ack_all" to the
button bar, and also add the routine to do the ack'ing. The actual
guts of the ack_all routine aren't included, but the goal of these
instructions is to give you enough to start off.

The first step is to create your own moncgi_custom_print_bar
function. A stub function exists in the mon.cgi code, and the code
below shows how you would put in your own function that has one
button, labeled "Acknowledge All Failures".

Sample moncgi_custom_print_bar subroutine:

    sub moncgi_custom_print_bar {
        #
        # This is a sample routine, which adds a third row to the
        # command table, with one command: "Acknowledge All Failures"
        #
        my ($face) = @_;

        $webpage->print("<tr>\n");
        $webpage->print("\t<td colspan=7 align=center><font FACE=\"$face\"><a href=$url?${monhost_and_port_args}command=ack_all>Acknowledge All Failures</a></font></td>\n");
        $webpage->print("</tr>\n");
    }

The next step is to tell mon.cgi that you are using your own custom
commands, by creating your own moncgi_custom_commands subroutine.
Again, there is a sample function in the mon.cgi code which you can
replace with your own.

Sample moncgi_custom_commands subroutine:

    sub moncgi_custom_commands {
        if ($command eq "ack_all") {
            #
            # Set up the page
            #
            &setup_page("Acknowledge All Alarms");
            #
            # Note: you would have to write the "ack all"
            # command yourself!
            #
            &moncgi_ack_all;
        } else {
            #
            # We didn't find anything, return
            #
            return 0;
        }
        return 1;   # we did find something, suppress further command processing
    }

The last step is to create the actual subroutines which will do the
custom work you want them to do (assuming you weren't just calling
existing commands in a different way). In our example, this means we
have to write a function that actually goes out and acks all existing
failures.
We won't do this here, but hopefully this gives you an idea of how to
proceed.

    sub moncgi_ack_all {
        #
        # Here is where the actual code to do the "ack all" would go
        #
    }

When future releases of mon.cgi come out, you can copy and paste your
custom subroutines and be up and running with the new version in
minimal time. At least, that is what this was designed for.


Credits
-------
The current maintainer is Andrew Ryan <an...@na...>. Report all bugs
to him or the mon users mailing list.

+ Originally by: Arthur K. Chan <ar...@al...>
+ Based on the Mon program by Jim Trocki <tr...@tr...>.
  http://www.kernel.org/software/mon/
+ Rewritten to support Mon::Client, mod_perl, taint mode,
  authentication, the strict pragma, and other visual/functional
  enhancements by Andrew Ryan <an...@na...>.
+ Downtime logging contributed by Martha H Greenberg <ma...@mi...>
+ Site customization extensions by Ed Ravin <er...@pa...>
+ The contributions of members of the mon-users mailing list have
  been invaluable in many ways.