mon-commit Mailing List for mon (Page 13)
Brought to you by:
trockij
From: Jim T. <tr...@us...> - 2004-07-12 13:17:26
Update of /cvsroot/mon/mon
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv13878

Modified Files:
      Tag: mon-1-0-0pre1
	CHANGES
Log Message:
updated to show changes from mon-1-0-0pre1

Index: CHANGES
===================================================================
RCS file: /cvsroot/mon/mon/CHANGES,v
retrieving revision 1.2.2.1
retrieving revision 1.2.2.2
diff -C2 -d -r1.2.2.1 -r1.2.2.2
*** CHANGES	18 Jun 2004 14:40:10 -0000	1.2.2.1
--- CHANGES	12 Jul 2004 13:17:07 -0000	1.2.2.2
***************
*** 1,4 ****
--- 1,83 ----
  $Id$

+ Changes between mon-1.0.0pre1 and mon-1.0.0pre3
+ Mon Jul 12 09:12:29 EDT 2004
+ -----------------------------------------------
+
+ -changed README to refer to the new, more sensible name for the perl module
+  client, which is mon-client
+
+ -applied eric's updates to INSTALL and added a mention of monshow and mon.cgi as
+  the web interfaces
+
+ -added eric's rpm spec file (i removed the patches because they are no longer
+  needed)
+
+ -added lmb's syslog.monitor (a nifty hack)
+
+ -added 'alertevery strict' code and docs, updated the README and INSTALL to
+  mention CVS, updated CREDITS
+
+ -incorporated mon.cgi 1.52
+
+ -minor addition to alert behavior explanation in mon.8
+
+ -in dialin.monitor.wrap.c, return the exit status of execv (if it fails, that is)
+
+ -fixed path to perl in file_change.monitor and smtp3.monitor
+
+ -added some rcs tags to identify the file versions
+
+ -handle_trap_timeout now calls process_event, and it works fine with
+  alert/upalert/alertevery/etc. as shown by my testing
+
+ -received traps now reset the trap timeout counter, and fixed some
+  other stuff wrt trap timeouts
+
+ -added sub process_event and made proc_cleanup and handle_trap use it
+  so that the alert mgmt code is shared rather than in two places. i tested
+  as much of it as i could and all seems to work well now, especially
+  upalert, alertafter, alertevery with traps.
+
+ -added per-service "_monitor_duration" variable which records how many
+  seconds the previous monitor took to execute. this is available via
+  "list opstatus". if no monitor has executed yet then the value is -1.
+
+ -added per-service "_monitor_running" variable whose value is 0 or 1
+  depending on whether the monitor is currently running for that service.
+
+ -removed gunk from handle_trap regarding the various TRAP_COLDSTART, etc.
+  processing, since most of it was a bad idea anyway, or at least as far as
+  i could tell. traps and their exit values are now processed exactly as
+  monitors are, which simplifies things greatly and adds to more intuitive
+  functionality. this means the "spc" value in a trap is now ignored.
+
+ -fixed some args processing in call_alert
+
+ -fixed a bug which would prevent alerts or upalerts
+  from being sent when call alerts is passed the "output"
+  argument whose value is undef
+
+ -remove usage of parse_line in trap processing
+  (backported from mon 1.1 code)
+
+ -make esc_str escape spaces in order to be compatible with monperl-1-0-0pre1
+
+ -added list of all possible client commands to moncmd
+
+ -added --community to set the snmp community in reboot.monitor
+
+ -patch to traceroute.monitor from meekj
+    added StateDir, TracerouteOptions, StopAt config options
+    some bugfixes to config file parsing
+    reap children to avoid defunct processes
+    added timeout alarm
+
+ -up_rtt.monitor
+    added -r to log individual rtts, better error reporting for tcp and udp check
+
+
+
+
  Changes between mon-0.99.3-47 and mon-1.0.0pre1
  -----------------------------------------------
From: Jim T. <tr...@us...> - 2004-07-12 12:46:33
Update of /cvsroot/mon/mon In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8355 Modified Files: Tag: mon-1-0-0pre1 CREDITS INSTALL Added Files: Tag: mon-1-0-0pre1 mon.spec Log Message: applied eric's updates to INSTALL and added a mention of monshow and mon.cgi as the web interfaces added eric's rpm spec file (i removed the patches because they are no longer needed) added lmb's syslog.monitor (a nifty hack) Index: CREDITS =================================================================== RCS file: /cvsroot/mon/mon/CREDITS,v retrieving revision 1.1.1.1.2.1 retrieving revision 1.1.1.1.2.2 diff -C2 -d -r1.1.1.1.2.1 -r1.1.1.1.2.2 *** CREDITS 17 Jun 2004 20:28:49 -0000 1.1.1.1.2.1 --- CREDITS 12 Jul 2004 12:46:23 -0000 1.1.1.1.2.2 *************** *** 103,104 **** --- 103,109 ---- an...@my... mon.cgi v1.32 and later, bug reports and fixes. + + Eric Sorenson + er...@tr... + Documentation and RPM spec updates + --- NEW FILE: mon.spec --- # # spec file for package mon (Version 1.0.0pre3) # # Copyright (c) 2004 SUSE LINUX AG, Nuernberg, Germany. # This file and all modifications and additions to the pristine # package are under the same license as the package itself. 
# # Please submit bugfixes or comments via http://www.suse.de/feedback/ # BuildRequires: bash bzip2 cpio cpp diffutils file filesystem findutils grep groff gzip info m4 make man patch sed tar texinfo autoconf automake binutils gcc libtool perl rpm Name: mon Version: 1.0.0pre3 Release: 1 Summary: The mon network monitoring system License: GPL Group: System/Monitoring URL: http://www.kernel.org/software/mon/ Source: http://www.kernel.org/pub/software/admin/mon/%{name}-%{version}.tar.bz2 Source1: http://www.kernel.org/pub/software/admin/mon/mon-client-%{version}.tar.bz2 Requires: perl Requires: perl(Time::Period) Requires: perl-Convert-BER Requires: fping Requires: perl-libwww-perl BuildRoot: %{_tmppath}/%{name}-%{version}-build %define filelist %{name}-%{version}-filelist %description "mon" is a tool for monitoring the availability of services. Services may be network-related, environmental conditions, or nearly anything that can be tested with software. It is extremely useful for system administrators, but not limited to use by them. It was designed to be a general-purpose problem alerting system, separating the tasks of testing services for availability and sending alerts when things fail. To achieve this, "mon" is implemented as a scheduler which runs the programs which do the testing, and triggering alert programs when these scripts detect failure. None of the actual service testing or reporting is actually handled by "mon". These functions are handled by auxillary programs. 
Authors: -------- Jim Trocki <tr...@tr...> %prep ################################################################### %setup -q %setup -T -D -a 1 ################################################################### %build cd mon.d make cd ../mon-client-%{version} %{__perl} Makefile.PL `%{__perl} -MExtUtils::MakeMaker -e ' print qq|PREFIX=%{buildroot}%{_prefix}| if \$ExtUtils::MakeMaker::VERSION =~ /5\.9[1-6]|6\.0[0-5]/ '` %{__make} ################################################################### %install rm -rf %{buildroot} mkdir -p %{buildroot}/%{_libdir}/mon/alert.d mkdir -p %{buildroot}/%{_sbindir} mkdir -p %{buildroot}/%{_mandir}/man1 mkdir -p %{buildroot}/%{_libdir}/mon/mon.d mkdir -p %{buildroot}/%{_localstatedir}/lib/mon mkdir -p %{buildroot}/%{_libdir}/mon/utils mkdir -p %{buildroot}/%{_sysconfdir}/mon mkdir -p %{buildroot}/%{_sysconfdir}/init.d mkdir -p %{buildroot}/%{_sysconfdir}/logrotate.d mkdir -p ./examples cp mon %{buildroot}/%{_sbindir}/ cp -a ./alert.d/ %{buildroot}/%{_libdir}/mon/ cp ./clients/moncmd %{buildroot}/%{_sbindir}/moncmd cp ./clients/monshow %{buildroot}/%{_sbindir}/monshow cp -a ./doc/*.1 %{buildroot}/%{_mandir}/man1/ mv ./etc/very-simple.cf %{buildroot}/%{_sysconfdir}/mon/mon.cf mv ./etc/auth.cf %{buildroot}/%{_sysconfdir}/mon mv ./etc/S99mon %{buildroot}/%{_sysconfdir}/init.d/mon cp -a ./etc/* ./examples cp -a ./mon.d/{*.monitor,*.wrap} %{buildroot}/%{_libdir}/mon/mon.d/ cp -a ./utils/ %{buildroot}/%{_libdir}/mon/ mkdir -p %{buildroot}/sbin ln -sf ../etc/init.d/mon %{buildroot}/sbin/rcmon cd mon-client-%{version} && %{makeinstall} `%{__perl} -MExtUtils::MakeMaker -e ' print \$ExtUtils::MakeMaker::VERSION <= 6.05 ? qq|PREFIX=%{buildroot}%{_prefix}| : qq|DESTDIR=%{buildroot}| '` cd .. 
# clean up after perl module install - remove special files find %{buildroot} -name "perllocal.pod" -o -name ".packlist" -o -name "*.bs" |xargs -i rm -f {} # build filelist echo "%defattr(0664,root,root)" > %filelist find %{buildroot} -type f -printf "/%%P\n" | grep -v "man/man" >> %filelist [ -z %filelist ] && { echo "ERROR: EMPTY FILE LIST" exit -1 } ################################################################### %files -f %filelist %doc %{_mandir}/man1/moncmd.1* %doc %{_mandir}/man1/monshow.1* %doc %{_mandir}/man3/Mon::* %doc CHANGES COPYING COPYRIGHT CREDITS INSTALL KNOWN-PROBLEMS README %doc TODO VERSION mon.lsm %doc ./doc/README.* %doc ./doc/globals %doc ./examples ################################################################### %clean if [ -z "${RPM_BUILD_ROOT}" -a "${RPM_BUILD_ROOT}" != "/" ] then rm -rf $RPM_BUILD_ROOT fi rm -rf $RPM_BUILD_ROOT ################################################################### %preun if [ -r %{_localstatedir}/run/mon.pid ]; then /etc/init.d/mon stop fi ################################################################### %post if [ -d %{_localstatedir}/log -a ! -f %{_localstatedir}/log/mon_history.log ]; then touch %{_localstatedir}/log/mon_history.log fi ################################################################### %postun if [ "$1" = "0" -a -f %{_localstatedir}/log/mon_history.log ]; then rm -f %{_localstatedir}/log/mon_history.log fi %changelog -n mon * Thu Jul 07 2004 - er...@tr... - update to 1.0.0pre2, remove suse-ness * Mon Mar 01 2004 - hm...@su... - building as nonroot-user * Fri Feb 27 2004 - ku...@su... - Cleanup neededforbuild - fix compiler warnings * Mon Feb 10 2003 - lm...@su... - Fixed path to comply with FHS. * Fri Oct 18 2002 - lm...@su... - Fix for Bugzilla #21086: init script had a broken path and syntax error. * Tue Aug 20 2002 - lm...@su... - Fix for Bugzilla # 17936; PreRequires corrected. * Mon Aug 12 2002 - lm...@su... 
- Perl dependencies updated for Perl 5.8.0 * Fri Jul 26 2002 - lm...@su... - Perl dependencies adjusted to comply with SuSE naming scheme * Fri Jul 26 2002 - lm...@su... - Adapted from Conectiva to UnitedLinux - init script cleanup * Wed Jul 24 2002 - Fábio Olivé Leite <ol...@co...> - Version: mon-0.99.2-1ul - Adapted for United Linux * Sat Jul 20 2002 - Claudio Matsuoka <cl...@co...> - Version: mon-0.99.2-3cl - updated dependencies on perl modules to lowercase names * Thu May 16 2002 - Fábio Olivé Leite <ol...@co...> - Version: mon-0.99.2-2cl - Added %%attr to %%{_libdir}/mon/*, so that the helper scripts are executable Closes: #5522 (aparente problema com as permissões) - Changed initscript to use gprintf Closes: #4172 (Internacionalização (?)) * Fri Dec 28 2001 - Ricardo Erbano <er...@co...> - Version: mon-0.99.2-1cl - New upstream relase 0.99.2 * Sat Nov 17 2001 - Claudio Matsuoka <cl...@co...> - Version: mon-0.38.20-6cl - fixed doc permissions * Thu Jun 21 2001 - Eliphas Levy Theodoro <el...@co...> - Version: mon-0.38.20-5cl - fixed initscript - /usr/lib/mon -> /usr/sbin (Closes: #3792) - added requires for perl-Convert-BER - added post{,un} scripts to handle logfile mon_history.log * Fri Mar 23 2001 - Luis Claudio R. Gonçalves <lcl...@co...> - Version: mon-0.38.20-4cl - fixed the initscript (it was missing a "-f" switch) * Tue Oct 31 2000 - Arnaldo Carvalho de Melo <ac...@co...> - %%{_sysconfdir}/mon is part of this package - small cleanups * Thu Sep 28 2000 - Fábio Olivé Leite <ol...@co...> - Wrong version in the mon-perl dependency... * Thu Sep 21 2000 - Fábio Olivé Leite <ol...@co...> - Updated to 0.38.20. 
* Fri Jun 16 2000 - Fábio Olivé Leite <ol...@co...> - Fixed TIM alert, added history file, added logrotate script * Mon Jun 12 2000 - Fábio Olivé Leite <ol...@co...> - Added an alert via TIM Celular cellphones * Thu Jun 08 2000 - Fábio Olivé Leite <ol...@co...> - Made the %%preun nicer * Thu Jun 01 2000 - Fábio Olivé Leite <ol...@co...> - New spec format * Mon Apr 17 2000 - Fábio Olivé Leite <ol...@co...> - Added a new monitor (initscript.monitor) * Fri Apr 14 2000 - Fábio Olivé Leite <ol...@co...> - Added proxy support to http.monitor * Thu Apr 13 2000 - Fábio Olivé Leite <ol...@co...> - Fixed a small bug in the init script - Added scripts to alert via Mobi pagers and Global Telecom cellphones * Mon Apr 10 2000 - Fábio Olivé Leite <ol...@co...> - Initial RPM packaging Index: INSTALL =================================================================== RCS file: /cvsroot/mon/mon/INSTALL,v retrieving revision 1.1.1.1.2.4 retrieving revision 1.1.1.1.2.5 diff -C2 -d -r1.1.1.1.2.4 -r1.1.1.1.2.5 *** INSTALL 12 Jul 2004 01:33:31 -0000 1.1.1.1.2.4 --- INSTALL 12 Jul 2004 12:46:23 -0000 1.1.1.1.2.5 *************** *** 179,181 **** ------------- ! (mention something about mon.cgi and monshow) --- 179,193 ---- ------------- ! This distribution contains two web interfaces: monshow and mon.cgi. monshow is ! a simple report-only tool which supports configurable "views" of the mon ! configuration. monshow also operates as a textmode report generator. ! ! mon.cgi, however, supports the full functionality of mon, including the ability ! to disable/enable groups and hosts and services, acknowledge failed services, ! show alert and downtime history, authenticate users, among many other things. ! ! To install monshow, simply copy clients/monshow into your web server's cgi-bin ! path and name it "monshow.cgi". You may want to read the man page in the doc/ ! directory so that you can understand how to configure a "view" to your liking. ! ! 
To install mon.cgi, follow the instructions found in doc/README.mon.cgi. |
From: Jim T. <tr...@us...> - 2004-07-12 12:46:33
Update of /cvsroot/mon/mon/doc
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8355/doc

Added Files:
      Tag: mon-1-0-0pre1
	README.syslog.monitor
Log Message:
applied eric's updates to INSTALL and added a mention of monshow and mon.cgi as
the web interfaces

added eric's rpm spec file (i removed the patches because they are no longer
needed)

added lmb's syslog.monitor (a nifty hack)

--- NEW FILE: README.syslog.monitor ---
Readme file for syslog.monitor
$Id: README.syslog.monitor,v 1.1.2.1 2004/07/12 12:46:24 trockij Exp $

(Note: This Readme file is an insult to the reader. Better documentation will
come as soon as I find more time and fix some more bugs)

INTRODUCTION

This is a syslog monitor for mon (http://www.kernel.org/software/mon/) by Jim
Trocki.

It is different from the other monitors, because it is constantly running and
communicates with the mon server via Mon::Client over the network, instead of
running under mon's supervision.

It listens for syslog packets coming in from the network, parses them, checks
them against a rule set and reports to the mon server if necessary.

REQUIREMENTS

You need to have the following non-std Perl modules installed:

  Time::HiRes
  Mon::Client

DETAILS

syslog.monitor accepts a single command line parameter, the name of the
configuration file. All options are explained inside the configuration file,
see syslog.conf as an example.

At startup, the daemon retrieves a list of all watches from the mon server for
which a service "syslog" is defined. We also read the hostgroup definition for
this watch from the mon server.

(The hostnames are resolved and the result is used to check if the incoming
syslog packet is accepted and which host it came from, so you should make sure
your hostnames resolve to all IPs from which your systems might send a syslog
packet - on a Cisco, you might want to consider "logging source-interface")

This basically amounts to:

For every hostgroup you want syslog.monitor to accept and monitor syslog
packets, define a syslog service. This watch/service is where we later send our
traps.

For those hosts, add a line like

  *.* @syslog.monitor.host.name

to /etc/syslog.conf.

Configure syslog.monitor by editing syslog.conf and following the comments
therein.

Start syslog.monitor.

Restart mon.

killall -HUP syslogd on the hosts you want to monitor.

Read the logfiles and fix the problems. ;-)

AUTHOR

Please don't bother Jim with questions relating to this. If this should lead to
global warming, code freeze or Elvis's revival, I accept absolutely no
responsibility. However, I will gladly receive and incorporate bugfixes and
sensible bug reports.

Lars Marowsky-Brée <la...@ma...>

URL

It appears we have made our way to
ftp://ftp.kernel.org/pub/software/mon/contrib/ - please use a mirror, as
described on http://www.kernel.org/.
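
The host-matching step described under DETAILS (resolve every hostgroup member, then accept or drop a packet by its source IP) can be sketched roughly as follows. syslog.monitor itself is a Perl program whose source is not shown in this message, so this Python is only an illustration of the idea and not the monitor's code. The names resolve_all, build_ip_cache and classify_packet, the hostgroup dictionary, and the injectable resolve callback are all assumptions made for the example.

```python
import socket


def resolve_all(hostname):
    """Return every IPv4 address the name resolves to (standard library lookup)."""
    return socket.gethostbyname_ex(hostname)[2]


def build_ip_cache(hostgroups, resolve=resolve_all):
    """Map each resolved IP to (hostname, hostgroup).

    hostgroups is a hypothetical dict such as {"unix": ["donar", "thor"]},
    standing in for the hostgroup definitions fetched from the mon server.
    """
    cache = {}
    for group, hosts in hostgroups.items():
        for host in hosts:
            for ip in resolve(host):
                cache[ip] = (host, group)
    return cache


def classify_packet(cache, src_ip):
    """Return (hostname, hostgroup) for a known sender, or None to drop the packet."""
    return cache.get(src_ip)
```

Under this model a multi-homed host is recognised only on the addresses its name resolves to, which is presumably why the README suggests pinning the source interface on a Cisco.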
From: Jim T. <tr...@us...> - 2004-07-12 12:46:33
Update of /cvsroot/mon/mon/etc
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8355/etc

Added Files:
      Tag: mon-1-0-0pre1
	syslog-monitor.conf
Log Message:
applied eric's updates to INSTALL and added a mention of monshow and mon.cgi as
the web interfaces

added eric's rpm spec file (i removed the patches because they are no longer
needed)

added lmb's syslog.monitor (a nifty hack)

--- NEW FILE: syslog-monitor.conf ---
# Configuration file for syslog.monitor
# $Id: syslog-monitor.conf,v 1.1.2.1 2004/07/12 12:46:24 trockij Exp $
#############################################################################

# Which timeout to set for select()ing on the input socket.
# You really do not wish to play with this.
# select_timeout 10

# Log level (just like syslog you know;)
loglevel 6

# If undefined, will write to stdout
# You better specify an absolute path here.
# logfile /var/log/syslog.monitor

# Where copies of incoming syslog messages get written to.
# In the filename, you can define the following substitutions:
# %H = gets replaced with the hostname
# %L = gets replaced with the syslog level as a string
# %l = same, but as a number
# %F = syslog facility (local0, kern, ...)
# %G = hostgroup the host belongs to
# %D = date at which the message was received, in ISO 8601 (1999-04-03)
syslogfile /var/log/syslog.%H.%F.%D

# If set, will make syslog.monitor fork and go into the background as soon
# as possible.
# Be aware that the program will refuse to daemonize if you do not set a logfile.
# daemon_mode

mon_host cherusker.bi.teuto.net

# Set these if necessary
# mon_user
# mon_pass

# IP number on which to listen for incoming UDP packets
bind_ip 0.0.0.0

# port number (you almost certainly do not want to touch this)
# bind_port 514

# Define a check called "emerg"
check emerg
# A slightly more elaborate description, which is sent to the mon server
# as part of the trap
desc Emergencies
# The period which is monitored
period 60m
# How often this check _must_ trigger within said period.
# Set to -1 to disable.
min -1
# How often this check might occur at max within the period.
max 3
# If this is set, no further matches will be checked if this check matched.
# Use this carefully.
# final
# The check itself. Evaluated within Perl (), you can do powerful stuff
# here. The current message is referenced by $$r.
# Parameters you might want to match on:
# $$r{'src_port'} - The source port from which the packet was sent.
# $$r{'src_ip'} - The source IP.
# $$r{'host'} - The hostname, resolved using the cache built
# at startup.
# $$r{'level'} - numeric syslog level of the message. (0-7)
# $$r{'Level'} - syslog level as a string (ie 'crit')
# $$r{'facility'} - Facility (ie 'local0' etc)
# $$r{'msg'} - The text part of the message
# $$r{'time'} - The unixtime at which the message was received.
# $$r{'group'} - The group the host sending this message
# belongs to
pattern ($$r{'level'} <=3)

# A "catch-all" - we really should receive at least one line within 15m,
# But more than 1000 might be strange...
check all
desc All
period 15m
min 200
max 10000
final
pattern (1)

# Relating to hostgroup unix:
group unix
# For each host in the hostgroup unix, run a separate instance of each
# check listed here (references the check defined above)
per-host emerg
# For the _entire_ hostgroup, run these checks:
per-group all
# Only on this host, run these:
# on-host donar.bi.teuto.net emerg-kern
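
Two of the settings above are easier to follow with a small worked example: the %-substitutions in "syslogfile" and the "min"/"max" thresholds of a check. The Python below is a hedged sketch under stated assumptions, not code from syslog.monitor (which is Perl and evaluates "pattern" as a Perl expression). The function names expand_syslogfile and window_status are hypothetical. The sketch also assumes that a count equal to "max" is still acceptable and that "min -1" disables the lower bound, as the comments suggest.

```python
from datetime import date


def expand_syslogfile(template, host, level_name, level_num, facility, group, day):
    """Expand the substitutions documented for the syslogfile option."""
    subs = {
        "%H": host,              # hostname
        "%L": level_name,        # syslog level as a string
        "%l": str(level_num),    # syslog level as a number
        "%F": facility,          # syslog facility
        "%G": group,             # hostgroup the host belongs to
        "%D": day.isoformat(),   # ISO 8601 date, e.g. 1999-04-03
    }
    out = []
    i = 0
    # Single left-to-right pass, so substituted text is never expanded again.
    while i < len(template):
        token = template[i:i + 2]
        if token in subs:
            out.append(subs[token])
            i += 2
        else:
            out.append(template[i])
            i += 1
    return "".join(out)


def window_status(matches, minimum, maximum):
    """Classify the number of matches counted during one period."""
    if minimum != -1 and matches < minimum:
        return "too few"
    if matches > maximum:
        return "too many"
    return "ok"
```

With the sample configuration, a kern message from host donar received on 1999-04-03 would be copied to /var/log/syslog.donar.kern.1999-04-03, and a fourth "emerg" match within 60 minutes would exceed "max 3".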
From: Jim T. <tr...@us...> - 2004-07-12 12:46:33
Update of /cvsroot/mon/mon/utils
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8355/utils

Added Files:
      Tag: mon-1-0-0pre1
	syslog.monitor
Log Message:
applied eric's updates to INSTALL and added a mention of monshow and mon.cgi as
the web interfaces

added eric's rpm spec file (i removed the patches because they are no longer
needed)

added lmb's syslog.monitor (a nifty hack)

--- NEW FILE: syslog.monitor ---
(This appears to be a binary file; contents omitted.)
From: Jim T. <tr...@us...> - 2004-07-12 01:33:44
Update of /cvsroot/mon/mon/doc In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8582/doc Added Files: Tag: mon-1-0-0pre1 CHANGES.mon.cgi README.mon.cgi Log Message: incorporated mon.cgi 1.52 --- NEW FILE: CHANGES.mon.cgi --- mon.cgi v1.52 21-May-2001 ------------------------- + added check for sufficient Mon::Client version + added optional "watch" keyword to config file that allows users to see only the groups they are configured to be allowed to see, by regex. + added optional keyword "show_watch_strict" that, when set to "yes", will enforce watch keywords strictly, and not allow the mon.cgi user to see any detail about any other hostgroup. + query_groups added summary/ack information to failed services + query_groups: now prints red or yellow as appropriate, instead of just red, for failed services. + added "log in" link to mon.cgi base page + moncgi_get_params: Fixed bug with bug with null values of $monhost and $monport getting through. + fixed moncgi_reset bug - keepstate & no-keepstate are reversed + moncgi_authform: passwd dialog s cleared after unsuccessful password entry. + new function: moncgi_login - allow user to log in prior to having to execute a privileged action. + new config parameter: logo_link. logo_link is a URI that will be linked to the logo picture, if logo is defined. + New function: can_show_group(groupname), to test if a group can be shown according to the "watch" directives. + The following functions were updated to reflect the new watch keyword access control routines : list_alerthist, list_dtlog, query_group, list_disabled, svc_details, mon_test_service, moncgi_test_all, mon_enable, mon_disable, mon_ack + fixed numerous warnings, did some code cleanup and improved comments. + Fixed another mod_perl bug in monhost/monport parsing + Updated moncgi-appsecret.pl, in the util directory, to reflect new code. 
mon.cgi v1.51 22-Mar-2001 ------------------------- + Fixed taint-checking problem with monhost and monport args (Mon::Client was complaining under TaintMode/-T). mon.cgi v1.50 15-Mar-2001 ------------------------- + Config file parsing support was not working properly. This has been fixed, and a new subroutine was introduced: initialize_config_globals. mon.cgi v1.49 14-Mar-2001 ------------------------- + Add test_config option on main menu bar (new 0.38.21 command) + change reset to single button, with follow-up page, giving two choices -- reset keepstate and reset. + new function - moncgi_reset to allow users to choose which type of reset they would like to execute. + Patch from Ed Ravin (er...@pa...) to accomodate a site-specific custom toolbar row and site-specific menu commands. + added a optional config file that lets users specify their own mon.cgi parameters. + added TVA color scheme to the distro (from tb...@tv...) + Use HTML::Entities to escape HTML submitted as ack messages, avoiding cross-site scripting attacks/javascript and ensure proper encoding of characters entered as ack messages. HTML scrubbing can be skipped by setting the variable untaint_ack_msgs to "no". + remove all <pre>'s and replace with <font face="$fixed_font_face">. Important messages were often getting cut off the screen by the use of <pre>. + make $monhost and $monport optional CGI params as 'h' and 'p' respectively + added "test service" and "test-all" to query_group page mon.cgi v1.48 01-Dec-2000 ------------------------- + Have ability to do mass disabling/enabling of hosts and services in hostgroup. + query_group: have radio button for enabled/disabled status (facilitates mass en/disabling) + query_group: added a table on to show services for that group, enabled/disabled with radio button. 
+ query_group: now includes service status on this page + query_group: mass dis/enabling of svcs requires a new function, mon_state_change + svc_details: widened the table + main: Command matching changed to use exact matches instead of regex matches (duh). + main: fix bug with Revision tag in $VERSION + list_disabled: Also added mass disabling + mon_state_change_enable_only: new function to support list_disabled mass re-enabling. + list_pids: cleaned up function and formatting + added mon_state_change function for mass state changing + added mon_list_opstatus function + query_opstatus: moved legend to below main table + query_opstatus: changed legend to use bgcolor instead of font color + query_opstatus: ack message is now included in summary + query_opstatus: increased main table width to 100% + query_opstatus: can now test svcs from this page + ability to do multiple tests at the same time for a single hostgroup + moncgi_test_all: new function to test all svcs in group + Ran mon.cgi through 'tidy' (http://www.w3.org/People/Raggett/tidy/) for improved HTML compliance. Most common pages are OK now (I think) except for table summary attributes. I'll get to them eventually. + added last_ok time for failed services in "Last Check" column + color of UNCHECKED services is now midnight blue by default, unchecked services are now readable in the default color scheme! mon.cgi v1.46 20-Aug 2000 ------------------------- + Fixed bug in list_dtlog that would show min and max failure time as "-1" seconds if no failures had been seen on that service. Also the table is now not printed at all instead of being a 0-row table. + Made it easier for users to get themselves out of the situation where they enter in a valid username and an invalid password. + Made the summary info MUCH easier to see when a service is in the failure state. + alert_details is now "svc_details", a much more descriptive name, since it shows success as well as failure details. 
+ svc_details [nee alert_details] got a little bit of a cleanup (not much). + list_dtlog now has a configurable maximum number of entries per page that it will display, defaults at 100. Large downtime logs would not render well in most browsers, and would not render at all with Netscape's table drawing algorithm. + Added optional $monport argument, in case you don't run mon on port 2583. + Trap watches are now correctly handled and printed (thanks to Ed Ravin <er...@pa...> for the bug report and fix). + Fixed bug in pp_sec that would cause "1 days" to be printed out instead of "1 day". mon.cgi v1.45 05-Jun 2000 ------------------------- + query_opstatus: Built an "amber level" alert for services that have failed but never issued an alert + query_opstatus: Changed "Last Checked" and "Est. Next Check" times to be deltas instead of absolute times, both relative to servertime and not localtime. + Added ACK (and re-ack) feature + query_opstatus: Added additional visual warnings if scheduler is not running or cannot be contacted. + Changed default app secret + Button bar at top of each page is cleaner + Fixed bug with scheduler falsely claiming to be stopped if you try to stop the scheduler and aren't authenticated, or if the server is not running. + Fixed bug where multiple auth failures are displayed if a user is not authenticated (should only notify once) + Made it easier to not hit "reset server" button accidentally + Made font on ONDS check times size -1 + Show the downtime log as an option on query_group + Fixed "test immediately" stuff so it tests and then shows right status + list_opstatus: hostgroup column no longer goes white if svc is unchecked + alert_details is MUCH spiffier + alert_details now checks to see if a monitor for that service/group is currently running, and as such, the status reported is subject to change very soon. + Added more decriptive text to service status table in alert_details alert_details. 
+ Changed default return screen on enable_service to be alert_details if that's where the user last came from. + Added new 0.38-18 data types for alert_details + list_dtlog: Display median in addition to mean failure time to lessen effects of downtime outliers. + Added a Refresh button on alert_details page + Cleaned up the list_disabled function + Got rid of backwards() function, unused relic from old mon.cgi + Fixed the META REFRESH tags so that it works on all browsers (put it in the header where it belongs) and handles more cases (alert_details, test_service) + Started using servertime in places instead of time on local web server + Visual enhancements for this version submitted by Brian Doherty <bdo...@ma...> + Fixed a bug in the "failure-free operation %" calculation if you had an extremely large number of failures in a time period, % could show up as negative. mon.cgi v1.38 18-Feb 2000 ------------------------- + MAJOR speedup, only use one Mon connection per page view. Pages typically load 2-3x faster. + list_opstatus in Summary mode is now more brief. All "OK, Non-Disabled Services" (ONDS) for any given hostgroup are now aggregated in a single line. If you monitor a lot of services on each of your host groups, this will save you a lot of screen real estate. Services which are disabled and/or failing are still broken out individually. + added FAILED flag to Status box , moved DISABLED flag, so mon.cgi works with Lynx & w3m or any other text browser that supports tables (only Lynx and w3m tested, looks great with w3m by the way). + changed default path of cookie to "/" to avoid lynx complaining about "invalid cookie path". + changed alert_details to use a table, include "view downtime log" + on query_group page, turn box gray if host is disabled. + fixed a div0 bug if you have no entries in your dtlog and ask to view it + changed disabled host in query_group to sort alpha even when hosts are disabled. 
+ alert_details function now auto-detects failure/success, doesn't need to be told which one to look for ("test service immediately" would show inconsistent results from this behavior, since it is impossible to know the results of a test before you run it!) mon.cgi v.1.35 -------------- + Downtime log viewing/querying support. + Disabled services/hosts/watches now appear as gray-colored boxes on the main display screen. This makes it easier to see what is disabled. + Fixed loadstate and savestate bugs again. These commands now work. + I finally have sort of a release process, so hopefully my releases will not be littered with formatting code that is specific to my environment, and they will run fine out of the box when you get them. + Fixed a few routines to work with changing ways Mon::Client asks you to do things. + Also, if you are logged in as an authenticated user (not the "default user", if one is defined), your username will appear on each page, so you always know who you are authenticated as. + Added a logout button. + Added ability to do "reset keepstate" as well as "reset" from the web interface. + The command bar is now 2 lines instead of one. Even on my 21" monitor, 13 buttons was too much to have on 1 line (let alone my poor 800x600 laptop LCD!). + Mon::Client::test is broken in v0.7. To make it work in the way that mon.cgi expects it to, change line 1470 in Client.pm v0.7 from: > if ($what !~ /^alert|startupalert|upalert$/) { to < if ($what !~ /^monitor|alert|startupalert|upalert$/) { mon.cgi 1.32.1.2 01-Feb 2000 ---------------------------- + Fixed loadstate and savestate to not be NOOPs. + Established a "default" user for when authentication was required but you don't want to make users log in just to list status. + Along with the default user, there is also now a "switch user" feature that offers the user the chance to re-authenticate to a user of higher privilege if they are denied the running of a command due to a lack of authorization. 
+ Fixed HTML bugs with hardcoded colors in font and table tags
  scattered throughout code (patch courtesy of Martha H Greenberg
  <marthag@MIT.EDU>, thanks!). This makes it possible to run mon.cgi
  in colors other than the default scheme. mon.cgi users take note
  however, testing color schemes is not part of my QA process (such as
  it is) and so if you find something broken, let me know and I'll fix
  it.

--- NEW FILE: README.mon.cgi ---
Introduction to mon.cgi
--------------------------------------------------------
This interface, along with mon itself, is available from
ftp://ftp.kernel.org/pub/software/admin/mon/

Development versions of mon.cgi can be found at
http://www.nam-shub.com/files/
--------------------------------------------------------

mon.cgi is a web-based GUI for mon. Its purpose is twofold:

1) To provide an easy-to-read visual display of all the status items
   that mon keeps track of, and
2) To provide an easy-to-use web administration interface to allow
   users to perform all mon administration tasks from any web browser.

This package and the documentation assumes that you have at least a
basic familiarity with mon.

-----------------------------------------------------------------
mon.cgi v.1.52
21-May-2001
by Andrew Ryan <an...@na...>

This interface, along with mon itself, is available from
ftp://ftp.kernel.org/pub/software/admin/mon/

Development versions of mon.cgi can be found at
http://www.nam-shub.com/files/
-----------------------------------------------------------------

This is the latest stable version of mon.cgi, meant to be used only
with mon 0.38-21 and above, and a version of Mon::Client that is 0.11
or higher. The chief reason that you will need the new version is for
the "test config" functionality.

This release has 4 new features of note:

1) Access control. Using the 'watch' keyword in the config file, you
   can restrict access to a particular configuration on a
   per-hostgroup basis. 'watch' keywords can be regular expressions.
   Original idea and keyword name stolen from monshow :)
2) 'watch' keywords can either be implemented "softly" -- by default
   only certain hostgroups are shown, but all can be accessed -- or
   "strictly" -- only the hostgroups explicitly allowed by 'watch'
   keywords can be accessed in any way. Using strict access control,
   an organization using mon to watch systems belonging to multiple
   customers can segregate those different customers' monitoring
   completely.
3) There's now a login button. The people have spoken!
4) mon.cgi now checks for the proper version of Mon::Client before it
   starts. This was a major support problem.

Plus many other bug fixes and small improvements, as usual.

This release should be considered stable until proven otherwise :)

Please see the CHANGES file for more information about this release.

Thanks to all who report bugs, submit patches, and give feedback.

Andrew Ryan <an...@na...>

Installing mon.cgi
------------------
Instructions for installing mon.cgi are located in the header of the
mon.cgi file itself. Roughly speaking, the order of events is:

1) Install mon and get it working, set up monpasswd and auth.cf files
   and get them verifiably working if you're using mon.cgi
   authentication (hint: you should be!).
2) Install a web server, preferably Apache, and preferably with
   mod_perl built in. Start the web server and verify that it works.
3) Put mon.cgi in your cgi-bin directory and make sure it is
   executable by the apache user (make it 0755 or 0555).
4) Edit your mon.cgi file to change default values to match your
   environment (e.g. contact email, your company logo, your company
   name, etc.).
5) If you're requiring users to log in (highly recommended), you must
   change the default app secret variable $app_secret in your copy of
   mon.cgi, and install the Crypt::TripleDES module from CPAN on the
   machine which will be running mon.cgi.
6) If you want to easily customize the look and feel of mon.cgi, as
   well as various other configuration options, copy the sample
   mon.cgi.cf file (in the /config directory of this distribution)
   into a location where your webserver can read it, and edit the line
   beginning '$moncgi_config_file = ""' to reflect the path to your
   config file. You can then change the look and feel of mon.cgi, as
   well as implement access controls, directly from this file.

mon.cgi Design Goals
--------------------
1) Provide 100% of the functionality of mon in a graphical user
   interface. Ideally, there will be some things that the GUI is
   better for, and inevitably, some things that the command line will
   always win out for.
2) Maintain 100% compatibility with mon and Mon::Client. If a patch to
   mon or Mon::Client is required to get a piece of mon.cgi
   functionality working, we write it, submit it, and get it folded in
   to the main distribution before making it official in mon.cgi.
3) Expose mon to the largest number of people possible in the most
   useful way. It is the author's belief that mon is a very useful
   piece of monitoring software, and it is also my belief that the
   best way to ensure the growth and support of this software is to
   expose it to a large number of people in your organization in a way
   that will cause them to reach the same conclusion. A web client is
   the most universal way to achieve this goal at the present time, as
   a web client can be run on any network that mon would be.
4) Simplicity and lightness. In other words: Compatibility on a large
   number of client browser sizes, versions, and resolutions; No
   frames! ; Adhering to as many of the standard good usability
   conventions as possible ; Keeping mon.cgi all one file, with a very
   short setup time ; No special modules required past those needed to
   run mon, and optional additional modules kept to a minimum ; 100%
   text browser compatibility ; Performance and speed ; Low resource
   utilization.
Sometimes these design goals work against one another, but hopefully
we come out ahead when tradeoffs are made.

Alternatives to mon.cgi
-----------------------
If you don't like mon.cgi but you would still like a web GUI, you have
2 alternatives. Your first alternative is Jim's monshow, which ships
with mon in the clients/ subdirectory of the mon distribution. The
second alternative is Gilles Lamiral's Minotaure, which can be found
at ftp://ftp.kernel.org/pub/software/admin/mon/contrib/. Both of these
are fully functional and may suit your needs better than mon.cgi. You
are encouraged to take a look at them both and decide which is best
for you.

SITE CUSTOMIZATION
------------------
mon.cgi has always been "customizable," in that the source was
available and you were encouraged to substitute your own parameters
(e.g., mon host, mon port, company logo, etc.). But this meant that
with each new version, you had to go back and re-edit the source code.
Not a big deal, but still something of a pain.

As of v1.49, mon.cgi includes some features which are meant to
facilitate these changes and make site-specific customizations easier
to perform, especially as mon and mon.cgi continue to evolve.

Creating Your Own Config File
-----------------------------
Previous to v.1.49 of mon.cgi, you could customize the look of the
page, but all customizations had to be done in the source itself. This
has numerous disadvantages, so 1.49 introduces an *optional* config
file which will be read only as necessary and will allow you to
specify custom values for parameters without having to touch the
source code each time.

You can still edit the source each time if you want, but if you want
to set up a config file, follow these steps:

1) Copy the config file (included with the mon.cgi distribution)
   config/mon.cgi.cf to a location of your choice.
   It's best to start with a sample config file, because the config
   file format is very simple, and it will give you a chance to see
   how it works and experiment with parameters.
2) Edit the mon.cgi source code to find the line that specifies the
   variable "$moncgi_config_file". Change the value to the filesystem
   path of your copy of your mon.cgi config file.
3) Now you can edit the config file and make changes at will. Every
   time you change the mtime of the file (e.g., by saving it in a text
   editor, or touch'ing the file), mon.cgi will re-read the config
   file and the changes will take effect.

If there are errors in parsing the config file, they will go to
STDERR, which in most setups will end up in your web server's error
log. Look in the errors file if your config isn't working like you
expect it to work.

Adding A New Row And Custom Commands To The Command Button Bar
--------------------------------------------------------------
Adding a new row to the command button bar, with corresponding custom
commands, is quite a bit more involved than the relatively simple
matter of changing a config file. If you've developed, or are
interested in developing your own custom commands, however, this
functionality might be just what you needed.

In the following example, we add a command called "ack_all" to the
button bar, and also add the routine to do the ack'ing. The actual
guts of the ack_all routine aren't included, but the goal of these
instructions is to give you enough to start off.

The first step is to create your own moncgi_custom_print_bar function.
A stub function exists in the mon.cgi code, and the below code shows
you how you would put in your own function that has one button,
labeled "Acknowledge All Failures".
Sample moncgi_custom_print_bar subroutine:

    sub moncgi_custom_print_bar {
        #
        # This is a sample routine, which adds a third row to the
        # command table, with one command: "Acknowledge All Failures"
        #
        my ($face)= (@_);
        $webpage->print("<tr>\n");
        $webpage->print("\t<td colspan=7 align=center><font FACE=\"$face\"><a href=$url?${monhost_and_port_args}command=ack_all>Acknowledge All Failures</a></font></td>\n");
        $webpage->print("</tr>\n");
    }

The next step is to tell mon.cgi that you are using your own custom
commands, by creating your own moncgi_custom_commands subroutine.
Again, there is a sample function in the mon.cgi code which you can
replace with your own.

Sample moncgi_custom_commands subroutine:

    sub moncgi_custom_commands {
        if ($command eq "ack_all") {
            #
            # Set up the page
            #
            &setup_page("Acknowledge All Alarms");
            #
            # Note: you would have to write the "ack all"
            # command yourself!
            &moncgi_ack_all;
        } else {
            #
            # We didn't find anything, return
            #
            return 0;
        }
        return 1; # we did find something, suppress further command processing
    }

The last step is to create the actual subroutines which will do the
custom work you want them to do (assuming you weren't just calling
existing commands in a different way). In our example, this means we
have to write a function that actually goes out and acks all existing
failures. We won't do this here, but hopefully this gives you an idea
of how to proceed.

    sub moncgi_ack_all {
        #
        # Here is where the actual code to do the "ack all" would go
        #
    }

When future releases of mon.cgi come out, you can copy and paste your
custom subroutines and be up and running with the new version in
minimal time. At least, that is what this was designed for.

Credits
-------
The current maintainer is Andrew Ryan <an...@na...>. Report all
bugs to him or the mon users mailing list.

+ Originally by: Arthur K. Chan <ar...@al...>
+ Based on the Mon program by Jim Trocki <tr...@tr...>.
  http://www.kernel.org/software/mon/
+ Rewritten to support Mon::Client, mod_perl, taint mode,
  authentication, the strict pragma, and other visual/functional
  enhancements by Andrew Ryan <an...@na...>.
+ Downtime logging contributed by Martha H Greenberg <ma...@mi...>
+ Site customization extensions by Ed Ravin <er...@pa...>
+ The contributions of members of the mon-users mailing list have been
  invaluable in many ways.
|
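The README says mon.cgi re-reads its config file whenever the file's mtime changes. A minimal sketch of that idea follows; it is not taken from mon.cgi. The names config_if_changed and %cache are hypothetical, and the loader is passed in as a code ref so the sketch does not assume how mon.cgi parses the file. A cache like this only pays off when the interpreter persists between requests (e.g. under mod_perl).

```perl
use strict;
use warnings;

# Hypothetical cache; mon.cgi's real variable names may differ.
my %cache = (mtime => undef, config => {});

# Re-read $path only when its mtime differs from the cached one.
# $loader is a code ref that parses the file and returns a hash ref.
sub config_if_changed {
    my ($path, $loader) = @_;
    my $mtime = (stat($path))[9];
    unless (defined $mtime) {
        # STDERR normally ends up in the web server's error log
        warn "cannot stat $path: $!\n";
        return $cache{config};
    }
    if (!defined $cache{mtime} || $mtime != $cache{mtime}) {
        $cache{config} = $loader->($path);
        $cache{mtime}  = $mtime;
    }
    return $cache{config};
}
```

Saving the file in an editor or touch'ing it changes the mtime, so the next call invokes the loader again; calls in between return the cached hash.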
From: Jim T. <tr...@us...> - 2004-07-12 01:33:44
|
Update of /cvsroot/mon/mon/etc
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8582/etc

Added Files:
      Tag: mon-1-0-0pre1
	mon.cgi.cf
Log Message:
incorporated mon.cgi 1.52

--- NEW FILE: mon.cgi.cf ---
#
# The mon.cgi config file.
# Format:
# key = value
#
# Blank lines and lines that begin with '#' are ignored.
#
# Both key names and values are case sensitive.
#
# This file comes with the mon.cgi distribution and contains all of the
# valid key/value pairs that mon.cgi will accept.
#
# The latest version of mon.cgi is always available at:
# http://www.nam-shub.com/files/
#
# If there are errors in your config file, mon.cgi will stop parsing it,
# and will print messages to STDERR, which should end up in your web
# server's error log.
#
# $Id: mon.cgi.cf,v 1.1.2.1 2004/07/12 01:33:32 trockij Exp $
#

# Your organization (what you want printed on the top of each page)
organization = Network Operations

# Contact email for mon administrator at your site
monadmin = bo...@yo...main

#Company or mon logo (URL path)
logo = /URL-path/to/your.gif

# URL to go to when you click on the logo image
logo_link = http://www.kernel.org/pub/software/admin/mon/html/

# Seconds between page reload
reload_time = 180

# Where to run mon (host,port)
monhost = localhost
monport = 2583

# Set this to anything other than 'Y' or 'yes' to turn off authentication
# (HINT: authentication is a *good* thing)
must_login = yes

# Application secret. Set this to something long and unguessable.
app_secret = LKAHETOI#KJHJKSHDOWOIUW^*((985i2hkljlkjfdhglkdhfgdlkfjghldksfjhg98 34tklh qrthq3 i3lu4 KLHKLJHKLJH ncxmvn owow y YnneO87210502673kn6l3

# Default username and password (only used if must_login is set)
default_username = readonly
default_password = public

# Idle time, in seconds, until login cookie is invalidated. Note that if
# ( login_expire_time < reload_time ) you will not be able to "idle".
login_expire_time = 900

# Whether or not to untaint HTML in ack msgs using HTML::Entities (recommended)
untaint_ack_msgs = yes

# The name of the cookie set by mon.cgi and its path
cookie_name = mon-cookie
cookie_path = /

# Default alternate fonts to use (assumes default font is a serif font)
fixed_font_face = courier
sans_serif_font_face = Helvetica, Arial

# Default color scheme for page
BGCOLOR = black
TEXTCOLOR = white
LINKCOLOR = yellow
VLINKCOLOR = #00FFFF

# Default colors for failed services
greenlight_color = #009900
redlight_color = red
unchecked_color = #000033
yellowlight_color = #FF9933

#
# A white-background look for mon.cgi, from Thomas Bates <cb...@tv...>
#
#BGCOLOR = #FFFFFF
#TEXTCOLOR = #000000
#LINKCOLOR = 0000FF
#VLINKCOLOR = #551a8b
#
#greenlight_color=#a0d0a0
#redlight_color=ff6060
#unchecked_color=f0f0f0
#disabled_color=#e0e0e0
#yellowlight_color = #FFAF4F

# Maximum number of downtime events to show, per page
dtlog_max_failures_per_page = 100

# Watch keywords will show only the specified hostgroups by default.
# Matching is by regexp.
# e.g., show the watch whose name is www
#watch = www
# e.g., show any watches whose names start with gw-
#watch = gw-.*

# Set show_watch_strict to 'yes' if you want to be sure that users only
# see information about the hostgroups that they are authorized to
# view. If show_watch_strict is set to 1, as far as your GUI users
# will know, there is nothing else running on the mon instance
# except for their hostgroups, *even if those users know the names
# of other hostgroups on your mon server*.
#
# Set show_watch_strict to 'no' to show only the defined watch
# groups by default, but allow users to see information about
# others as well.
show_watch_strict = no
|
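The header of mon.cgi.cf describes a simple "key = value" format: blank lines and lines beginning with '#' are skipped, keys and values are case sensitive, and parsing stops at the first error with a message on STDERR. A rough sketch of a reader for that format follows. The function name parse_moncgi_config is hypothetical, and two behaviors are assumptions rather than facts about mon.cgi: repeated 'watch' keys are collected into a list, and the pairs read before an error are still returned.

```perl
use strict;
use warnings;

# Parse mon.cgi.cf-style text into a hash ref.
# Assumption: 'watch' may repeat, so it is kept as an array ref.
sub parse_moncgi_config {
    my ($text) = @_;
    my %cf = (watch => []);
    my $lineno = 0;
    foreach my $line (split /\n/, $text) {
        $lineno++;
        next if $line =~ /^\s*$/;   # blank line
        next if $line =~ /^#/;      # comment only when '#' starts the line
        if ($line =~ /^\s*([A-Za-z_]\w*)\s*=\s*(.*?)\s*$/) {
            my ($key, $value) = ($1, $2);   # no case folding
            if ($key eq 'watch') {
                push @{ $cf{watch} }, $value;
            } else {
                $cf{$key} = $value;
            }
        } else {
            # Stop at the first malformed line and report it on STDERR
            warn "config error at line $lineno: $line\n";
            last;
        }
    }
    return \%cf;
}
```

Only a leading '#' marks a comment, so values such as "VLINKCOLOR = #00FFFF" or an app_secret containing '#' are kept whole.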
From: Jim T. <tr...@us...> - 2004-07-12 01:33:40
|
Update of /cvsroot/mon/mon/clients
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8582/clients

Added Files:
      Tag: mon-1-0-0pre1
	mon.cgi
Log Message:
incorporated mon.cgi 1.52

--- NEW FILE: mon.cgi ---
#!/usr/bin/perl -T
#!/usr/bin/perl -Tw broke when I made changes to list_dtlog that involved
# submitting three commas ",,," in a row into the value of $args :(
#
# NAME
# mon.cgi
#
#
# DESCRIPTION
# Web interface for the Mon resource monitoring system. mon.cgi
# implements a significant subset of the Perl interface to Mon, which
# allows administrators to quickly view the status of their network
# and perform many common Mon tasks with a simple web client.
#
# Requires mon 0.38-21 and Mon::Client 0.11 for proper operation.
#
#
# AUTHORS
# Originally by:

[...3793 lines suppressed...]

    # inside &moncgi_custom_commands.
    #
    # moncgi_custom_commands returns non-zero if it finds
    # a command to execute;
} else {
    # All else.
    &setup_page("Operation Status: Summary View");
    &query_opstatus("summary");
}

$webpage->print("<hr>");
#
# Some stuff we keep around for debugging
#
#print "commands is $command, args is $args<br>\n"; #DEBUG
#print $webpage->dump; #DEBUG
&end_page;
$c->disconnect();
|
From: Jim T. <tr...@us...> - 2004-07-12 01:33:39
|
Update of /cvsroot/mon/mon
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8582

Modified Files:
      Tag: mon-1-0-0pre1
	INSTALL
Log Message:
incorporated mon.cgi 1.52

Index: INSTALL
===================================================================
RCS file: /cvsroot/mon/mon/INSTALL,v
retrieving revision 1.1.1.1.2.3
retrieving revision 1.1.1.1.2.4
diff -C2 -d -r1.1.1.1.2.3 -r1.1.1.1.2.4
*** INSTALL 9 Jul 2004 03:18:52 -0000 1.1.1.1.2.3
--- INSTALL 12 Jul 2004 01:33:31 -0000 1.1.1.1.2.4
***************
*** 176,177 ****
--- 176,181 ----
+ WEB INTERFACE
+ -------------
+
+ (mention something about mon.cgi and monshow)
|
From: Jim T. <tr...@us...> - 2004-07-09 13:27:43
|
Update of /cvsroot/mon/mon
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv30102

Modified Files:
      Tag: mon-1-0-0pre1
	mon
Log Message:
handle_trap_timeout now calls process_event, and it works fine with
alert/upalert/alertevery/etc. as shown by my testing

Index: mon
===================================================================
RCS file: /cvsroot/mon/mon/mon,v
retrieving revision 1.4.2.8
retrieving revision 1.4.2.9
diff -C2 -d -r1.4.2.8 -r1.4.2.9
*** mon 9 Jul 2004 03:17:32 -0000 1.4.2.8
--- mon 9 Jul 2004 13:27:33 -0000 1.4.2.9
***************
*** 2966,2969 ****
--- 2966,2974 ----
          }
+         elsif ($type eq "T")
+         {
+             do_alert ($group, $service, $output, $exitval, $FL_TRAPTIMEOUT);
+         }
+
          $sref->{"_failure_output"} = $output;
      }
***************
*** 3967,3990 ****
      my $sref = \%{$watch{$group}->{$service}};
      $sref->{"_trap_timer"} = $sref->{"traptimeout"};
-     $sref->{"_failure_count"}++;
-     $sref->{"_consec_failures"}++;
-     $sref->{"_last_failure"} = $tmnow;
-     if ($sref->{"_op_status"} == $STAT_OK ||
-         $sref->{"_op_status"} == $STAT_UNKNOWN ||
-         $sref->{"_op_status"} == $STAT_UNTESTED)
-     {
-         $sref->{"_first_failure"} = $tmnow;
-     }
-     set_op_status ($group, $service, $STAT_FAIL);
-     $sref->{"_last_summary"} = "trap timeout";
-     $sref->{"_last_detail"} = "trap timeout after " . $sref->{"traptimeout"} . "s at " . localtime ($tmnow) . "\n";
-     shift @last_failures if (@last_failures > $CF{"MAX_KEEP"});
-     push @last_failures, "$group $service $tm $sref->{_last_summary}";
-     syslog ('crit', "failure for $last_failures[-1]");
!     do_alert ($group, $service, "$sref->{_last_summary}\n$sref->{_last_detail}",
!         0, $FL_TRAPTIMEOUT);
!
!     $sref->{"_failure_output"} = "$sref->{_last_summary}\n$sref->{_last_detail}";
  }
--- 3972,3979 ----
      my $sref = \%{$watch{$group}->{$service}};
      $sref->{"_trap_timer"} = $sref->{"traptimeout"};
!     process_event ("T", $group, $service, 1,
!         "trap timeout\n" .
!         "trap timeout after " . $sref->{"traptimeout"} . "s at " . localtime ($tmnow) . "\n");
  }
|
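The trap timeout handling in this commit amounts to a countdown that is re-armed when it expires and hands a "trap timeout" message to the shared event code. A stand-alone sketch of that pattern follows; it is an illustration, not mon's code. The functions tick_trap_timer and trap_received are hypothetical, as is the $on_timeout callback that stands in for process_event; only the "traptimeout" and "_trap_timer" key names come from the diff. Resetting the timer when a trap arrives is taken from the commit messages on this list, not from this diff.

```perl
use strict;
use warnings;

# $svc is a hash ref holding "traptimeout" (seconds) and "_trap_timer"
# (seconds remaining). $on_timeout is a hypothetical callback standing
# in for process_event ("T", ...).
sub tick_trap_timer {
    my ($svc, $elapsed, $on_timeout) = @_;
    $svc->{_trap_timer} -= $elapsed;
    if ($svc->{_trap_timer} <= 0) {
        $svc->{_trap_timer} = $svc->{traptimeout};   # re-arm the countdown
        $on_timeout->("trap timeout\n" .
            "trap timeout after " . $svc->{traptimeout} . "s at " .
            localtime(time) . "\n");
        return 1;
    }
    return 0;
}

# A received trap resets the countdown to the full timeout.
sub trap_received {
    my ($svc) = @_;
    $svc->{_trap_timer} = $svc->{traptimeout};
}
```

The first line of the message plays the role of the summary and the remainder the detail, mirroring the two-part string passed to process_event in the diff.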
From: Jim T. <tr...@us...> - 2004-07-09 03:19:08
|
Update of /cvsroot/mon/mon/doc
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv2708/doc

Modified Files:
      Tag: mon-1-0-0pre1
	mon.8
Log Message:
some clarifications in INSTALL documentation submitted by Eric Sorenson <er...@tr...>

minor addition to alert behavior explanation in mon.8

in dialin.monitor.wrap.c, return the exit status of execv (if it fails, that is)

fixed path to perl in file_change.monitor and smtp3.monitor

Index: mon.8
===================================================================
RCS file: /cvsroot/mon/mon/doc/mon.8,v
retrieving revision 1.1.1.1.2.1
retrieving revision 1.1.1.1.2.2
diff -C2 -d -r1.1.1.1.2.1 -r1.1.1.1.2.2
*** mon.8 17 Jun 2004 20:28:54 -0000 1.1.1.1.2.1
--- mon.8 9 Jul 2004 03:18:53 -0000 1.1.1.1.2.2
***************
*** 357,361 ****
  unless
  .B no_comp_alerts
! is defined in the period section.
  If an alert was already sent within the last
  .B alertevery
--- 357,362 ----
  unless
  .B no_comp_alerts
! is defined in the period section. An upalert will only be sent
! if the previous state is a failure.
  If an alert was already sent within the last
  .B alertevery
|
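The sentence added to mon.8 ("An upalert will only be sent if the previous state is a failure") can be read as a small predicate. The sketch below is an illustration under assumptions, not mon's implementation: the function name should_send_upalert and its named arguments are hypothetical, and the upalertafter check is modeled on the condition visible in the proc_cleanup code elsewhere on this list.

```perl
use strict;
use warnings;

# Decide whether an upalert is due when a service reports success.
# Arguments (all hypothetical names):
#   prev_failed     true if the previous state was a failure
#   upalert_defined true if an upalert is configured for the period
#   upalertafter    optional minimum outage length in seconds
#   first_failure   epoch time the outage began
#   now             current epoch time
sub should_send_upalert {
    my (%a) = @_;
    return 0 unless $a{prev_failed};        # previous state must be a failure
    return 0 unless $a{upalert_defined};
    return 1 unless defined $a{upalertafter};
    return ($a{now} - $a{first_failure}) >= $a{upalertafter} ? 1 : 0;
}
```

A service that was already up therefore never produces an upalert, however many successful checks follow.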
From: Jim T. <tr...@us...> - 2004-07-09 03:19:08
|
Update of /cvsroot/mon/mon In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv2708 Modified Files: Tag: mon-1-0-0pre1 INSTALL Log Message: some clarifications in INSTALL documentation submitted by Eric Sorenson <er...@tr...> minor addition to alert behavior explanation in mon.8 in dialin.monitor.wrap.c, return the exit status of execv (if it fails, that is) fixed path to perl in file_change.monitor and smtp3.monitor Index: INSTALL =================================================================== RCS file: /cvsroot/mon/mon/INSTALL,v retrieving revision 1.1.1.1.2.2 retrieving revision 1.1.1.1.2.3 diff -C2 -d -r1.1.1.1.2.2 -r1.1.1.1.2.3 *** INSTALL 18 Jun 2004 14:40:10 -0000 1.1.1.1.2.2 --- INSTALL 9 Jul 2004 03:18:52 -0000 1.1.1.1.2.3 *************** *** 1,61 **** $Id$ ! INSTALLATION ! ------------ ! ! Several parts: ! ! 1. mon, the server ! 2. Mon::Client, the Perl library used by some clients. 3. C programs in mon.d ! REQUIREMENTS ! ------------ ! The "mon" daemon uses Perl 5.n, where n >= 005_01. Older versions of Perl had ! problems with Sys::Syslog under Linux, and had dated versions of ! Text::ParseWords. Mon also requires that *.ph be created from the system ! header files. If you're using a pre-packaged Perl (such as from RedHat) then ! this has been done for you already. Otherwise, this is done manually during ! Perl installation by these means: cd /usr/include ! h2ph *.h sys/*.h ! However, if you're running Linux you may need to run ! cd /usr/include ! h2ph *.h sys/*.h asm/*.h - If you try to run mon and Perl complains with the "did you run h2ph?" - message, then chances are this step wasn't done. ! You'll need the following modules for the server to function, all of ! which are available from your nearest CPAN archive, or the place ! where you got mon: ! -Time::Period (the one written by Patrick Ryan) ! -Time::HiRes ! -Convert::BER ! -Mon::* All of the monitor and alert scripts that are packaged with mon are actually *optional*. 
However, this is what you'll need for each special ! monitor: ! freespace.monitor ! The disk space monitor requires the "Filesys::DiskSpace" Perl ! module from CPAN. ! fping.monitor ! Requires the "fping" code, probably available from the same ! place that you got this package. ! telnet.monitor ! This requires the Net::Telnet Perl module, available from ! CPAN. reboot.monitor --- 1,87 ---- $Id$ ! OVERVIEW ! -------- ! There are several components you'll need to get working to ! have a fully functional mon installation. + 1. mon, the server + 2. Mon::Client, the Perl library used by some clients 3. C programs in mon.d + 4. Optional (but highly useful) monitors + 5. A customized mon.cf to make the server do what you want ! 1. MON SERVER ! ------------- ! The "mon" daemon uses Perl 5.n, where n >= 005_01. ! ! Mon requires that *.ph be created from the system header files. If you try to ! run mon and Perl complains with the "did you run h2ph?" message, then chances ! are this step wasn't done, either by your package manager or manually after ! Perl installation. You can fix it by doing the following, as root: cd /usr/include ! h2ph -r -l . ! You'll need the following modules for the server to function, all of ! which are available from your nearest CPAN archive. The listed ! CPAN paths relative to /cpan/modules/by-authors/id/ -- versions of ! modules on CPAN change quickly, so there may be newer versions available, ! but the following are known to work: ! Time::Period PRYAN/Period-1.20.tar.gz ! Time::HiRes J/JH/JHI/Time-HiRes-1.59.tar.gz ! Convert::BER G/GB/GBARR/Convert-BER-1.3101.tar.gz ! 2. INSTALLING THE PERL CLIENT MODULE ! ------------------------------------ ! The Perl client module is distributed as a separate package. It is named ! "mon-client-*.tar.gz". Refer to that for installation instructions. ! It is available on kernel.org mirrors in the /pub/software/admin/mon directory, ! and in CVS on sourceforge.net. Be sure to match the version of mon-client with ! 
the version of mon you are using. At this time, branch "mon-1-0-0pre1" of the ! mon CVS module matches the "mon-client-1-0-0pre1" branch of the mon-client CVS ! module. See http://sourceforge.net/projects/mon/ for information on CVS access. ! ! ! 3. COMPILING THE C CODE (optional) ! ---------------------------------- ! ! Some of the monitors included with mon are written in C and need to ! be compiled for your system. If you want to use the RPC monitor or the ! dialin.monitor wrapper, ! ! cd mon.d ! (edit Makefile) ! make ! make install ! cd .. ! ! Keep in mind that although this is known to work on Linux, Solaris, and AIX, ! it may not compile on your system. It is not required for the operation of mon ! itself. ! ! ! 4. MONITORS ! ----------- All of the monitor and alert scripts that are packaged with mon are actually *optional*. However, this is what you'll need for each special ! monitor, with CPAN paths relative to /cpan/modules/by-author/id/ ! freespace.monitor - requires Filesys::Diskspace from CPAN, ! in FTASSIN/Filesys-DiskSpace-0.05.tar.gz ! ! fping.monitor - requires the 'fping' binary, from http://www.fping.com ! RPM packages available at http://dag.wieers.com/packages/fping/ ! telnet.monitor - requires the Net::Telnet from CPAN, ! in J/JR/JROGERS/Net-Telnet-3.03.tar.gz reboot.monitor *************** *** 64,78 **** process.monitor hpnp.monitor ! All use the UCD SNMP 3.6.3, along with G.S. Marzot's ! Perl module. ! ! ldap.monitor ! requires the Net::LDAPapi Perl module, available from CPAN. ! dialin.monitor ! requires the Perl Expect module, available from CPAN. ! dns.monitor ! requires the Net::DNS Perl module. msql-mysql.monitor --- 90,104 ---- process.monitor hpnp.monitor ! Use the 'net-snmp' package (formerly UCD SNMP), from ! http://sourceforge.net/projects/net-snmp ! with G.S. Marzot's Perl module G/GS/GSM/SNMP-4.2.0.tar.gz ! ldap.monitor - requires Net::LDAPapi from CPAN, ! CDONLEY/Net-LDAPapi-1.42.tar.gz ! 
dialin.monitor - requires the Perl Expect module from CPAN, ! R/RG/RGIERSIG/Expect-1.15.tar.gz ! dns.monitor - requires Net::DNS from CPAN, ! C/CR/CREIN/Net-DNS-0.47.tar.gz msql-mysql.monitor *************** *** 83,110 **** details. ! ! 1. INSTALLING THE PERL CLIENT MODULE ! ------------------------------------ ! ! The Perl client module is distributed as a separate package. It is named ! "mon-client-*.tar.gz". Refer to that for installation instructions. This ! module is available in CPAN (http://www.perl.com/CPAN/), on kernel.org mirrors ! in the /pub/software/admin/mon directory, and in CVS on sourceforge.net. ! Be sure to match the version of mon-client with the version of mon you ! are using. ! ! If you are using a CVS release of the mon server, you will want ! to be sure to match it with the corresponding version from the ! "mon-client" module. At this time, branch "mon-1-0-0pre1" of the ! mon CVS module matches the "mon-client-1-0-0pre1" branch of the ! mon-client CVS module. See http://sourceforge.net/projects/mon/ for ! information on CVS access. ! ! ! 2. MON SERVER INSTALLATION ! -------------------------- -Read the man page for "mon" and "moncmd" in the doc/ directory to get ! an overview about the directories involved, i.e. the configuration, alert, monitors, state, and run directories. --- 109,117 ---- details. ! 5. MON.CF CUSTOMIZATION AND STARTUP ! ----------------------------------- -Read the man page for "mon" and "moncmd" in the doc/ directory to get ! an overview of the directories involved, i.e. the configuration, alert, monitors, state, and run directories. *************** *** 169,182 **** - 3. COMPILING THE C CODE (optional) - ---------------------------------- - - -cd mon.d - (edit Makefile) - make - make install - cd .. - - to build the RPC monitor and the dialin.monitor wrapper. Keep in mind - that if this may fail for some reason (it works under Linux, Solaris, - and AIX), it is not required for the operation of mon itself. 
--- 176,177 ---- |
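The revised INSTALL lists the Perl modules the server needs (Time::Period, Time::HiRes, Convert::BER). One quick way to see which of them are absent before starting mon is sketched below. The helper name missing_modules is hypothetical and is not part of the mon distribution.

```perl
use strict;
use warnings;

# Return the subset of the given module names that cannot be loaded.
sub missing_modules {
    my @missing;
    foreach my $mod (@_) {
        (my $file = "$mod.pm") =~ s{::}{/}g;   # Time::Period -> Time/Period.pm
        push @missing, $mod unless eval { require $file; 1 };
    }
    return @missing;
}

# Modules named in the MON SERVER section of INSTALL
print "missing: $_\n" for missing_modules(qw(Time::Period Time::HiRes Convert::BER));
```

The same helper can be pointed at the optional monitor dependencies (Net::Telnet, Net::DNS, and so on) for the monitors you plan to use.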
From: Jim T. <tr...@us...> - 2004-07-09 03:19:08
|
Update of /cvsroot/mon/mon/mon.d
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv2708/mon.d

Modified Files:
      Tag: mon-1-0-0pre1
	dialin.monitor.wrap.c file_change.monitor smtp3.monitor
Log Message:
some clarifications in INSTALL documentation submitted by Eric Sorenson <er...@tr...>

minor addition to alert behavior explanation in mon.8

in dialin.monitor.wrap.c, return the exit status of execv (if it fails, that is)

fixed path to perl in file_change.monitor and smtp3.monitor

Index: file_change.monitor
===================================================================
RCS file: /cvsroot/mon/mon/mon.d/file_change.monitor,v
retrieving revision 1.1.1.1
retrieving revision 1.1.1.1.2.1
diff -C2 -d -r1.1.1.1 -r1.1.1.1.2.1
*** file_change.monitor 9 Jun 2004 05:18:05 -0000 1.1.1.1
--- file_change.monitor 9 Jul 2004 03:18:53 -0000 1.1.1.1.2.1
***************
*** 1,3 ****
! #!/usr/local/bin/perl
  #
  # mon monitor to watch for file changes
--- 1,3 ----
! #!/usr/bin/perl
  #
  # mon monitor to watch for file changes
***************
*** 130,136 ****
  $StateFile = "$ENV{MON_STATEDIR}/$StateFile";
! $CI = '/usr/bin/ci';
! $CI = '/usr/local/bin/ci';
! #$CI = 'ci'; # Assume that RCS's ci is in the path
  print "Will use RCS: $RCS\n" if $Debug;
--- 130,134 ----
  $StateFile = "$ENV{MON_STATEDIR}/$StateFile";
! $CI = 'ci'; # Assume that RCS's ci is in the path
  print "Will use RCS: $RCS\n" if $Debug;

Index: dialin.monitor.wrap.c
===================================================================
RCS file: /cvsroot/mon/mon/mon.d/dialin.monitor.wrap.c,v
retrieving revision 1.1.1.1
retrieving revision 1.1.1.1.2.1
diff -C2 -d -r1.1.1.1 -r1.1.1.1.2.1
*** dialin.monitor.wrap.c 9 Jun 2004 05:18:05 -0000 1.1.1.1
--- dialin.monitor.wrap.c 9 Jul 2004 03:18:53 -0000 1.1.1.1.2.1
***************
*** 26,29 ****
  /* exec */
! execv (real_img, argv);
  }
--- 26,29 ----
  /* exec */
! return execv (real_img, argv);
  }

Index: smtp3.monitor
===================================================================
RCS file: /cvsroot/mon/mon/mon.d/smtp3.monitor,v
retrieving revision 1.1.1.1.2.1
retrieving revision 1.1.1.1.2.2
diff -C2 -d -r1.1.1.1.2.1 -r1.1.1.1.2.2
*** smtp3.monitor 22 Jun 2004 15:31:00 -0000 1.1.1.1.2.1
--- smtp3.monitor 9 Jul 2004 03:18:53 -0000 1.1.1.1.2.2
***************
*** 1,3 ****
! #!/usr/local/bin/perl
  # Yet another smtp monitor using IO::Socket with timing and logging
--- 1,3 ----
! #!/usr/bin/perl
  # Yet another smtp monitor using IO::Socket with timing and logging
|
From: Jim T. <tr...@us...> - 2004-07-09 03:17:47
|
Update of /cvsroot/mon/mon In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv2645 Modified Files: Tag: mon-1-0-0pre1 mon Log Message: received traps now reset the trap timeout counter, and fixed some other stuff wrt trap timeouts added sub process_event and made proc_cleanup and handle_trap use it so that the alert mgmt code is shared rather than in two places. i tested as much of it as i could and all seems to work well now, especially upalert, alertafter, alertevery with traps. added per-service "_monitor_duration" variable which records how many seconds the previous monitor took to execute. this is available via "list opstatus". if no monitor has executed yet then the value is -1. added per-service "_monitor_running" variable whose value is 0 or 1 depending on whether the monitor is currently running for that service. removed gunk from handle_trap regarding the various TRAP_COLDSTART, etc. processing, since most of it was a bad idea anyway, or at least as far as i could tell. traps and their exit values are now processed exactly as monitors are, which simplifies things greatly and adds to more intuitive functionality. this means the "spc" value in a trap is now ignored. fixed some args processing in call_alert Index: mon =================================================================== RCS file: /cvsroot/mon/mon/mon,v retrieving revision 1.4.2.7 retrieving revision 1.4.2.8 diff -C2 -d -r1.4.2.7 -r1.4.2.8 *** mon 28 Jun 2004 20:10:21 -0000 1.4.2.7 --- mon 9 Jul 2004 03:17:32 -0000 1.4.2.8 *************** *** 96,99 **** --- 96,100 ---- sub pam_conv_func; sub proc_cleanup; + sub process_event; sub randomize_startdelay; sub read_cf; *************** *** 389,394 **** $sref->{"_trap_timer"} -= $t; ! if ($sref->{"_trap_timer"} <= 0 && $tm - $sref->{"_last_uptrap"} > ! $sref->{"traptimeout"}) { handle_trap_timeout ($group, $service); } --- 390,395 ---- $sref->{"_trap_timer"} -= $t; ! if ($sref->{"_trap_timer"} <= 0) ! 
{ handle_trap_timeout ($group, $service); } *************** *** 508,511 **** --- 509,514 ---- my ($sref, $range, @alerts); + debug (1, "do_alert flags=$flags\n"); + $sref = \%{$watch{$group}->{$service}}; *************** *** 1207,1210 **** --- 1210,1219 ---- $sref->{"_exitval"} = "undef" if (!defined($sref->{"_exitval"})); $sref->{"_last_check"} = undef; + # + # -1 for _monitor_duration means no monitor has been run yet + # so there is no duration data available + # + $sref->{"_monitor_duration"} = -1; + $sref->{"_monitor_running"} = 0; $sref->{"_depend_status"} = undef; $sref->{"failure_interval"} = undef; *************** *** 2678,2683 **** if ($sref->{"_last_failure"}); ! $buf .= " interval=$sref->{interval}" ! if ($sref->{"interval"}); $buf .= " exclude_period='$sref->{exclude_period}'" --- 2687,2697 ---- if ($sref->{"_last_failure"}); ! ! if ($sref->{"interval"}) ! { ! $buf .= " interval=$sref->{interval}" . ! " monitor_duration=$sref->{_monitor_duration}" . ! " monitor_running=$sref->{_monitor_running}" ! } $buf .= " exclude_period='$sref->{exclude_period}'" *************** *** 2864,3011 **** } ! $sref->{"_exitval"} = int($?>>8); ! debug (1, "PID $p ($runningpid{$p}) exited with [$sref->{'_exitval'}]\n"); ! $sref->{"_last_checked"} = $tmnow; ! if ($sref->{"depend"} ne "" && ! $sref->{"dep_behavior"} eq "a") ! { ! dep_ok ($sref); ! } # ! # error exit value # ! if ($?) { ! # ! # accounting ! # ! $sref->{"_failure_count"}++; ! $sref->{"_consec_failures"}++; ! $sref->{"_last_failure"} = $tmnow; ! if ($sref->{"_op_status"} == $STAT_OK || ! $sref->{"_op_status"} == $STAT_UNKNOWN || ! $sref->{"_op_status"} == $STAT_UNTESTED) ! { ! $sref->{"_first_failure"} = $tmnow; ! } ! set_op_status ($group, $service, $STAT_FAIL); ! my ($summary, $detail) = split("\n", $ibufs{$runningpid{$p}}, 2); ! $summary = "(NO SUMMARY)" if ($summary =~ /^\s*$/m); ! $sref->{"_last_summary"} = $summary; ! $sref->{"_last_detail"} = $detail; ! 
shift @last_failures if (@last_failures > $CF{"MAX_KEEP"}); ! push @last_failures, "$group $service" . ! " $tm $summary"; ! syslog ('crit', "failure for $last_failures[-1]"); ! # ! # send an alert if necessary ! # ! do_alert ($group, $service, $ibufs{$runningpid{$p}}, ! $?>>8, $FL_MONITOR); # # change interval if needed # if (defined ($sref->{"failure_interval"}) && ! $sref->{"_old_interval"} == undef) { ! $sref->{"_old_interval"} = $sref->{"interval"}; $sref->{"interval"} = $sref->{"failure_interval"}; $sref->{"_next_check"} = 0; } - - $sref->{"_failure_output"} = $ibufs{$runningpid{$p}}; } ! # ! # success exit value ! # ! else { ! if ($CF{"DTLOGGING"} && defined ($sref->{"_op_status"}) && ! $sref->{"_op_status"} == $STAT_FAIL) ! { ! write_dtlog ($sref, $group, $service); ! } ! my $old_status = $sref->{"_op_status"}; ! set_op_status ($group, $service, $STAT_OK); ! # ! # if this service has just come back up and ! # we are paying attention to this event, ! # let someone know ! # ! if ((defined ($sref->{"_op_status"})) && ! ($old_status == $STAT_FAIL) && ! (defined($sref->{"_upalert"})) && ! (!defined($sref->{"upalertafter"}) ! || (($tmnow - $sref->{"_first_failure"}) >= $sref->{"upalertafter"}))) ! { ! do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT); ! } ! # ! # send also when no upalertafter set ! # ! elsif (defined($sref->{"_upalert"})) ! { ! do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT); ! } ! $sref->{"_ack"} = 0; ! $sref->{"_ack_comment"} = ''; ! $sref->{"_first_failure"} = 0; ! $sref->{"_last_failure"} = 0; ! $sref->{"_consec_failures"} = 0; ! my ($summary, $detail) = split("\n", $ibufs{$runningpid{$p}}, 2); ! $sref->{"_last_summary"} = $summary; ! $sref->{"_last_detail"} = $detail; ! # ! # reset the alertevery timer ! # ! foreach my $period (keys %{$sref->{"periods"}}) ! { ! # ! # "alertevery strict" should not reset _last_alert ! # ! if (!$sref->{"periods"}->{$period}->{"_alertevery_strict"}) ! { ! 
$sref->{"periods"}->{$period}->{"_last_alert"} = 0; ! } ! $sref->{"periods"}->{$period}->{"_1stfailtime"} = 0; ! $sref->{"periods"}->{$period}->{"_alert_sent"} = 0; ! } # ! # change interval back to original # ! if (defined ($sref->{"failure_interval"}) && ! $sref->{"_old_interval"} != undef) { ! $sref->{"interval"} = $sref->{"_old_interval"}; ! $sref->{"_old_interval"} = undef; ! $sref->{"_next_check"} = 0; } ! $sref->{"_last_success"} = $tmnow; ! } # ! # save the output # ! $sref->{"_last_output"} = $ibufs{$runningpid{$p}}; ! reset_timer ($group, $service); - remove_proc ($p); } } --- 2878,3057 ---- } ! debug (1, "PID $p ($runningpid{$p}) exited with [" . int ($?>>8) . "]\n"); ! $sref->{"_monitor_duration"} = $tmnow - $sref->{"_last_check"}; ! $sref->{"_monitor_running"} = 0; ! ! process_event ("m", $group, $service, int ($?>>8), $ibufs{$runningpid{$p}}); + reset_timer ($group, $service); + + remove_proc ($p); + } + } + + + # + # handle the event where a monitor exits or a trap is received + # + # $type is "m" for monitor, "t" for trap + # + sub process_event { + my ($type, $group, $service, $exitval, $output) = @_; + + debug (1, "process_event type=$type group=$group service=$service exitval=$exitval output=[$output]\n"); + + my $sref = \%{$watch{$group}->{$service}}; + my $tmnow = time; + + my ($summary, $detail) = split("\n", $output, 2); + + $sref->{"_exitval"} = $exitval; + + if ($sref->{"depend"} ne "" && + $sref->{"dep_behavior"} eq "a") + { + dep_ok ($sref); + } + + # + # error exit value + # + if ($exitval) + { # ! # accounting # ! $sref->{"_failure_count"}++; ! $sref->{"_consec_failures"}++; ! $sref->{"_last_failure"} = $tmnow; ! if ($sref->{"_op_status"} == $STAT_OK || ! $sref->{"_op_status"} == $STAT_UNKNOWN || ! $sref->{"_op_status"} == $STAT_UNTESTED) { ! $sref->{"_first_failure"} = $tmnow; ! } ! set_op_status ($group, $service, $STAT_FAIL); ! $summary = "(NO SUMMARY)" if ($summary =~ /^\s*$/m); ! $sref->{"_last_summary"} = $summary; ! 
$sref->{"_last_detail"} = $detail; ! shift @last_failures if (@last_failures > $CF{"MAX_KEEP"}); ! push @last_failures, "$group $service" . ! " $tm $summary"; ! syslog ('crit', "failure for $last_failures[-1]"); + # + # send an alert if necessary + # + if ($type eq "m") + { + do_alert ($group, $service, $output, $exitval, $FL_MONITOR); # # change interval if needed # if (defined ($sref->{"failure_interval"}) && ! $sref->{"_old_interval"} == undef) { ! $sref->{"_old_interval"} = $sref->{"interval"}; $sref->{"interval"} = $sref->{"failure_interval"}; $sref->{"_next_check"} = 0; } } ! elsif ($type eq "t") { ! do_alert ($group, $service, $output, $exitval, $FL_TRAP); ! } ! $sref->{"_failure_output"} = $output; ! } ! # ! # success exit value ! # ! else ! { ! if ($CF{"DTLOGGING"} && defined ($sref->{"_op_status"}) && ! $sref->{"_op_status"} == $STAT_FAIL) ! { ! write_dtlog ($sref, $group, $service); ! } ! my $old_status = $sref->{"_op_status"}; ! set_op_status ($group, $service, $STAT_OK); ! if ($type eq "t") ! { ! $sref->{"_last_uptrap"} = $tmnow; ! } ! # ! # if this service has just come back up and ! # we are paying attention to this event, ! # let someone know ! # ! if ((defined ($sref->{"_op_status"})) && ! ($old_status == $STAT_FAIL) && ! (defined($sref->{"_upalert"})) && ! (!defined($sref->{"upalertafter"}) ! || (($tmnow - $sref->{"_first_failure"}) >= $sref->{"upalertafter"}))) ! { ! do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT); ! } ! # ! # send also when no upalertafter set ! # ! elsif (defined($sref->{"_upalert"}) && $old_status == $STAT_FAIL) ! { ! do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT); ! } ! ! $sref->{"_ack"} = 0; ! $sref->{"_ack_comment"} = ''; ! $sref->{"_first_failure"} = 0; ! $sref->{"_last_failure"} = 0; ! $sref->{"_consec_failures"} = 0; + # + # reset the alertevery timer + # + foreach my $period (keys %{$sref->{"periods"}}) + { # ! # "alertevery strict" should not reset _last_alert # ! 
if (!$sref->{"periods"}->{$period}->{"_alertevery_strict"}) { ! $sref->{"periods"}->{$period}->{"_last_alert"} = 0; } ! $sref->{"periods"}->{$period}->{"_1stfailtime"} = 0; ! $sref->{"periods"}->{$period}->{"_alert_sent"} = 0; } # ! # change interval back to original # ! if (defined ($sref->{"failure_interval"}) && ! $sref->{"_old_interval"} != undef) ! { ! $sref->{"interval"} = $sref->{"_old_interval"}; ! $sref->{"_old_interval"} = undef; ! $sref->{"_next_check"} = 0; ! } ! $sref->{"_last_success"} = $tmnow; } + + # + # save the output + # + $sref->{"_last_output"} = $output; + $sref->{"_last_summary"} = $summary; + $sref->{"_last_detail"} = $detail; } *************** *** 3162,3165 **** --- 3208,3212 ---- $sref->{"_last_check"} = scalar (time); + $sref->{"_monitor_running"} = 1; debug (1, "watching file handle ", fileno ($fhandles{"$group/$service"}), *************** *** 3729,3736 **** my $time = time; - my $noalert = 0; my %trap = (); my $flags = 0; my $tmnow = time; # --- 3776,3784 ---- my $time = time; my %trap = (); my $flags = 0; my $tmnow = time; + my $intended; + my $fromip; # *************** *** 3741,3750 **** # pas password # typ type ("failure", "up", "startup", "trap", "traptimeout") ! # spc specific type (TRAP_*) # seq sequence # grp group # svc service # hst host ! # sta status (opstatus) # tsp timestamp as time(2) value # sum summary output --- 3789,3798 ---- # pas password # typ type ("failure", "up", "startup", "trap", "traptimeout") ! # spc specific type (STAT_OK, etc.) THIS IS NO LONGER USED # seq sequence # grp group # svc service # hst host ! # sta status (same as exit status of a monitor) # tsp timestamp as time(2) value # sum summary output *************** *** 3752,3877 **** # ! foreach my $line (split (/\n/, $buf)) { ! if ($line =~ /^(\w+)=(.*)/) { ! my $trap_name = $1; ! my $trap_val = $2; ! chomp $trap_val; ! $trap_val =~ s/^\'(.*)\'$/\1/; ! $trap{$trap_name} = un_esc_str ($trap_val); } else { ! 
syslog ('err', "unspecified tag in trap: $line"); } - } - - $trap{"sum"} = "$trap{sum}\n" if ($trap{"sum"} !~ /\n$/); ! my ($port, $addr) = sockaddr_in ($from); ! my $fromip = inet_ntoa ($addr); ! # ! # trap authentication ! # ! my ($traphost, $trapuser, $trappass); ! if (defined ($AUTHTRAPS{"*"})) ! { ! $traphost = "*"; ! } ! ! else ! { ! $traphost = $fromip; ! } ! if (defined ($AUTHTRAPS{$traphost}{"*"})) ! { ! $trapuser = "*"; ! $trappass = ""; ! } ! else ! { ! $trapuser = $trap{"usr"}; ! $trappass = $trap{"pas"}; ! } ! if (!defined ($AUTHTRAPS{$traphost})) ! { ! syslog ('err', "received trap from unauthorized host: $fromip"); ! return undef; ! } ! if ($trapuser ne "*" && $AUTHTRAPS{$traphost}{$trapuser} && ! crypt ($trappass, $AUTHTRAPS{$traphost}{$trapuser}) ne ! $AUTHTRAPS{$traphost}{$trapuser}) ! { ! syslog ('err', "received trap from unauthorized user $trapuser, host $traphost"); ! return undef; ! } ! # ! # protocol version ! # ! if ($trap{"pro"} < $TRAP_PRO_VERSION) ! { ! syslog ('err', "cannot handle traps from version less than $TRAP_PRO_VERSION"); ! return undef; ! } ! # ! # validate trap type ! # ! if (!defined $trap{"typ"} || !defined ($trap{"spc"})) ! { ! syslog ('err', "no trap type specified from $fromip"); ! return undef; } # ! # validate trap type ! # ! ! # ! # if mon receives a trap for an unknown group/service, then the ! 
# default/default group/service should catch these if it is defined # - - my $intended; - if (!defined $watch{$trap{"grp"}} && defined $watch{"default"}) - { - $trap{"grp"} = "default"; - } - - if ((!defined ($groups{$trap{"grp"}}) && - !defined $watch{$trap{"grp"}}->{$trap{"svc"}}) && - (defined($groups{'default'}) && - defined($watch{'default'}->{'default'}))) - { - $intended = "$trap{'grp'}:$trap{'svc'}"; - $trap{"grp"} = "default"; - $trap{"svc"} = "default"; - } - - if (!defined ($groups{$trap{"grp"}})) - { - syslog ('err', "trap received for undefined group $trap{grp}"); - return; - } - - elsif (!defined $watch{$trap{"grp"}}->{$trap{"svc"}}) - { - syslog ('err', "trap received for undefined service type $trap{grp}/$trap{svc}"); - return; - } - my $sref = \%{$watch{$trap{"grp"}}->{$trap{"svc"}}}; - $sref->{"_last_trap"} = $time; - $sref->{"_last_detail"} = $trap{"dtl"}; - $sref->{"_last_summary"} = $trap{"sum"}; # --- 3800,3925 ---- # ! # ! # this part validates the trap ! # ! if (1) { ! foreach my $line (split (/\n/, $buf)) { ! if ($line =~ /^(\w+)=(.*)/) ! { ! my $trap_name = $1; ! my $trap_val = $2; ! chomp $trap_val; ! $trap_val =~ s/^\'(.*)\'$/\1/; ! $trap{$trap_name} = un_esc_str ($trap_val); ! } ! ! else ! { ! syslog ('err', "unspecified tag in trap: $line"); ! } ! } ! ! $trap{"sum"} = "$trap{sum}\n" if ($trap{"sum"} !~ /\n$/); ! ! my ($port, $addr) = sockaddr_in ($from); ! my $fromip = inet_ntoa ($addr); ! ! # ! # trap authentication ! # ! my ($traphost, $trapuser, $trappass); ! ! if (defined ($AUTHTRAPS{"*"})) ! { ! $traphost = "*"; } else { ! $traphost = $fromip; } ! if (defined ($AUTHTRAPS{$traphost}{"*"})) ! { ! $trapuser = "*"; ! $trappass = ""; ! } ! else ! { ! $trapuser = $trap{"usr"}; ! $trappass = $trap{"pas"}; ! } ! if (!defined ($AUTHTRAPS{$traphost})) ! { ! syslog ('err', "received trap from unauthorized host: $fromip"); ! return undef; ! } ! if ($trapuser ne "*" && $AUTHTRAPS{$traphost}{$trapuser} && ! 
crypt ($trappass, $AUTHTRAPS{$traphost}{$trapuser}) ne ! $AUTHTRAPS{$traphost}{$trapuser}) ! { ! syslog ('err', "received trap from unauthorized user $trapuser, host $traphost"); ! return undef; ! } ! # ! # protocol version ! # ! if ($trap{"pro"} < $TRAP_PRO_VERSION) ! { ! syslog ('err', "cannot handle traps from version less than $TRAP_PRO_VERSION"); ! return undef; ! } ! # ! # validate trap type ! # ! if (!defined $trap{"sta"}) ! { ! syslog ('err', "no trap sta value specified from $fromip"); ! return undef; ! } ! # ! # if mon receives a trap for an unknown group/service, then the ! # default/default group/service should catch these if it is defined ! # ! if (!defined $watch{$trap{"grp"}} && defined $watch{"default"}) ! { ! $trap{"grp"} = "default"; ! } ! if ((!defined ($groups{$trap{"grp"}}) && ! !defined $watch{$trap{"grp"}}->{$trap{"svc"}}) && ! (defined($groups{'default'}) && ! defined($watch{'default'}->{'default'}))) ! { ! $intended = "$trap{'grp'}:$trap{'svc'}"; ! $trap{"grp"} = "default"; ! $trap{"svc"} = "default"; ! } ! if (!defined ($groups{$trap{"grp"}})) ! { ! syslog ('err', "trap received for undefined group $trap{grp}"); ! return; ! } ! ! elsif (!defined $watch{$trap{"grp"}}->{$trap{"svc"}}) ! { ! syslog ('err', "trap received for undefined service type $trap{grp}/$trap{svc}"); ! return; ! } } # ! # trap has been validated, proceed # my $sref = \%{$watch{$trap{"grp"}}->{$trap{"svc"}}}; # *************** *** 3880,3886 **** if (exists $sref->{"traptimeout"}) { ! $sref->{"_trap_timer"} = $sref->{"traptimeout"}; } if ($intended) { --- 3928,3936 ---- if (exists $sref->{"traptimeout"}) { ! $sref->{"_trap_timer"} = $sref->{"traptimeout"}; } + $sref->{"_last_trap"} = $time; + if ($intended) { *************** *** 3888,4016 **** } - my $old_status = $sref->{"_op_status"}; - syslog ('info', "trap $trap{typ} $trap{spc} from " . "$fromip for $trap{grp} $trap{svc}, status $trap{sta}"); ! my $group = $trap{"grp"}; ! my $service = $trap{"svc"}; ! ! # ! 
# Not sure what I want to do with this. It's not done, and ! # just because it's here doesn't mean that it is meant to work ! # how it is coded. ! # ! if (1) ! { ! if ($trap{"spc"} == $STAT_COLDSTART) ! { ! set_op_status ($group, $service, $STAT_COLDSTART); ! $sref->{"_trap_duration_timer"} = $sref->{"trapduration"} ! if ($sref->{"trapduration"}); ! } ! ! elsif ($trap{"spc"} == $STAT_WARMSTART) ! { ! set_op_status ($group, $service, $STAT_WARMSTART); ! $sref->{"_trap_duration_timer"} = $sref->{"trapduration"} ! if ($sref->{"trapduration"}); ! $sref->{"_last_uptrap"} = $time; ! } ! ! elsif ($trap{"spc"} == $STAT_LINKDOWN) ! { ! set_op_status ($group, $service, $STAT_LINKDOWN); ! $sref->{"_failure_count"}++; ! $sref->{"_first_failure"} = $tm if ($sref->{"_op_status"} != $STAT_FAIL); ! $sref->{"_trap_duration_timer"} = $sref->{"trapduration"} ! if ($sref->{"trapduration"}); ! } ! ! elsif ($trap{"spc"} == $STAT_OK) ! { ! if ($CF{"DTLOGGING"} && defined ($sref->{"_op_status"}) && ! $sref->{"_op_status"} == $STAT_FAIL) ! { ! write_dtlog ($sref, $group, $service); ! } ! ! set_op_status ($group, $service, $STAT_OK); ! $sref->{"_last_uptrap"} = $time; ! $sref->{"_trap_duration_timer"} = $sref->{"trapduration"} ! if ($sref->{"trapduration"}); ! } ! ! elsif ($trap{"spc"} == $STAT_FAIL) ! { ! set_op_status ($group, $service, $STAT_FAIL); ! $sref->{"_first_failure"} = $tm if ($sref->{"_op_status"} != $STAT_FAIL); ! $sref->{"_trap_duration_timer"} = $sref->{"trapduration"} ! if ($sref->{"trapduration"}); ! } ! ! elsif ($trap{"spc"} == $STAT_WARN) ! { ! set_op_status ($group, $service, $STAT_WARN); ! ! # } elsif ($trap{"spc"} == $STAT_HEARTBEAT) { ! # set_op_status ($group, $service, $STAT_OK); ! # $sref->{"_last_uptrap"} = $time; ! # $noalert++; ! } ! ! else ! { ! syslog ('err', "trap received from $fromip" . ! " for undefined type $trap{typ} $trap{spc} $trap{grp}"); ! return; ! } ! } ! ! shift @last_failures if (@last_failures > $CF{"MAX_KEEP"}); ! ! 
push @last_failures, "$trap{grp} $trap{svc}" . ! " $tm $trap{typ} $trap{spc} $trap{sum}"; ! ! if ($sref->{"depend"} ne "" && ! $sref->{"dep_behavior"} eq "a") ! { ! dep_ok ($sref); ! } ! ! # ! # if trap is FAIL, send an alert ! # if trap is OK send upalert ! # upalert only gets sent if an upalert for this ! # trap is actually defined, and if the ! # upalertafter config is satisfied ! # ! ! $flags = 0; ! ! if ( $trap{"spc"} == $STAT_OK ) { ! ! $flags = $FL_UPALERT; ! ! if ( defined($sref->{"_upalert"}) ) { ! if ( $tmnow - $sref->{"_first_failure"} < ! $sref->{"upalertafter"}) ! { ! $noalert++; ! } ! } ! else { ! $noalert++; ! } ! } ! #### else just fall through and send alert ! do_alert ( ! $trap{"grp"}, ! $trap{"svc"}, ! $trap{"sum"} . $trap{"dtl"}, ! $trap{"sta"}, ! $FL_TRAP | $flags, ! ) unless ($noalert); if( defined($sref->{"_intended"}) ) --- 3938,3951 ---- } syslog ('info', "trap $trap{typ} $trap{spc} from " . "$fromip for $trap{grp} $trap{svc}, status $trap{sta}"); ! debug (1, "trap type=$trap{typ} spc=$trap{spc} from " . ! "$fromip grp=$trap{grp} svc=$trap{svc}, sta=$trap{sta}\n"); ! $sref->{"_trap_duration_timer"} = $sref->{"trapduration"} ! if ($sref->{"trapduration"}); ! process_event ("t", $trap{"grp"}, $trap{"svc"}, $trap{"sta"}, "$trap{sum}\n$trap{dtl}"); if( defined($sref->{"_intended"}) ) *************** *** 4033,4046 **** $sref->{"_trap_timer"} = $sref->{"traptimeout"}; $sref->{"_failure_count"}++; $sref->{"_last_failure"} = $tmnow; ! $sref->{"_first_failure"} = $tmnow if ($sref->{"_op_status"} != $STAT_FAIL); set_op_status ($group, $service, $STAT_FAIL); $sref->{"_last_summary"} = "trap timeout"; ! $sref->{"_last_detail"} = ""; shift @last_failures if (@last_failures > $CF{"MAX_KEEP"}); push @last_failures, "$group $service $tm $sref->{_last_summary}"; syslog ('crit', "failure for $last_failures[-1]"); ! 
do_alert ($group, $service, undef, undef, $FL_TRAPTIMEOUT); } --- 3968,3990 ---- $sref->{"_trap_timer"} = $sref->{"traptimeout"}; $sref->{"_failure_count"}++; + $sref->{"_consec_failures"}++; $sref->{"_last_failure"} = $tmnow; ! if ($sref->{"_op_status"} == $STAT_OK || ! $sref->{"_op_status"} == $STAT_UNKNOWN || ! $sref->{"_op_status"} == $STAT_UNTESTED) ! { ! $sref->{"_first_failure"} = $tmnow; ! } set_op_status ($group, $service, $STAT_FAIL); $sref->{"_last_summary"} = "trap timeout"; ! $sref->{"_last_detail"} = "trap timeout after " . $sref->{"traptimeout"} . "s at " . localtime ($tmnow) . "\n"; shift @last_failures if (@last_failures > $CF{"MAX_KEEP"}); push @last_failures, "$group $service $tm $sref->{_last_summary}"; syslog ('crit', "failure for $last_failures[-1]"); ! do_alert ($group, $service, "$sref->{_last_summary}\n$sref->{_last_detail}", ! 0, $FL_TRAPTIMEOUT); ! ! $sref->{"_failure_output"} = "$sref->{_last_summary}\n$sref->{_last_detail}"; } *************** *** 4656,4664 **** } - my $t; - $t = "-u" if ($args{"flags"} & $FL_UPALERT); - $t = "-T" if ($args{"flags"} & $FL_TRAP); - $t = "-O" if ($args{"flags"} & $FL_TRAPTIMEOUT); - my @execargs = ( $alert, --- 4600,4603 ---- *************** *** 4669,4675 **** ); ! if ($t) { ! push @execargs, $t; ! } if ($args{"args"} ne "") { --- 4608,4614 ---- ); ! push (@execargs, "-u") if ($args{"flags"} & $FL_UPALERT); ! push (@execargs, "-T") if ($args{"flags"} & $FL_TRAP); ! push (@execargs, "-O") if ($args{"flags"} & $FL_TRAPTIMEOUT); if ($args{"args"} ne "") { |
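The core idea of the commit above, that monitor exits and traps now go through one shared routine and are judged only by their exit value, can be illustrated with a minimal sketch. This is not the code from mon itself: the %service hash and the "ok"/"fail" strings are simplified stand-ins for mon's %watch structure and $STAT_* constants, and alerting, dependency checks and downtime logging are left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for mon's per-service state.
my %service = (
    _op_status       => "untested",
    _consec_failures => 0,
);

# Simplified process_event: $type is "m" (monitor) or "t" (trap).
# Both types are handled the same way: a nonzero exit value is a
# failure, zero is a success.
sub process_event {
    my ($sref, $type, $exitval, $output) = @_;

    # the first line of the output is the summary, the rest is detail
    my ($summary, $detail) = split (/\n/, $output, 2);

    $sref->{_exitval} = $exitval;

    if ($exitval) {
        $sref->{_consec_failures}++;
        $sref->{_op_status} = "fail";
    } else {
        $sref->{_consec_failures} = 0;
        $sref->{_op_status} = "ok";
    }

    $sref->{_last_output}  = $output;
    $sref->{_last_summary} = $summary;
    $sref->{_last_detail}  = $detail;

    return $sref->{_op_status};
}

# A failing monitor exit and a successful trap take the same path.
my $after_monitor = process_event (\%service, "m", 1, "host1\nping timed out\n");
my $after_trap    = process_event (\%service, "t", 0, "host1 ok\n");
print "$after_monitor $after_trap\n";
```

In the real daemon the type argument still matters in a few places (for example, only monitors switch to failure_interval), but the success/failure accounting is shared.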
From: Jim T. <tr...@us...> - 2004-06-28 20:43:06
|
Update of /cvsroot/mon/mon/mon.d
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv1117
Modified Files:
  Tag: mon-1-0-0pre1
  reboot.monitor
Log Message:
added --community to set the snmp community
Index: reboot.monitor
===================================================================
RCS file: /cvsroot/mon/mon/mon.d/reboot.monitor,v
retrieving revision 1.1.1.1
retrieving revision 1.1.1.1.2.1
diff -C2 -d -r1.1.1.1 -r1.1.1.1.2.1
*** reboot.monitor 9 Jun 2004 05:18:04 -0000 1.1.1.1
--- reboot.monitor 28 Jun 2004 20:42:58 -0000 1.1.1.1.2.1
***************
*** 7,11 ****
  # options:
  #
! # reboot.monitor --statefile=filename --dir=dir host1 host2...
  #
  # Since this is scheduled from mon, it must maintain state between
--- 7,11 ----
  # options:
  #
! # reboot.monitor --statefile=filename --dir=dir [--community=com] host1 host2...
  #
  # Since this is scheduled from mon, it must maintain state between
***************
*** 47,51 ****
  ($ME = $0) =~ s-.*/--;
! GetOptions (\%opt, "statefile=s", "dir=s", "verbose");
  $STATEDIR = $opt{"dir"} ? $opt{"dir"}
--- 47,51 ----
  ($ME = $0) =~ s-.*/--;
! GetOptions (\%opt, "statefile=s", "dir=s", "community=s", "verbose");
  $STATEDIR = $opt{"dir"} ? $opt{"dir"}
***************
*** 56,59 ****
--- 56,60 ----
  $STATE = "$STATEDIR/$STATEFILE";
+ $COMM = $opt{"community"} || "public";
  die "$ME: reboot state dir $STATEDIR does not exist\n"
***************
*** 90,94 ****
  foreach $host (@ARGV) {
! if (!defined($s = new SNMP::Session (DestHost => $host, "Version" => 2))) {
  print "reboot.monitor: cannot create SNMP session to $host\n";
  next;
--- 91,96 ----
  foreach $host (@ARGV) {
! if (!defined($s = new SNMP::Session (DestHost => $host,
! Community => $COMM, "Version" => 2))) {
  print "reboot.monitor: cannot create SNMP session to $host\n";
  next;
|
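The option handling added in this commit can be tried in isolation with a small sketch. It uses the same option specification and the same "public" default as the diff, but the parse_args wrapper and the sample arguments are made up for illustration, and no SNMP session is opened.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse a reboot.monitor-style argument list. Returns the SNMP
# community (defaulting to "public", as in the diff) followed by
# the remaining host arguments. parse_args is a hypothetical helper.
sub parse_args {
    my @args = @_;
    my %opt;

    GetOptionsFromArray (\@args, \%opt,
        "statefile=s", "dir=s", "community=s", "verbose")
        or die "bad options\n";

    my $comm = $opt{"community"} || "public";
    return ($comm, @args);
}

my ($comm, @hosts)    = parse_args ("--community=secret", "host1", "host2");
my ($defcomm, @rest)  = parse_args ("host3");
print "community=$comm hosts=@hosts\n";
print "community=$defcomm hosts=@rest\n";
```

The community string returned here is what the patched monitor passes as the Community argument when it creates each SNMP session.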
From: Jim T. <tr...@us...> - 2004-06-28 20:10:31
|
Update of /cvsroot/mon/mon
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv26610
Modified Files:
  Tag: mon-1-0-0pre1
  mon
Log Message:
fixed a problem where traps would not reset the _trap_timer, thus
making a trap timeout unpreventable
Index: mon
===================================================================
RCS file: /cvsroot/mon/mon/mon,v
retrieving revision 1.4.2.6
retrieving revision 1.4.2.7
diff -C2 -d -r1.4.2.6 -r1.4.2.7
*** mon 24 Jun 2004 14:18:25 -0000 1.4.2.6
--- mon 28 Jun 2004 20:10:21 -0000 1.4.2.7
***************
*** 391,395 ****
  if ($sref->{"_trap_timer"} <= 0 && $tm - $sref->{"_last_uptrap"} >
  $sref->{"traptimeout"}) {
- $sref->{"_trap_timer"} = $sref->{"traptimeout"};
  handle_trap_timeout ($group, $service);
  }
--- 391,394 ----
***************
*** 3876,3879 ****
--- 3875,3886 ----
  $sref->{"_last_summary"} = $trap{"sum"};
+ #
+ # a trap recieved resets the trap timeout timer
+ #
+ if (exists $sref->{"traptimeout"})
+ {
+ $sref->{"_trap_timer"} = $sref->{"traptimeout"};
+ }
+
  if ($intended)
  {
***************
*** 4024,4027 ****
--- 4031,4035 ----
  my $sref = \%{$watch{$group}->{$service}};
+ $sref->{"_trap_timer"} = $sref->{"traptimeout"};
  $sref->{"_failure_count"}++;
  $sref->{"_last_failure"} = $tmnow;
|
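The timer behavior this fix is after can be sketched as follows: the countdown is rearmed both when a trap arrives and when a timeout is handled. The tick and trap_received helpers and the 30-second value are hypothetical, and the additional _last_uptrap comparison present in this revision of mon is omitted for brevity.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical per-service state; only the two fields the diff touches.
my %service = (traptimeout => 30, _trap_timer => 30);

# Called on each scheduler pass with the seconds elapsed since the
# previous pass. Returns 1 if the service timed out, 0 otherwise.
sub tick {
    my ($sref, $elapsed) = @_;

    $sref->{_trap_timer} -= $elapsed;
    return 0 if ($sref->{_trap_timer} > 0);

    # as in handle_trap_timeout after this commit: rearm the timer
    $sref->{_trap_timer} = $sref->{traptimeout};
    return 1;
}

# Called when a trap arrives: reset the countdown so that a steady
# stream of traps prevents the timeout.
sub trap_received {
    my ($sref) = @_;

    $sref->{_trap_timer} = $sref->{traptimeout}
        if (exists $sref->{traptimeout});
}

my @timeouts;
push @timeouts, tick (\%service, 20);   # 10s left, no timeout
trap_received (\%service);              # countdown back to 30s
push @timeouts, tick (\%service, 20);   # 10s left, no timeout
push @timeouts, tick (\%service, 20);   # expired, timeout reported
print "@timeouts\n";
```

Without the reset in trap_received, the second tick would already report a timeout even though a trap had arrived, which is the problem the log message describes.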
From: Jim T. <tr...@us...> - 2004-06-28 14:38:09
|
Update of /cvsroot/mon/mon/mon.d In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv23926/mon.d Added Files: Tag: mon-1-0-0pre1 snmpvar.monitor Log Message: assimilated snmpvar.monitor version 1.6.0 from contrib --- NEW FILE: snmpvar.monitor --- #!/usr/bin/perl # ############################################################################ ## ## ## snmpvar.monitor Version 1.6.0 ## ## 2003-05-21 ## ## Copyright (C) 2000-2003 ## ## Peter Holzleitner (pe...@ho...) ## ## ## ############################################################################ # # A MON plug-in monitor to test numeric values retrieved via SNMP # against configured limits. # # Arguments: # # [--community=cmn] [--group=groups] [--timeout=n] [--retries=n] [--debug] # [--varconf=filename] [--config=filename] [--snmpconf=filename] # [--mibs='mib1:mib2:mibn'] [--list[=linesperpage]] host [host ...] # # For every host name passed on the command line, snmpval.monitor looks # up the list of variables and corresponding limits in the configuration # file (snmpmon.cf). # # If a --groups option is present, only those variables are checked # which are in one of the specified groups. To specify more than one # group, separate group names with commas. You can also exclude groups # by prefixing the group name(s) with '-'. Don't mix in- and exclusion. # Examples: # --groups=Power only vars in the Power group # --groups=Power,Env vars in the Power or Env group # --groups=-Power,-Env all vars except those in Power or Env groups # --groups=Power,-Env won't work (only the exclusions) # # For every such variable, it looks up the OID, description etc. from # the variable definition file (snmpvar.def). # # This monitor looks for configuration files in the current directory, # in /etc/mon and /usr/lib/mon/etc. Command line option --varconf # overrides the location of the variable definition file, option # --config sets the configuration file name. # # For formats, please refer to the sample configuration files. 
# # By default, this monitor does not load any MIB, and OIDs are specified # numerically in the configuration files. Use the option --mibs # to force certain MIBs to be loaded. # # When invoked with the --list option, the output format is changed # into a more human-readable form used to check and troubleshoot the # configuration. This option must not be used from within MON. # # # Exit values: # 0 if everything is OK # 1 if any observed value is outside the specified interval # 2 in case of an SNMP error (e.g. no response from host) # # Requirements: # # UCD SNMP library (3.6.2 or higher) # G.S. Marzot's Perl SNMP module (from CPAN). # # # License: # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software Foundation, # Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA # # # History: # # 1.6.0 21 May 2003 Equal and Non-Equal tests in addition to < and > (P.H.) # 1.5.1 09 Apr 2003 change \w to ^\s in FriendlyName detection to allow # indices containing "." like IP Addresses # 1.5.0 04 Dec 2002 per-host SNMP options (Ryan VanderBijl + P.H.) # --list shows all hosts if none specified (Ryan V.) # more output with --debug option (P.H.) # 1.4.0 10 Sep 2002 extended SNMP configuration (Dan Urist) # 1.3.0 15 May 2002 added GROUP option (Dave Alden) # added DEFAULTGROUP, group exclusion (P.H.) # decimals OK in limits (britcey) # added DefaultMin/Max (P.H.) 
# 1.2.0 21 Mar 2001 added FriendlyName option (P.H.) # 1.1.2 10 Jul 2000 fixed -l output with plausibility checks (P.H.) # 1.1.1 04 Apr 2000 automatically add dot between OID and index (P.H.) # 1.1.0 30 Mar 2000 added upper and lower plausibility limits (P.H.) # 1.0.1 24 Jan 2000 bugfix: reading Decode definitions (P.H.) # 1.0.0 13 Jan 2000 initial release (P.H.) # use SNMP; use Getopt::Long; use Sys::Syslog; sub ReadVarDef; sub ReadVarList; sub ReadSNMPConf; sub GetSNMPArgs; sub Decode; GetOptions (\%opt, "config=s", "groups=s", "varconf=s", "snmpconf=s", "community=s", "port=i", "timeout=i", "retries=i", "mibs=s", "list:i", "debug"); die "no host arguments\n" if ( (@ARGV == 0) && !exists($opt{'list'}) ); $RET = 0; @ERRS = (); @HOSTS = (); ($^O eq "linux" || $^O eq "openbsd") && Sys::Syslog::setlogsock('unix'); openlog('snmpvar.mon', 'cons,pid', 'daemon'); # find config files $CF1 = '/etc/mon'; $CF2 = '/usr/lib/mon/etc'; $VARCONF_FILE = (-d $CF1 ? $CF1 : $CF2) . '/snmpvar.def'; $MONCONF_FILE = (-d $CF1 ? $CF1 : $CF2) . '/snmpvar.cf'; $SNMPCONF_FILE = (-d $CF1 ? $CF1 : $CF2) . '/snmpopt.cf'; # pick up local config files for testing $VARCONF_FILE = './snmpvar.def' if -e './snmpvar.def'; $MONCONF_FILE = './snmpvar.cf' if -e './snmpvar.cf'; $SNMPCONF_FILE = './snmpopt.cf' if -e './snmpopt.cf'; # commandline ovverides ini any case $VARCONF_FILE = $opt{'varconf'} || $VARCONF_FILE; $MONCONF_FILE = $opt{'config'} || $MONCONF_FILE; $SNMPCONF_FILE = $opt{'snmpconf'} || $SNMPCONF_FILE; print STDERR "\nsnmpvar.monitor: configured from $VARCONF_FILE, $MONCONF_FILE\n\n" if $opt{'debug'}; ReadVarDef($VARCONF_FILE) || die "could not read variable definition: $!\n"; ReadVarList($MONCONF_FILE) || die "could not read config: $!\n"; ReadSNMPConf($SNMPCONF_FILE); # this is optional stuff # load only the necessary MIBs: $ENV{'MIBS'} = $opt{'mibs'} || ''; $FORMAT_LINES_PER_PAGE = $opt{'list'} || 25; $GROUPS = "," . $opt{'groups'} . 
"," if ($opt{'groups'}); @ARGV = keys %VARLIST if ( exists($opt{'list'}) && @ARGV == 0 ); foreach $host (@ARGV) { $VARS = $VARLIST{$host}; # %VARLIST{$host}{$var}{'MIN'|'MAX'} next unless $VARS; my $SNMPARGS = &GetSNMPArgs($host); if($opt{'debug'}) { print STDERR "$host SNMP Parameters:\n"; foreach $so (keys %SNMPARGS) { print " $so = $SNMPARGS{$so}\n"; } print STDERR "\n"; } if (!defined($s = new SNMP::Session(DestHost => $host, %SNMPARGS))) { $RET = 2 unless $RET > 2; $errmsg = "could not create session to $host: " . $SNMP::Session::ErrorStr; print STDERR "$errmsg\n" if $opt{'debug'}; push (@HOSTS, $host); push (@ERRS, $errmsg); next; } @HE = (); # list of errors for THIS host foreach $var (sort keys %$VARS) { # skip vars that are not in selected group, if any: if($GROUPS ne '') { $g = $$VARS{$var}{'GROUP'}; # assigned group of this variable next if $GROUPS =~ /,-$g,/i; # excluded group next if !($GROUPS =~ /-/) && !($GROUPS =~ /,$g,/i); # included group } $oid = $VARDEF{$var}{'OID'}; @IDX = split(/ +/, $$VARS{$var}{'IDX'}); if(@IDX == ()) { @IDX = (''); } else { $oid .= '.' unless $oid =~ /.+\.$/; } foreach $i (@IDX) { $ioid = $oid . $i; $pi = $i ne '' ? " [$i]" : ''; $descr = $VARDEF{$var}{'DESCR'}; $fn = $FRIENDLYNAME{$host}{$var}{$i} || $VARDEF{$var}{'FNAME'}{$i}; $fn =~ s/^@/$descr /; $vardescr = $fn || $descr . $pi; $rawval = $s->get($ioid); if ($s->{ErrorNum}) { $RET = 2 unless $RET > 2; $errmsg = "error retrieving $host:$var$pi($ioid): " . $s->{ErrorStr}; print STDERR "$errmsg\n" if $opt{'debug'}; push (@HE, $errmsg); next; } $val = eval ($rawval . 
$VARDEF{$var}{'SCALE'}); $min = $$VARS{$var}{'MIN'}; $max = $$VARS{$var}{'MAX'}; $eq = $$VARS{$var}{'EQ'}; $neq = $$VARS{$var}{'NEQ'}; $minvalid = $$VARS{$var}{'MINVALID'}; $maxvalid = $$VARS{$var}{'MAXVALID'}; $stat = 'OK'; $DEC = $VARDEF{$var}{'DEC'}; $pval = Decode($DEC, $val); $pmin = Decode($DEC, $min); $pmax = Decode($DEC, $max); $peq = Decode($DEC, $eq); $pneq = Decode($DEC, $neq); $pmin = $pmax = $peq if defined($eq); $pmin = $pmax = '!' . $pneq if defined($neq); if(defined($minvalid) && ($val < $minvalid)) { $stat = 'INV<'; syslog('warning', "$host: $vardescr less than lower plausibility limit: $pval"); write if defined $opt{'list'}; next; } if(defined($maxvalid) && ($val > $maxvalid)) { $stat = 'INV>'; syslog('warning', "$host: $vardescr larger than upper plausibility limit: $pval"); write if defined $opt{'list'}; next; } if(defined($min) && ($val < $min)) { $stat = 'FAIL<'; push (@HE, "$vardescr LOW: $pval $VARDEF{$var}{'UNIT'} (<$pmin)"); } if(defined($max) && ($val > $max)) { $stat = 'FAIL>'; push (@HE, "$vardescr HIGH: $pval $VARDEF{$var}{'UNIT'} (>$pmax)"); } if(defined($eq) && ($val != $eq)) { $stat = 'FAIL<>'; push (@HE, "$vardescr: $pval $VARDEF{$var}{'UNIT'} (<> $peq)"); } if(defined($neq) && ($val == $neq)) { $stat = 'FAIL='; push (@HE, "$vardescr: $pval $VARDEF{$var}{'UNIT'} (== $pneq)"); } write if defined $opt{'list'}; } # foreach(index) } # foreach(var) if (@HE) { push (@HOSTS, $host); push (@ERRS, $host . ":\n" . 
join("\n", @HE)); $RET = 1 unless $RET > 1; # previous error level 2 takes precedence } } # foreach(host) # in case of list output, suppress error listing by exiting here: exit 0 if defined $opt{'list'}; if ($RET) { print "@HOSTS\n\n"; print join("\n", @ERRS), "\n"; } exit $RET; # ---------------------------------------------------------------------- # subroutines begin # ---------------------------------------------------------------------- # # decode enumerations # sub Decode { my ($D, $v) = @_; my $dv; return $v unless $D; # can only decode with valid decoder hash $dv = $$D{$v} || '?'; # look up value return "$dv($v)"; } # # read variable definitions from file # sub ReadVarDef { my ($f) = @_; my ($curvar, $keyword, $param); $curvar = ''; open (CF, $f) || return undef; while (<CF>) { next if (/^\s*#/ || /^\s*$/); chomp; /^\s*(\w*)\s*(.*)/; $keyword = $1; $param = $2; $curvar = $param if $keyword =~ /Variable/i; if($curvar ne '') { $VARDEF{$curvar}{'OID'} = $param if $keyword =~ /OID/i; $VARDEF{$curvar}{'DESCR'} = $param if $keyword =~ /Descr.*/i; $VARDEF{$curvar}{'UNIT'} = $param if $keyword =~ /Unit/i; $VARDEF{$curvar}{'SCALE'} = $param if $keyword =~ /Scale/i; $VARDEF{$curvar}{'DEFIDX'} = $param if $keyword =~ /DefaultIndex/i; $VARDEF{$curvar}{'DEFGRP'} = $param if $keyword =~ /DefaultGroup/i; $VARDEF{$curvar}{'DEFMIN'} = $param if $keyword =~ /DefaultMin/i; $VARDEF{$curvar}{'DEFMAX'} = $param if $keyword =~ /DefaultMax/i; $VARDEF{$curvar}{'DEFEQ'} = $param if $keyword =~ /DefaultEq/i; $VARDEF{$curvar}{'DEFNEQ'} = $param if $keyword =~ /DefaultNEq/i; $VARDEF{$curvar}{'DEFMINVAL'} = $param if $keyword =~ /DefaultMinValid/i; $VARDEF{$curvar}{'DEFMAXVAL'} = $param if $keyword =~ /DefaultMaxValid/i; if($keyword =~ /Decode/i) { $param =~ /\s*([^\s]+)\s+(.*)$/; $VARDEF{$curvar}{'DEC'}{$1} = $2; } if($keyword =~ /FriendlyName/i) { $param =~ /\s*([^\s]+)\s+(.*)$/; $VARDEF{$curvar}{'FNAME'}{$1} = $2; } } } # while(<CF>) close (CF); return 1; } # # read list of variables 
to be monitored # sub ReadVarList { my ($f) = @_; my ($curhost, $curvar, $var, $param); $curhost = ''; open (CF, $f) || return undef; while (<CF>) { next if (/^\s*#/ || /^\s*$/); chomp; if(/Host\s+(\S+)/i) { $curhost = $1; $curvar = ''; next; } if(/\s+SNMP\s+(\S+)\s+(.+)/i) { next unless $curhost; print "READVARLIST($curhost): SNMP: $1 $2\n"; $SNMP{$curhost}{lc $1} = $2; next; } if(/\s+FriendlyName\s+([^\s]+)\s+(.+)/i) { next unless $curhost; next unless $curvar; $FRIENDLYNAME{$curhost}{$curvar}{$1} = $2; next; } /^\s+(\S+)\s*(.*)$/; $curvar = $1; $param = $2; if($curhost) { $VARLIST{$curhost}{$curvar}{'MIN'} = $VARDEF{$curvar}{'DEFMIN'}; $VARLIST{$curhost}{$curvar}{'MIN'} = $1 if $param =~ /Min\s+([\d\.]+)/i; $VARLIST{$curhost}{$curvar}{'MAX'} = $VARDEF{$curvar}{'DEFMAX'}; $VARLIST{$curhost}{$curvar}{'MAX'} = $1 if $param =~ /Max\s+([\d\.]+)/i; $VARLIST{$curhost}{$curvar}{'EQ'} = $VARDEF{$curvar}{'DEFEQ'}; $VARLIST{$curhost}{$curvar}{'EQ'} = $1 if $param =~ /Eq\s+([\d\.]+)/i; $VARLIST{$curhost}{$curvar}{'NEQ'} = $VARDEF{$curvar}{'DEFNEQ'}; $VARLIST{$curhost}{$curvar}{'NEQ'} = $1 if $param =~ /NEq\s+([\d\.]+)/i; $VARLIST{$curhost}{$curvar}{'MINVALID'} = $VARDEF{$curvar}{'DEFMINVAL'}; $VARLIST{$curhost}{$curvar}{'MINVALID'} = $1 if $param =~ /MinValid\s+([\d\.]+)/i; $VARLIST{$curhost}{$curvar}{'MAXVALID'} = $VARDEF{$curvar}{'DEFMAXVAL'}; $VARLIST{$curhost}{$curvar}{'MAXVALID'} = $1 if $param =~ /MaxValid\s+([\d\.]+)/i; $VARLIST{$curhost}{$curvar}{'IDX'} = $VARDEF{$curvar}{'DEFIDX'}; $VARLIST{$curhost}{$curvar}{'IDX'} = $1 if $param =~ /Index\s+(.+)$/i; $VARLIST{$curhost}{$curvar}{'GROUP'} = $VARDEF{$curvar}{'DEFGRP'}; $VARLIST{$curhost}{$curvar}{'GROUP'} = $1 if $param =~ /Group\s+(.+)$/i; } } # while(<CF>) close (CF); return 1; } sub ReadSNMPConf { my ($f) = @_; my $tag; my $val; if (-r $f) { print STDERR "\nsnmpvar.monitor: reading SNMP options from $f\n" if $opt{'debug'}; open(SNMPCONF, $f) or die "Huh? 
$f readable but open fails?"; while(<SNMPCONF>) { chomp; next if (/^\s*#/ || /^\s*$/); next unless /^\s*(\S+)\s*=\s*(.+)$/; $SNMPDEF{ lc $1 } = $2; print STDERR "snmpvar.monitor: $1 = $2\n" if $opt{'debug'}; } close SNMPCONF; } print STDERR "\n\n" if $opt{'debug'}; } sub GetSNMPArgs { my ($host) = @_; my $SNMPARGS; # Common options $SNMPARGS{Version} = $SNMP{$host}{version} || $SNMPDEF{version} || 1; $SNMPARGS{RemotePort} = $SNMP{$host}{port} || $opt{'port'} || $SNMPDEF{remoteport} || 161; $SNMPARGS{Retries} = $SNMP{$host}{retries} || $opt{'retries'} || $SNMPDEF{retries} || 8; $SNMPARGS{Timeout} = $SNMP{$host}{timeout} || $opt{'timeout'} || $SNMPDEF{timeout} || 5; # some people may prefer microseconds, but small values should mean seconds: $SNMPARGS{Timeout} *= 1000000 if $SNMPARGS{Timeout} < 1000; # SNMP v.1/v.2 options if ($SNMPARGS{Version} < 3) { $SNMPARGS{Community} = $SNMP{$host}{community} || $opt{'community'} || $SNMPDEF{community} || 'public'; } # SNMP v.3 options if ($SNMPARGS{Version} == 3) { $SNMPARGS{SecName} = $SNMP{$host}{secname} || $SNMPDEF{secname} || 'initial'; $SNMPARGS{SecLevel} = $SNMP{$host}{seclevel} || $SNMPDEF{seclevel} || 'noAuthNoPriv'; $SNMPARGS{AuthPass} = $SNMP{$host}{authpass} || $SNMPDEF{authpass} || ''; $SNMPARGS{SecEngineId} = $SNMP{$host}{secengineid} || $SNMPDEF{secengineid} || ''; $SNMPARGS{ContextEngineId} = $SNMP{$host}{contextengineid} || $SNMPDEF{contextengineid} || ''; $SNMPARGS{Context} = $SNMP{$host}{context} || $SNMPDEF{context} || ''; $SNMPARGS{AuthProto} = $SNMP{$host}{authproto} || $SNMPDEF{authproto} || 'MD5'; $SNMPARGS{PrivProto} = $SNMP{$host}{privproto} || $SNMPDEF{privproto} || 'DES'; $SNMPARGS{PrivPass} = $SNMP{$host}{privpass} || $SNMPDEF{privpass} || ''; } return %SNMPARGS; } format STDOUT_TOP = Host Variable min value max stat ---------------------------------------------------------------------------- . 
format STDOUT = @<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<< @>>>>> @>>>>>> @<<< @>>>>> @<<<<< $host, $vardescr, $pmin, $pval, $VARDEF{$var}{'UNIT'}, $pmax, $stat . |
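The limit checks inside the per-index loop of the monitor above can be read in isolation. The following is a minimal sketch, not the committed code: the function name classify_value and the hash layout of its second argument are assumptions made for illustration, while the status strings and the order of the checks follow the loop body shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: classify one measured value against optional limits.
# $lim is a hash ref that may hold MIN, MAX, EQ, NEQ, MINVALID, MAXVALID.
# Returns the status string the monitor shows in its "stat" column.
sub classify_value {
    my ($val, $lim) = @_;

    # Plausibility limits are checked first: an implausible reading is
    # only logged as a warning by the monitor and never counted as a failure.
    return 'INV<' if defined $lim->{MINVALID} && $val < $lim->{MINVALID};
    return 'INV>' if defined $lim->{MAXVALID} && $val > $lim->{MAXVALID};

    # Alarm limits: as in the monitor, a later check overwrites an earlier one.
    my $stat = 'OK';
    $stat = 'FAIL<'  if defined $lim->{MIN} && $val <  $lim->{MIN};
    $stat = 'FAIL>'  if defined $lim->{MAX} && $val >  $lim->{MAX};
    $stat = 'FAIL<>' if defined $lim->{EQ}  && $val != $lim->{EQ};
    $stat = 'FAIL='  if defined $lim->{NEQ} && $val == $lim->{NEQ};
    return $stat;
}

# Example limits for a fan speed variable (values chosen for illustration).
my %fan = (MIN => 1000, MAX => 5000, MAXVALID => 10000);
print classify_value(3200,  \%fan), "\n";   # within limits
print classify_value(400,   \%fan), "\n";   # below Min
print classify_value(65535, \%fan), "\n";   # above MaxValid, treated as garbage
```

With these limits the three calls print OK, FAIL< and INV> in that order. Keeping the plausibility test ahead of the alarm test is what lets a garbage reading produce a syslog warning instead of an alert.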
From: Jim T. <tr...@us...> - 2004-06-28 14:38:09
|
Update of /cvsroot/mon/mon/etc In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv23926/etc Added Files: Tag: mon-1-0-0pre1 snmpopt.cf snmpvar.cf snmpvar.def Log Message: assimilated snmpvar.monitor version 1.6.0 from contrib --- NEW FILE: snmpopt.cf --- # # snmpopt.cf # # This optional file is used to pass parameters to the SNMP library, # used by snmpvar.monitor. # # (default values shown) # common options # Version = 1 # Port = 161 # Retries = 8 # Timeout = 5 # SNMPv1/v2 options # Community = public # SNMPv3 options # SecName = initial # SecLevel = noAuthNoPriv # AuthPass = # SecEngineId = # ContextEngineId = # Context # AuthProto = MD5 # PrivProto = DES # PrivPass = --- NEW FILE: snmpvar.cf --- # # snmpvar.cf # # this is a sample configuration file for snmpvar.monitor. you # must configure this to meet your own needs. # # list of variables and ranges to be monitored by snmpvar.monitor # refers to variables defined in snmpvar.def # # a Dell server, RAID instrumentation only: Host nov-1 MEGARAID0_LOGICAL_STATUS Min 2 Max 2 Index 0 MEGARAID0_PHYS_STATUS Min 3 Max 3 Index 0 1 2 3 4 5 # a Compaq server: Host nov-2 # has 1 RAID volume, 6 physical disks CPQARRAY_LOG_STATUS Index 1 CPQARRAY_PHYS_STATUS Index 0 1 2 3 4 5 PROLIANT_TEMP_STATUS PROLIANT_PSU_STATUS PROLIANT_FAN_STATUS Index 2 4 5 # a Dell server running NT 4 with perfmib Host ntserv1 WINNT_MEM_COMMITTED Max 700 WINNT_LOGICAL_C_FREE Min 50 WINNT_LOGICAL_D_FREE Min 50 MEGARAID_C0_LOGICAL_STATUS Index 0 MEGARAID_C0_CH0_PHYS_STATUS Index 0 1 2 3 4 PE4300_TEMP_CPU PE4300_TEMP PE4300_5V_CURRENT PE4300_12V_CURRENT PE4300_3V_CURRENT PE4300_FAN_CPU_RPM PE4300_FAN_DISK_RPM PE4X00_PSU_STATUS # an APC UPS (with SNMP adapter or through controlling server running PowerNet) Host srvups1 APCUPS_OUTPUT_STAT APCUPS_LINEVOLT_MAX APCUPS_LINEVOLT_MIN # here, we override the default maximum specified in snmpvar.def: APCUPS_LOAD Max 75 APCUPS_BATT_TEMP # these are the MeasureUPS parameters (external sensor) APCUPS_EXT_TEMP Max 
32 APCUPS_EXT_HUMID Min 10 Max 90 APCUPS_EXT_SWITCH_STAT Min 2 Max 2 Index 1 FriendlyName 1 Diesel Generator Status # an HP ProCurve 4000 switch Host hp4000-servers HP_ICF_FAN_STATE # has redundant PSU HP_ICF_PSU_STATE Index 2 3 IF_OPERSTAT Index 1 3 17 25 65 73 FriendlyName 1 A1: Server LAUREL FriendlyName 3 A3: Server HARDY FriendlyName 17 C1: Server TITAN (1000SX) FriendlyName 25 D1: Server MERCURY (1000SX) FriendlyName 65 I1: Switch D1017:G1 (1000TX) FriendlyName 73 J1: Switch SERVERS1:H1 (1000SX) # an IBM8272 Token Ring switch Host trsw1 IBM8272_LINK_STATE Min 1 Max 1 Index 1 2 3 4 5 6 7 9 11 12 13 14 15 16 17 18 21 22 23 24 FriendlyName 1 1: Floor 10 Ring FriendlyName 2 2: Floor 12 Ring FriendlyName 3 3: Floor 13 Ring FriendlyName 9 9: Server NOV-1 FriendlyName 13 13: Server ntserv1 FriendlyName 18 18: Switch 2 Interlink Fibre IBM8272_TEMP_SYS Min 1 Max 1 # a cisco router Host cisco1 IF_OPERSTAT Index 1 2 3 4 FriendlyName 1 1: Internal Ethernet FriendlyName 2 2: Internal TokenRing FriendlyName 3 3: Firewall BGP_PEERSTATE Index 10.1.1.1 10.2.1.1 FriendlyName 10.1.1.1 iBGP Session: myotherrouter FriendlyName 10.2.1.1 eBGP Session: Provider X CISCO_TEMP_STATE # a Nokia IP series firewall appliance Host firewall IF_OPERSTAT Index 1 2 3 FriendlyName 1 1: Leased Line FriendlyName 2 2: DMZ FriendlyName 3 3: Internal Router NOKIA_IP_CHASSIS_TEMP NOKIA_IP_FAN_STAT NOKIA_IP_PSU_STAT NOKIA_IP_PSU_TEMP # a Linux server with some private SNMP extensions Host mailserver LINUX_MAILQUEUE Max 80 --- NEW FILE: snmpvar.def --- # # sample snmpvar.def. you should configure this to meet your # own needs. # # Definitions of variables to be monitored using snmpvar.monitor # # # generic host (router/switch/...) 
Variable IF_OPERSTAT OID .1.3.6.1.2.1.2.2.1.8 Description ifOperStatus DefaultEQ 1 Decode 1 up Decode 2 down Decode 3 testing Decode 4 unknown Decode 5 dormant # generic router Variable BGP_PEERSTATE OID .1.3.6.1.2.1.15.3.1.2 Description bgpPeerState DefaultEQ 6 Decode 1 idle Decode 2 connect Decode 3 active Decode 4 opensent Decode 5 openconfirm Decode 6 established # generic Host Resources MIB implementation Variable HR_DEVICE_STATUS OID .1.3.6.1.2.1.25.3.2.1.5. Description Device Status DefaultEQ 2 Decode 1 unknown Decode 2 running Decode 3 warning Decode 4 testing Decode 5 down # some variables from a Windows NT "perfmib" configuration # see ms-perfmib directory for NT side configuration Variable WINNT_CPU_TOTAL OID .1.3.6.1.4.1.311.1.1.3.1.1.1.9.0 Description CPU Load Total Unit % Variable WINNT_CPU_SYS OID .1.3.6.1.4.1.311.1.1.3.1.1.1.11.0 Description CPU Load System Unit % Variable WINNT_MEM_COMMITTED OID .1.3.6.1.4.1.311.1.1.3.1.1.2.2.0 Description Committed Memory Scale / 1024 / 1024 # the Scale expression is used as (eval($rawval . $scale)) Unit MB Variable WINNT_MEM_AVAILABLE OID .1.3.6.1.4.1.311.1.1.3.1.1.2.1.0 Description Available Memory Scale / 1024 /1024 Unit MB Variable WINNT_LOGICAL_C_FREE OID .1.3.6.1.4.1.311.1.1.3.1.1.6.1.4.6.48.58.48.58.67.58 Description Free Disk Space on drive C Unit MB Variable WINNT_LOGICAL_D_FREE OID .1.3.6.1.4.1.311.1.1.3.1.1.6.1.4.6.48.58.48.58.68.58 Description Free Disk Space on drive D Unit MB # Dell PowerEdge 2550 Server Instrumentation Variable PE2550_FAN_SYS_RPM OID .1.3.6.1.4.1.674.10892.1.700.12.1.6.1. Description System Fan Speed DefaultIndex 1 2 3 Unit rpm DefaultMin 600 DefaultMax 6000 DefaultMaxValid 10000 DefaultGroup Environment Variable PE2550_FAN_DISK_RPM OID .1.3.6.1.4.1.674.10892.1.700.12.1.6.1.4 Description Disk Fan Speed Unit rpm DefaultMin 6000 DefaultMax 14000 DefaultMaxValid 15000 DefaultGroup Environment Variable PE2550_TEMP_CPU OID .1.3.6.1.4.1.674.10892.1.700.20.1.6.1. 
Description CPU Temperature DefaultIndex 1 2 Unit C Scale / 10.0 DefaultMax 50 DefaultGroup Environment Variable PE2550_TEMP OID .1.3.6.1.4.1.674.10892.1.700.20.1.6.1. Description Temperature DefaultIndex 3 4 5 FriendlyName 3 Motherboard FriendlyName 4 Backplane 1 FriendlyName 5 Backplane 2 Unit C Scale / 10.0 DefaultMax 40 DefaultGroup Environment Variable PE2550_PSU_STATUS DefaultIndex 1 2 OID .1.3.6.1.4.1.674.10892.1.600.12.1.5.1. Description Power Supply Status DefaultEQ 3 Decode 1 other Decode 2 unknown Decode 3 OK Decode 4 noncrit Decode 5 critical Decode 6 nonrecoverable DefaultGroup Power # Dell PowerEdge 4300 Server Instrumentation Variable PE4300_TEMP_CPU OID .1.3.6.1.4.1.674.10891.300.1.5.2.2.1. Description CPU Temperature DefaultIndex 1 2 Scale / 10.0 Unit C DefaultMax 40 DefaultGroup Environment Variable PE4300_TEMP OID .1.3.6.1.4.1.674.10891.300.1.5.2.2.1. Description Temperature DefaultIndex 3 4 5 6 FriendlyName 3 @Motherboard FriendlyName 4 @Ambient FriendlyName 5 @Backplane 1 FriendlyName 6 @Backplane 2 Scale / 10.0 Unit C DefaultMax 40 DefaultGroup Environment Variable PE4300_5V_CURRENT OID .1.3.6.1.4.1.674.10891.303.1.5.2.5.1. Description DC Current (+5V) DefaultIndex 1 4 7 Scale / 1000.0 Unit A DefaultMax 25 DefaultMaxValid 100 DefaultGroup Power Variable PE4300_12V_CURRENT OID .1.3.6.1.4.1.674.10891.303.1.5.2.5.1. Description DC Current (+12V) DefaultIndex 2 5 8 Scale / 1000.0 Unit A DefaultMax 10 DefaultMaxValid 100 DefaultGroup Power Variable PE4300_3V_CURRENT OID .1.3.6.1.4.1.674.10891.303.1.5.2.5.1. Description DC Current (+3V) DefaultIndex 3 6 9 Scale / 1000.0 Unit A DefaultMax 10 DefaultMaxValid 100 DefaultGroup Power Variable PE4300_FAN_CPU_RPM OID .1.3.6.1.4.1.674.10891.301.1.5.2.3.1. 
Description CPU Fan Speed Unit rpm DefaultIndex 1 2 DefaultMin 1000 DefaultMax 5000 DefaultMaxValid 10000 DefaultGroup Environment # really the same as above, other index ranges only; different description # one could also make it an array and use FriendlyName in the .cf file Variable PE4300_FAN_DISK_RPM OID .1.3.6.1.4.1.674.10891.301.1.5.2.3.1. Description Disk Fan Speed Unit rpm DefaultIndex 3 4 5 DefaultMin 1000 DefaultMax 5000 DefaultMaxValid 10000 DefaultGroup Environment Variable PE4X00_PSU_STATUS DefaultIndex 1 2 3 OID .1.3.6.1.4.1.674.10891.304.1.4.2.6.1. Description Power Supply Status DefaultEQ 3 Decode 1 other Decode 2 unknown Decode 3 OK Decode 4 noncrit Decode 5 critical Decode 6 nonrecoverable DefaultGroup Power Variable PE4X00_EXT_DISK1_PSU_STATUS DefaultIndex 1 2 OID .1.3.6.1.4.1.674.10891.304.1.4.2.6.2. Description ExtStorage 1 PSU Status DefaultEQ 3 Decode 1 other Decode 2 unknown Decode 3 OK Decode 4 noncrit Decode 5 critical Decode 6 nonrecoverable DefaultGroup Power # Dell PowerEdge 6350 Server Instrumentation Variable PE6350_TEMP_CPU OID .1.3.6.1.4.1.674.10891.300.1.5.2.2.1. Description CPU Temperature DefaultIndex 1 2 3 4 Scale / 10.0 Unit C DefaultMax 55 DefaultGroup Environment Variable PE6350_TEMP OID .1.3.6.1.4.1.674.10891.300.1.5.2.2.1. Description Temperature DefaultIndex 5 6 7 FriendlyName 5 @Motherboard FriendlyName 6 @Ambient FriendlyName 7 @Backplane Scale / 10.0 Unit C DefaultMax 40 DefaultGroup Environment Variable PE6350_TEMP_EXT_DISK1 OID .1.3.6.1.4.1.674.10891.300.1.5.2.2.2.1 Description ExtStorage 1 Temperature Scale / 10.0 Unit C DefaultGroup Environment Variable PE6350_FAN_RPM OID .1.3.6.1.4.1.674.10891.301.1.5.2.3.1. Description Fan Speed DefaultIndex 1 2 3 4 Unit rpm DefaultMin 1000 DefaultMax 5000 DefaultMaxValid 10000 DefaultGroup Environment Variable PE6350_FAN_RPM_EXT_DISK1 OID .1.3.6.1.4.1.674.10891.301.1.5.2.3.2. 
Description ExtStorage 1 Fan Speed DefaultIndex 1 2 3 Unit rpm DefaultMin 1000 DefaultMax 5000 DefaultMaxValid 10000 DefaultGroup Environment # Dell PowerEdge 4200 Server Instrumentation Variable PE4200_TEMP_CPU OID .1.3.6.1.4.1.674.10891.300.1.5.2.2.1. Description CPU Temperature DefaultIndex 1 2 Scale / 10.0 Unit C DefaultMax 40 DefaultGroup Environment Variable PE4200_TEMP OID .1.3.6.1.4.1.674.10891.300.1.5.2.2.1. Description Temperature DefaultIndex 3 4 5 6 FriendlyName 3 @Ambient FriendlyName 4 @Panel FriendlyName 5 @Backplane Top FriendlyName 6 @Backplane Bottom Scale / 10.0 Unit C DefaultMax 35 DefaultGroup Environment Variable PE4200_PSU_5V_CURRENT OID .1.3.6.1.4.1.674.10891.303.1.5.2.5.1. Description DC Current (+5V) DefaultIndex 1 2 FriendlyName 1 @Top PSU FriendlyName 2 @Bottom PSU Scale / 1000.0 Unit A DefaultMax 10 DefaultMaxValid 50 DefaultGroup Power Variable PE4200_PSU_3V_CURRENT OID .1.3.6.1.4.1.674.10891.303.1.5.2.5.1. Description DC Current (+3.3V) DefaultIndex 3 4 FriendlyName 3 @Top PSU FriendlyName 4 @Bottom PSU Scale / 1000.0 Unit A DefaultMax 5 DefaultMaxValid 50 DefaultGroup Power Variable PE4200_PSU_12V_CURRENT OID .1.3.6.1.4.1.674.10891.303.1.5.2.5.1. Description DC Current (+12V) DefaultIndex 5 6 FriendlyName 5 @Top PSU FriendlyName 6 @Bottom PSU Scale / 1000.0 Unit A DefaultMax 10 DefaultMaxValid 50 DefaultGroup Power Variable PE4200_FAN_RPM OID .1.3.6.1.4.1.674.10891.301.1.5.2.3.1. Description Fan Speed Unit rpm DefaultIndex 1 3 4 5 # Fan #2 is a standby unit FriendlyName 1 @Chassis 1 FriendlyName 2 @Chassis 2 FriendlyName 3 @Chassis 3 FriendlyName 4 @Top PSU FriendlyName 5 @Bottom PSU DefaultMin 1000 DefaultMax 5000 DefaultMaxValid 10000 DefaultGroup Environment # AMI MegaRAID (aka Dell PERC) RAID controller instrumentation Variable MEGARAID_C0_LOGICAL_STATUS OID .1.3.6.1.4.1.3582.1.1.2.1.3.0. 
Description RAID Ctl0 Volume Status DefaultEQ 2 Decode 0 offline Decode 1 degraded Decode 2 normal Decode 3 initialize Decode 4 checkconsistency Variable MEGARAID_C1_LOGICAL_STATUS OID .1.3.6.1.4.1.3582.1.1.2.1.3.1. Description RAID Ctl1 Volume Status DefaultEQ 2 Decode 0 offline Decode 1 degraded Decode 2 normal Decode 3 initialize Decode 4 checkconsistency Variable MEGARAID_C0_CH0_PHYS_STATUS OID .1.3.6.1.4.1.3582.1.1.3.1.4.0.0. Description Ctl0Ch0 Phys Drive Status DefaultEQ 3 Decode 1 ready Decode 3 online Decode 4 failed Decode 5 rebuild Decode 6 hotspare Decode 20 nondisk Variable MEGARAID_C1_CH0_PHYS_STATUS OID .1.3.6.1.4.1.3582.1.1.3.1.4.1.0. Description Ctl1Ch0 Phys Drive Status DefaultEQ 3 Decode 1 ready Decode 3 online Decode 4 failed Decode 5 rebuild Decode 6 hotspare Decode 20 nondisk Variable MEGARAID_C1_CH1_PHYS_STATUS OID .1.3.6.1.4.1.3582.1.1.3.1.4.1.1. Description Ctl1Ch1 Phys Drive Status DefaultEQ 3 Decode 1 ready Decode 3 online Decode 4 failed Decode 5 rebuild Decode 6 hotspare Decode 20 nondisk # APC SmartUPS monitoring (using PowerNet SNMP agents or SNMP adapter boards) Variable APCUPS_LINEVOLT_MAX OID .1.3.6.1.4.1.318.1.1.1.3.2.2.0 Description Recent Max Line Voltage Unit V DefaultMax 245 DefaultGroup Power Variable APCUPS_LINEVOLT_MIN OID .1.3.6.1.4.1.318.1.1.1.3.2.3.0 Description Recent Min Line Voltage Unit V DefaultMin 205 DefaultGroup Power Variable APCUPS_LOAD OID .1.3.6.1.4.1.318.1.1.1.4.2.3.0 Description Output Load Unit % DefaultMax 90 DefaultGroup Power Variable APCUPS_BATT_TEMP OID .1.3.6.1.4.1.318.1.1.1.2.2.2.0 Description Battery Temperature Unit C DefaultMax 45 DefaultGroup Environment # external sensors connected to a MeasureUPS board Variable APCUPS_EXT_TEMP OID .1.3.6.1.4.1.318.1.1.2.1.1.0 Description Temperature Unit C DefaultGroup Environment Variable APCUPS_EXT_HUMID OID .1.3.6.1.4.1.318.1.1.2.1.2.0 Description Humidity Unit % DefaultMin 10 DefaultMax 90 DefaultGroup Environment Variable APCUPS_EXT_SWITCH_STAT OID 
.1.3.6.1.4.1.318.1.1.2.2.2.1.5 Description Contact Decode 1 unknown Decode 2 OK Decode 3 FAULT Variable APCUPS_OUTPUT_STAT OID .1.3.6.1.4.1.318.1.1.1.4.1.1.0 Description UPS Status DefaultEQ 2 Decode 1 unknown Decode 2 Online Decode 3 On Battery Decode 4 On Smart Boost Decode 5 Timed Sleeping Decode 6 Software Bypass Decode 7 Off Decode 8 Rebooting Decode 9 Switched Bypass Decode 10 Hardware Failure Bypass Decode 11 Sleeping Until Power Return Decode 12 On Smart Trim DefaultGroup Power # Compaq ProLiant Server Instrumentation Variable PROLIANT_TEMP_STATUS OID .1.3.6.1.4.1.232.6.2.6.3.0 Description Temperature Status DefaultEQ 2 Decode 1 Other Decode 2 OK Decode 3 Degraded Decode 4 FAILED DefaultGroup Environment Variable PROLIANT_FAN_STATUS OID .1.3.6.1.4.1.232.6.2.6.7.1.9.0. Description Fan Status DefaultEQ 2 Decode 1 Other Decode 2 OK Decode 3 Degraded Decode 4 FAILED DefaultGroup Environment Variable PROLIANT_PSU_STATUS OID .1.3.6.1.4.1.232.6.2.9.3.1.5.0. Description Power Supply Status DefaultIndex 1 2 DefaultEQ 1 Decode 1 OK Decode 2 Failure Decode 3 BIST Failure Decode 4 Fan Failure Decode 5 Temp Failure Decode 6 Interlock Open DefaultGroup Power Variable CPQARRAY_LOG_STATUS OID .1.3.6.1.4.1.232.3.2.3.1.1.4.1. Description RAID Volume Status DefaultIndex 1 DefaultEQ 2 Decode 1 Other Decode 2 OK Decode 3 FAILED Decode 4 Unconfigured Decode 5 Recovering Decode 6 Ready For Rebuild Decode 7 Rebuilding Decode 8 Wrong Drive Decode 9 Bad Connect Decode 10 Overheating Decode 11 Shutdown Decode 12 expanding Decode 13 Not Available Decode 14 Queued For Expansion Variable CPQARRAY_PHYS_STATUS OID .1.3.6.1.4.1.232.3.2.5.1.1.6.1. Description Phys Drive Status DefaultEQ 2 Decode 1 Other Decode 2 OK Decode 3 Failed Decode 4 Predictive Failure # IBM 8272 Token Ring switch Variable IBM8272_LINK_STATE OID .1.3.6.1.4.1.2.6.66.1.2.2.1.1.15. 
Description Link State DefaultEQ 1 Decode 1 up Decode 2 down Variable IBM8272_TEMP_SYS OID .1.3.6.1.4.1.2.6.66.1.2.1.2.11.0 Description Switch Temperature DefaultEQ 1 Decode 1 normal Decode 2 HIGH DefaultGroup Environment # Nokia IP series firewall appliance Variable NOKIA_IP_CHASSIS_TEMP OID .1.3.6.1.4.1.94.1.21.1.1.5.0 Description Chassis Temperature DefaultEQ 1 Decode 1 normal Decode 2 OVERTEMP DefaultGroup Environment Variable NOKIA_IP_FAN_STAT OID .1.3.6.1.4.1.94.1.21.1.2.1.1.2. Description Fan Status DefaultEQ 1 Decode 1 running Decode 2 DEAD DefaultGroup Environment Variable NOKIA_IP_PSU_STAT OID .1.3.6.1.4.1.94.1.21.1.3.1.1.3. Description PSU Status DefaultEQ 1 Decode 1 running Decode 2 DEAD DefaultGroup Environment Variable NOKIA_IP_PSU_TEMP OID .1.3.6.1.4.1.94.1.21.1.3.1.1.2. Description Chassis Temperature DefaultEQ 1 Decode 1 normal Decode 2 OVERTEMP DefaultGroup Environment # Mail Server (custom extension scripts in UCD SNMP agent) Variable LINUX_MAILQUEUE OID .1.3.6.1.4.1.2021.8.1.101.1 Description Mail Queue Length # see sample in ucd-snmp subdir in snmpvar.monitor distribution # cisco router # ciscoEnvMonTemperatureState Variable CISCO_TEMP_STATE OID .1.3.6.1.4.1.9.9.13.1.3.1.6. Description Chassis Temperature DefaultIndex 1 DefaultEQ 1 Decode 1 normal Decode 2 Warning Decode 3 CRITICAL Decode 4 SHUTDOWN Decode 5 not present DefaultGroup Environment Variable CISCO_MEM_POOL_FREE OID .1.3.6.1.4.1.9.9.48.1.1.1.6. Description Memory Pool Free Bytes DefaultIndex 1 2 FriendlyName 1 CPU FriendlyName 2 I/O # HP switch # hpicfSensorStatus Variable HP_ICF_FAN_STATE OID .1.3.6.1.4.1.11.2.14.11.1.2.6.1.4.1 Description Fan Status DefaultEQ 4 Decode 1 unknown Decode 2 bad Decode 3 warning Decode 4 good Decode 5 not present DefaultGroup Environment Variable HP_ICF_PSU_STATE OID .1.3.6.1.4.1.11.2.14.11.1.2.6.1.4. Description PSU Status DefaultEQ 4 Decode 1 unknown Decode 2 bad Decode 3 warning Decode 4 good Decode 5 not present DefaultGroup Power |
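The comment in snmpvar.def above says the Scale expression is used as eval($rawval . $scale). A minimal sketch of that step follows; the helper name apply_scale is an assumption for illustration, and the early return for a missing Scale is a simplification of what the monitor does.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a Scale expression from snmpvar.def to a raw
# SNMP value by appending it to the value and letting Perl evaluate it.
sub apply_scale {
    my ($rawval, $scale) = @_;

    # No Scale given: hand the raw value back unchanged.
    return $rawval unless defined $scale && $scale ne '';

    my $val = eval($rawval . ' ' . $scale);
    die "bad Scale expression '$scale': $@" if $@;
    return $val;
}

print apply_scale(734003200, '/ 1024 / 1024'), "\n";   # bytes to MB
print apply_scale(385,       '/ 10.0'),        "\n";   # tenths of a degree to C
```

Because the expression is passed to eval, the .def file has to be treated as trusted input: anyone who can edit it can run arbitrary Perl as the user mon runs under.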
From: Jim T. <tr...@us...> - 2004-06-28 14:38:09
|
Update of /cvsroot/mon/mon/doc In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv23926/doc Added Files: Tag: mon-1-0-0pre1 README.snmpvar.monitor Log Message: assimilated snmpvar.monitor version 1.6.0 from contrib --- NEW FILE: README.snmpvar.monitor --- snmpvar.monitor by P.Holzleitner What does it do? snmpvar.monitor is a plug-in for the "mon" systems monitoring package written by Jim Trockij (http://www.kernel.org/software/mon). Called by mon, it queries freely configurable values using SNMP, compares them against specified limits and reports any violation. Some parameters that can be monitored (just to give you an idea): Equipment operational status (temperature, fan rotation) UPS Status (line power / battery, minimum line voltage, load % ...) Switch/Router status (interface up, BGP session up, ...) Server status (redundant power supply OK, disk array OK, ...) Status of services (process running, mail queue length, ...) License GNU GPLv2 (http://www.fsf.org/licenses/gpl.txt) - See file COPYING Quick Start: * Make sure you have UCD SNMP 3.6.2+ (libraries) and the Perl SNMP module installed (http://www.cpan.org/misc/cpan-faq.html) * Copy snmpvar.mon to your mon.d directory * Copy snmpvar.def to /etc/mon, add your own variables * Copy snmpvar.cf to /etc/mon and edit to match your needs * Test from mon.d directory with ./snmpvar.monitor -l host1 host2 ... * Test again from mon.d directory with ./snmpvar.monitor host1 host2 ... 
* Add watch/service to mon.cf, using snmpvar.monitor Commandline options: --varconf=/path/to/snmpvar.def if neither /etc/mon nor /usr/lib/mon/etc --config=/path/to/snmpvar.cf if neither /etc/mon nor /usr/lib/mon/etc --community=your_SNMP_read_community if not 'public' --groups=Power,Disks test only a subset of variables for a host group --timeout=n SNMP GET timeout in seconds --retries=n number of times to retry the SNMP GET --debug tell what config is being useed --mibs='mib1:mib2:mibn' load specified MIBs --list[=linesperpage]] produce human-readable listing, not alarms For every host name passed on the command line, snmpval.monitor looks up the list of variables and corresponding limits in the configuration file (snmpmon.cf). If a --groups option is present, only those variables are checked which are in one of the specified groups. To specify more than one group, separate group names with commas. You can also exclude groups by prefixing the group name(s) with '-'. Don't mix in- and exclusion. Examples: --groups=Power only vars in the Power group --groups=Power,Env vars in the Power or Env group --groups=-Power,-Env all vars except those in Power or Env groups --groups=Power,-Env won't work (only the exclusions) For every such variable, it looks up the OID, description etc. from the variable definition file (snmpvar.def). This monitor looks for configuration files in the current directory, in /etc/mon and /usr/lib/mon/etc. Command line option --varconf overrides the location of the variable definition file, option --config sets the configuration file name. When invoked with the --list option, the output format is changed into a more human-readable form used to check and troubleshoot the configuration. This option must not be used from within MON. Exit values: 0 if everything is OK 1 if any observed value is outside the specified interval 2 in case of an SNMP error (e.g. 
no response from host) Basic Troubleshooting: use snmpvar.monitor --list option to see variable values use snmpwalk your_hostname public .1 | less to verify SNMP agent The snmpvar.def File: In this file we define variables that can be retrieved via SNMP. In a way, the .def file is snmpvar.monitor's idea of a MIB. Entries consist of a "Variable variable-name" declaration Variable PE4300_TEMP_MB [NOTE: The variable name cannot be "Host" or "FriendlyName"] followed by the mandatory specification of Object ID and Description: OID .1.3.6.1.4.1.674.10891.300.1.5.2.2.1.3 Description Motherboard Temperature It is suggested that OIDs be entered numerically as shown above in order to eliminate the need for having the SNMP libraries compile the relevant MIB files on every invocation of the monitor. By default, this monitor loads no MIBs. If you want to use symbolic OIDs, use the --mibs commandline option to specify which MIBs you need. By the author's convention, an OID describing an array of values, like ifOperStat which takes the interface number as an index, is written with a trailing dot, while OIDs of scalars end in a number. As of version 1.1.1, the monitor will insert the dot before the index if you forgot it in the .def file. Optional Elements of a Variable definition: DefaultIndex 3 4 5 A list of indices to test by default. Let's say the OID is .1.2.3. and DefaultIndex is "18 22 36", then the monitor will retrieve the values of .1.2.3.18, .1.2.3.22 and .1.2.3.36 when testing this variable, and will compare them all against the limits. Where necessary, the DefaultIndex can be overridden for one host/variable combination, using the Index statement in the .cf file. FriendlyName 3 Disk Fan 1 This lets you replace the standard display of "Variable [Index]", e.g. "Fan Speed [5]", with individual labels for each index. The FriendlyName option is typically specified in the .def file for items that have the same name for every use, e.g. 
component names like in the case of fans, power supplies etc. The same option exists in the .cf file to name a particular variable on a particular host, e.g. to display a line name instead of an interface number on a router. If the FriendlyName string begins with "@", the Description is substituted for the "@". Scale / 10.0 A formula to re-scale the value returned from the host. The expression is appended to the raw value and the resulting expression is evaluated by Perl. The raw value is available as $rawval if necessary. Unit C Used in value display / messages, Decode 1 unknown Decode 2 OK Decode 3 FAILURE Values retrieved through SNMP are often enumerations of status codes. The Decode statement lets you put text labels on these values. DefaultGroup Environment Defines that all, by default, instances of this variable go into the specified group. Individual overrides possible in .cf file. DefaultMin 300 DefaultMax 2000 DefaultEQ 1000 DefaultNEQ 1000 Default alarm limits. See description of Min/Max/EQ/NEQ below. The snmpvar.cf File: In here, you "call up" the variables to be retrieved for a particular host. Entries consist of a "Host host-name" declaration followed by at least one "variable-name [options ...]" line. Host ntserv1 This hostname corresponds to the hostname on the command line, i.e. the hostname you used in MON's hostgroup statement. FOO_FAN_RPM Min 1000 Max 5000 MaxValid 10000 Index 1 2 3 4 This example uses almost all options. It instructs the monitor to retrieve the OID specified under "FOO_FAN_RPM" in the .def file. Min 300 specifies a minimum value, measured >= minimum Max 2000 specifies a maximum value, measured <= maximum EQ 1000 specifies a exact value, measured == maximum NEQ 1000 specifies a exact value, measured != maximum If the measured value is outside of these limits, a failure is reported. To test for "Value = X", use "Min X Max X". MinValid -1 MaxValid 10000 Some monitoring hardware occasionally measures garbage. 
To avoid triggering an alarm when this happens, you can use MinValid/MaxValid to specify the range (inclusive) of plausible values for this variable. If the measured value exceeds these limits, only a warning will be generated, but no failure will be reported to MON. Group Environment Puts this particular variable into the specified group. Groups are used to test a partial set of the variables specified for a host, by using the --groups= command line option. Index 1 2 3 This tells the monitor which object instances (array elements) to test in case of a non-scalar object. Since the list of indices can be as long as necessary, the Index option must be the last one on the line (after Min X, Max Y etc.) The list specified as DefaultIndex in the .def file entry for this variable is used unless Index is pecified here. When retrieving a non-scalar value, the snmpvar.monitor will normally display the instances (array elements) by appending their index to the description, as in "Line Status [3]". Often, it is desirable to label individual instances in a more mnemonic way. To do this, you can add a number of FriendlyName directives after a variable request, like this: Host firewall IF_OPERSTAT Index 1 2 3 FriendlyName 1 1: Leased Line FriendlyName 2 2: DMZ FriendlyName 3 3: Internal Router In this case, the monitor checks the ifOperStat for interfaces 1, 2, and 3 on host "firewall". If interface 3 were not "up", the monitor would signal a failure of "Internal Router" instead of "ifOperStat [3]". If the FriendlyName string begins with "@", the Description is substituted for the "@". If all instances of this variable having the same index have the same meaning regardless of what host they are on, you can put the FriendlyName statement into te respective variable definition in the .def file instead. The snmpopt.cf File: This optional file is used to pass parameters to the SNMP library. 
For SNMPv1, this is generally not necessary unless the target's SNMP port differs from the default (161). Note that the SNMPv1 community string, timeout and retries can also be specified on the snmpvar.monitor command line, overriding any default or configuration file setting. You will need to edit this file in order to use SNMPv3. |
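The Scale and limit rules described in the README text above can be sketched in a few lines of Perl. This is only an illustration, not the code from snmpvar.monitor: the helper names scale_value and check_limits are hypothetical, and the choice to test MinValid/MaxValid before Min/Max/EQ/NEQ is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a Scale expression such as "/ 10.0".
# As the README describes, the expression is appended to the raw value
# and the result is evaluated by Perl; $rawval is visible to the expression.
# Only use this with expressions from a trusted configuration file.
sub scale_value {
    my ($rawval, $scale) = @_;
    return $rawval unless defined $scale && length $scale;
    my $scaled = eval "$rawval $scale";
    die "bad Scale expression '$scale': $@" if $@;
    return $scaled;
}

# Hypothetical helper: classify a value against the configured limits.
# "warning" = implausible reading (outside MinValid/MaxValid),
# "failure" = violates Min/Max/EQ/NEQ, "ok" otherwise.
# Assumption: the validity range is checked before the alarm limits.
sub check_limits {
    my ($value, %lim) = @_;
    return "warning" if defined $lim{MinValid} && $value < $lim{MinValid};
    return "warning" if defined $lim{MaxValid} && $value > $lim{MaxValid};
    return "failure" if defined $lim{Min} && $value < $lim{Min};
    return "failure" if defined $lim{Max} && $value > $lim{Max};
    return "failure" if defined $lim{EQ}  && $value != $lim{EQ};
    return "failure" if defined $lim{NEQ} && $value == $lim{NEQ};
    return "ok";
}

# Example: a fan reading reported in tenths of RPM, checked against
# the limits from the FOO_FAN_RPM line.
my $rpm = scale_value(42000, "/ 10.0");
print check_limits($rpm, Min => 1000, Max => 5000, MaxValid => 10000), "\n";
```

Keeping the plausibility check separate from the alarm limits is what lets a garbage reading produce a warning instead of a failure.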
From: David N. <vi...@us...> - 2004-06-27 19:05:49
|
Update of /cvsroot/mon/mon In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18179 Modified Files: mon Log Message: Save trap_timer during opstatus save/load. This is so that trap timeouts which are longer than how often you restart Mon can actually occur. Index: mon =================================================================== RCS file: /cvsroot/mon/mon/mon,v retrieving revision 1.8 retrieving revision 1.9 diff -C2 -d -r1.8 -r1.9 *** mon 24 Jun 2004 21:00:09 -0000 1.8 --- mon 27 Jun 2004 19:05:40 -0000 1.9 *************** *** 3743,3747 **** last_failure_time last_failure_summary last_failure_detail last_detail ack ack_comment last_trap last_traphost exitval ! last_check last_op_status failure_output)) { print STATE "\t$var=" . esc_str($watch{$group}->{$service}->{"_$var"}); } --- 3743,3747 ---- last_failure_time last_failure_summary last_failure_detail last_detail ack ack_comment last_trap last_traphost exitval ! last_check last_op_status failure_output trap_timer)) { print STATE "\t$var=" . esc_str($watch{$group}->{$service}->{"_$var"}); } |
From: Jim T. <tr...@us...> - 2004-06-24 21:00:17
|
Update of /cvsroot/mon/mon In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv12657 Modified Files: mon Log Message: slightly reformatted the alertevery handling code in do_alert so that it's a little easier to read, and i added blow-by-blow comments to the right margin. ported "alertevery .. strict" code from the mon-1-0-0pre1 branch. tidied up the parsing of alertevery in read_cf. made all the options exclusive because they are, and the config parser shouldn't accept multiples of them. Index: mon =================================================================== RCS file: /cvsroot/mon/mon/mon,v retrieving revision 1.7 retrieving revision 1.8 diff -C2 -d -r1.7 -r1.8 *** mon 24 Jun 2004 19:11:54 -0000 1.7 --- mon 24 Jun 2004 21:00:09 -0000 1.8 *************** *** 670,683 **** # # only alert once every "alertevery" seconds, unless ! # output from monitor is different # ! if ($pref->{"alertevery"} != 0 ! && ($tmnow - $pref->{"_last_alert"} < $pref->{"alertevery"}) ! && (($pref->{"_observe_detail"} ! && $sref->{"_failure_output"} eq $output) ! || (!$pref->{"_observe_detail"} ! && (!$pref->{"_ignore_summary"}) ! && ($prevsumm eq $summary)) ! || ($pref->{"_ignore_summary"}))) { syslog ('debug', "$group/$service/$periodlabel: Suppressing alert for now due to alertevery."); --- 670,686 ---- # # only alert once every "alertevery" seconds, unless ! # output from monitor is different or if strict alertevery # ! # strict and _ignore_summary are basically the same though ! # strict short-circuits and overrides other settings and exists ! # for compatibility with pre-1.1 configs ! # ! if ($pref->{"alertevery"} != 0 && # if alertevery is set and ! ($tmnow - $pref->{"_last_alert"} < $pref->{"alertevery"}) && # we're within the time period and one of these: ! (($pref->{"_alertevery_strict"}) || # [ strict is set or ! ($pref->{"_observe_detail"} && $sref->{"_failure_output"} eq $output) || # observing detail and output hasn't changed or ! 
(!$pref->{"_observe_detail"} && (!$pref->{"_ignore_summary"}) && ($prevsumm eq $summary)) || # not observing detail ! # and not ignoring summary and summ hasn't changed or ! ($pref->{"_ignore_summary"}))) # we're ignoring summary changes ] { syslog ('debug', "$group/$service/$periodlabel: Suppressing alert for now due to alertevery."); *************** *** 1451,1466 **** elsif ($var eq "alertevery") { ! my $observe_detail = 0; ! my $ignore_summary = 0; if ($args =~ /(\S+) \s+ observe_detail \s*$/ix) { ! $observe_detail = 1; $args = $1; } ! if ($args =~ /(\S+) \s+ ignore_summary \s*$/ix) { ! $ignore_summary = 1; $args = $1; } --- 1454,1470 ---- elsif ($var eq "alertevery") { ! $pref->{"_observe_detail"} = 0; ! $pref->{"_alertevery_strict"} = 0; ! $pref->{"_ignore_summary"} = 0; if ($args =~ /(\S+) \s+ observe_detail \s*$/ix) { ! $pref->{"_observe_detail"} = 1; $args = $1; } ! elsif ($args =~ /(\S+) \s+ ignore_summary \s*$/ix) { ! $pref->{"_ignore_summary"} = 1; $args = $1; } *************** *** 1474,1485 **** } if (!($args = dhmstos ($args))) { close (CFG); ! return "cf error: invalid time interval '$args' (syntax: alertevery {positive number}{smhd}), line $line_num"; } $pref->{"alertevery"} = $args; - $pref->{"_observe_detail"} = $observe_detail; - $pref->{"_ignore_summary"} = $ignore_summary; next; } --- 1478,1496 ---- } + # + # strict + # + elsif ($args =~ /(\S+) \s+ strict \s*$/ix) + { + $pref->{"_alertevery_strict"} = 1; + $args = $1; + } + if (!($args = dhmstos ($args))) { close (CFG); ! return "cf error: invalid time interval '$args' (syntax: alertevery {positive number}{smhd} [ strict | observe_detail | ignore_summary ]), line $line_num"; } $pref->{"alertevery"} = $args; next; } *************** *** 3275,3279 **** foreach my $period (keys %{$sref->{"periods"}}) { ! 
$sref->{"periods"}->{$period}->{"_last_alert"} = 0; $sref->{"periods"}->{$period}->{"_1stfailtime"} = 0; $sref->{"periods"}->{$period}->{"_alert_sent"} = 0; --- 3286,3297 ---- foreach my $period (keys %{$sref->{"periods"}}) { ! # ! # "alertevery strict" should not reset _last_alert ! # ! if (!$sref->{"periods"}->{$period}->{"_alertevery_strict"}) ! { ! $sref->{"periods"}->{$period}->{"_last_alert"} = 0; ! } ! $sref->{"periods"}->{$period}->{"_1stfailtime"} = 0; $sref->{"periods"}->{$period}->{"_alert_sent"} = 0; |
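The alertevery change above can be read as two small pieces of logic: parsing one optional trailing keyword, and deciding whether to suppress an alert. The following is a standalone sketch of that logic, not code from mon. The sub names are hypothetical, interval_to_seconds is only a stand-in for mon's dhmstos (its unit handling is an assumption), and the settings are kept in a plain hash rather than in $pref.

```perl
use strict;
use warnings;

# Hypothetical stand-in for dhmstos(): "30s", "10m", "2h", "1d" to seconds.
# Assumption: a bare number means seconds. Returns undef on bad input.
sub interval_to_seconds {
    my ($spec) = @_;
    return unless defined $spec && $spec =~ /^(\d+(?:\.\d+)?)([smhd]?)$/i;
    my %mult = (s => 1, m => 60, h => 3600, d => 86400);
    return $1 * $mult{ lc($2) || 's' };
}

# Parse the arguments of an "alertevery" line. At most one trailing
# option is accepted, mirroring the if/elsif chain in the diff.
sub parse_alertevery {
    my ($args) = @_;
    $args =~ s/^\s+|\s+$//g;
    my %p = (observe_detail => 0, ignore_summary => 0, strict => 0);
    if ($args =~ /^(\S+)\s+(observe_detail|ignore_summary|strict)$/i) {
        $p{ lc $2 } = 1;
        $args = $1;
    }
    my $secs = interval_to_seconds($args);
    return unless $secs;            # invalid or zero interval
    $p{alertevery} = $secs;
    return \%p;
}

# Decide whether to suppress an alert, following the condition in do_alert.
# Keys: now, last_alert, settings, output_same, summary_same.
sub suppress_alert {
    my (%a) = @_;
    my $p = $a{settings};
    return 0 if !$p->{alertevery};                              # alertevery not set
    return 0 if $a{now} - $a{last_alert} >= $p->{alertevery};   # period has elapsed
    return 1 if $p->{strict};                                   # strict short-circuits
    return 1 if $p->{observe_detail} && $a{output_same};
    return 1 if !$p->{observe_detail} && !$p->{ignore_summary} && $a{summary_same};
    return 1 if $p->{ignore_summary};
    return 0;
}

my $settings = parse_alertevery("10m strict");
print suppress_alert(now => 1000, last_alert => 700, settings => $settings,
                     output_same => 0, summary_same => 0)
    ? "suppress\n" : "alert\n";
```

With "strict", a changed summary or changed output makes no difference inside the interval, which is why it appears first among the alternatives.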
From: Jim T. <tr...@us...> - 2004-06-24 19:12:08
|
Update of /cvsroot/mon/mon In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv21442 Modified Files: mon Log Message: fixed a bug which would prevent alerts or upalerts from being sent when call_alert is passed an "output" argument whose value is undef Index: mon =================================================================== RCS file: /cvsroot/mon/mon/mon,v retrieving revision 1.6 retrieving revision 1.7 diff -C2 -d -r1.6 -r1.7 *** mon 24 Jun 2004 19:09:26 -0000 1.6 --- mon 24 Jun 2004 19:11:54 -0000 1.7 *************** *** 5014,5019 **** group service flags retval alert output ! )) { ! return (undef) if (!defined $args{$mandatory_arg}); } --- 5014,5024 ---- group service flags retval alert output ! )) ! { ! if (!exists $args{$mandatory_arg}) ! { ! debug (1, "returning from call_alert because of missing arg $mandatory_arg\n"); ! return (undef); ! } } |
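The fix above rests on the difference between defined and exists on a hash element: a key whose value is undef exists but is not defined. A minimal sketch of that difference follows. The sub missing_args and the sample values are hypothetical, and the argument names are just the ones visible in the diff context.

```perl
use strict;
use warnings;

# Hypothetical reduction of the mandatory-argument check in call_alert.
# $use_exists selects the new behaviour (exists) over the old one (defined).
sub missing_args {
    my ($use_exists, %args) = @_;
    my @missing;
    for my $name (qw(group service flags retval alert output)) {
        my $present = $use_exists ? exists($args{$name}) : defined($args{$name});
        push @missing, $name unless $present;
    }
    return @missing;
}

# "output" is passed, but its value is undef.
my %call = (group => "web", service => "http", flags => 0,
            retval => 1, alert => "mail.alert", output => undef);

print "defined check rejects: @{[ missing_args(0, %call) ]}\n";
print "exists check rejects: @{[ missing_args(1, %call) ]}\n";
```

The defined check treats the undef "output" as missing and would return early. The exists check lets the call proceed.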
From: Jim T. <tr...@us...> - 2004-06-24 19:09:36
|
Update of /cvsroot/mon/mon In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20879 Modified Files: mon Log Message: no_comp_alerts patch from Daniel Fenert <da...@fe...> Index: mon =================================================================== RCS file: /cvsroot/mon/mon/mon,v retrieving revision 1.5 retrieving revision 1.6 diff -C2 -d -r1.5 -r1.6 *** mon 14 Jun 2004 11:29:47 -0000 1.5 --- mon 24 Jun 2004 19:09:26 -0000 1.6 *************** *** 28,32 **** my $RCSID='$Id$'; my $AUTHOR='tr...@tr...'; ! my $RELEASE='$ProjectVersion: mon-0-99-3.47 $'; # --- 28,32 ---- my $RCSID='$Id$'; my $AUTHOR='tr...@tr...'; ! my $RELEASE='$Name$'; # *************** *** 637,640 **** --- 637,648 ---- } + # + # skip looping upalerts when "no_comp-alerts" set. + # + if ($pref->{"no_comp_alerts"} && ($flags & $FL_UPALERT) && ($pref->{"_no_comp_alerts_upalert_sent"}>0)) + { + next; + } + # # do this if we're not handling an upalert, startupalert, ackalert, or disablealert *************** *** 805,808 **** --- 813,828 ---- else { $pref->{"_alert_sent"}++; + + # + # reset _no_comp_alerts_upalert_sent counter - when service will be + # back up, upalert will be sent. + # + if ($pref->{"no_comp_alerts"}) { + $pref->{"_no_comp_alerts_upalert_sent"} = 0; + } + } + + if ($pref->{"no_comp_alerts"} && ($flags & $FL_UPALERT)) { + $pref->{"_no_comp_alerts_upalert_sent"}++; } } *************** *** 1389,1392 **** --- 1409,1413 ---- $pref->{"_alert_sent"} = 0; $pref->{"no_comp_alerts"} = 0; + $pref->{"_no_comp_alerts_upalert_sent"} = 0; @{$pref->{"alerts"}} = (); @{$pref->{"upalerts"}} = (); |
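The counter added by this patch can be pictured as a small state machine. The sketch below is a simplified model under stated assumptions, not the code in do_alert: new_period and notify are hypothetical names, only the three fields named in the patch are modelled, and every notification that is not skipped is assumed to be sent.

```perl
use strict;
use warnings;

# Hypothetical per-period state, using the field names from the patch.
sub new_period {
    my %state = (
        no_comp_alerts               => 1,
        _alert_sent                  => 0,
        _no_comp_alerts_upalert_sent => 0,
    );
    return \%state;
}

# Returns 1 if the notification would go out, 0 if it is skipped.
sub notify {
    my ($pref, $is_upalert) = @_;

    # Skip repeated upalerts once one has already been sent.
    return 0 if $pref->{no_comp_alerts} && $is_upalert
             && $pref->{_no_comp_alerts_upalert_sent} > 0;

    if ($is_upalert) {
        $pref->{_no_comp_alerts_upalert_sent}++ if $pref->{no_comp_alerts};
    }
    else {
        $pref->{_alert_sent}++;
        # A new failure alert re-arms the upalert for the next recovery.
        $pref->{_no_comp_alerts_upalert_sent} = 0 if $pref->{no_comp_alerts};
    }
    return 1;
}

# Sequence: failure, recovery, recovery again, failure, recovery.
my $period = new_period();
print join(" ", map { notify($period, $_) } (0, 1, 1, 0, 1)), "\n";
```

In this model the second consecutive upalert is the only notification that is dropped.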
From: Jim T. <tr...@us...> - 2004-06-24 18:46:58
|
Update of /cvsroot/mon/mon/clients In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv15795 Modified Files: Tag: mon-1-0-0pre1 moncmd Log Message: added list of all possible client commands Index: moncmd =================================================================== RCS file: /cvsroot/mon/mon/clients/moncmd,v retrieving revision 1.1.1.1 retrieving revision 1.1.1.1.2.1 diff -C2 -d -r1.1.1.1 -r1.1.1.1.2.1 *** moncmd 9 Jun 2004 05:18:07 -0000 1.1.1.1 --- moncmd 24 Jun 2004 18:46:48 -0000 1.1.1.1.2.1 *************** *** 216,243 **** Valid commands are: ! quit ! reset [stopped] ! term ! list group "groupname" ! list disabled list alerthist list failurehist - list successes list failures list opstatus list pids list watch - stop - start loadstate ! savestate set "group" "service" "variable" "value" ! get "group" "service" "variable" ! disable service "group" "service" ! disable host "host" ["host"...] ! disable watch "watch" ! enable service "group" "service" ! enable host "host" ["host"...] ! enable watch "watch" EOF exit 0; --- 216,261 ---- Valid commands are: ! ack "watch" "service" comment ! checkauth cmd [args] ! clear "watch" "service" ! disable host "host" ["host"...] ! disable service "group" "service" ! disable watch "watch" ! dump ! enable host "host" ["host"...] ! enable service "group" "service" ! enable watch "watch" ! get "group" "service" "variable" list alerthist + list aliases + list aliasgroups + list deps + list descriptions + list disabled + list dtlog list failurehist list failures + list group "groupname" list opstatus list pids + list state + list successes + list warnings list watch loadstate ! protid ! quit ! reload ! reset [stopped] [keepstate] ! savestate disabled ! servertime set "group" "service" "variable" "value" ! start ! stop ! term ! test config ! test monitor "watch" "service" ! test {alert|startupalert|upalert} "watch" "service" "retval" "period" ! version EOF exit 0; |
From: Jim T. <tr...@us...> - 2004-06-24 14:18:40
|
Update of /cvsroot/mon/mon In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv24150 Modified Files: Tag: mon-1-0-0pre1 mon Log Message: no_comp_alerts patch from Daniel Fenert <da...@fe...> Index: mon =================================================================== RCS file: /cvsroot/mon/mon/mon,v retrieving revision 1.4.2.5 retrieving revision 1.4.2.6 diff -C2 -d -r1.4.2.5 -r1.4.2.6 *** mon 24 Jun 2004 14:02:49 -0000 1.4.2.5 --- mon 24 Jun 2004 14:18:25 -0000 1.4.2.6 *************** *** 572,575 **** --- 572,583 ---- # + # skip looping upalerts when "no_comp-alerts" set. + # + if ($pref->{"no_comp_alerts"} && ($flags & $FL_UPALERT) && ($pref->{"_no_comp_alerts_upalert_sent"}>0)) + { + next; + } + + # # do this if we're not handling an upalert or startupalert # *************** *** 721,725 **** --- 729,745 ---- else { $pref->{"_alert_sent"}++; + + # + # reset _no_comp_alerts_upalert_sent counter - when service will be + # back up, upalert will be sent. + # + if ($pref->{"no_comp_alerts"}) { + $pref->{"_no_comp_alerts_upalert_sent"} = 0; + } } + + if ($pref->{"no_comp_alerts"} && ($flags & $FL_UPALERT)) { + $pref->{"_no_comp_alerts_upalert_sent"}++; + } } } *************** *** 1246,1249 **** --- 1266,1270 ---- $pref->{"_alert_sent"} = 0; $pref->{"no_comp_alerts"} = 0; + $pref->{"_no_comp_alerts_upalert_sent"} = 0; @{$pref->{"alerts"}} = (); @{$pref->{"upalerts"}} = (); |