mon-devel Mailing List for mon (Page 5)
Brought to you by:
trockij
From: David N. <vit...@cm...> - 2007-03-08 05:03:50
|
On 3/7/07, Augie Schwer <aug...@gm...> wrote:
> It seems that mon's ack system is flawed at least if it is used how
> the examples show multiple hosts in a watch group.
>
> Ack'ing an alert acks the service in the group, not the host in the
> group that is alerting, so for example if you have a host group for
> your web servers and you watch http; if http alerts on one of the
> hosts and you ack it, the rest of your web servers could go down and
> you would never know about it because the other host's http alerts
> would be suppressed.
>
> Is this expected behavior? Am I wrong to think that this is a flaw?

You are correct that the old mon 0.99.2 code exhibits this behavior. The more recent code in CVS has a configurable feature that causes mon to remove the ack state from a service if the summary component of the failure message changes. In most common usage the summary is the list of hosts that are failing, so additional hosts failing would remove an ack.

There has also been some discussion in the past of adding true per-host status tracking to Mon, but that proposal has never been followed through on. (IIRC, we got bogged down in discussion of how we would need to add structure to the data communicated between mon and the monitor/alert scripts, and how to maintain backwards compatibility with existing scripts.)

> I know the mon project is pretty much not maintained anymore, so if I
> don't get any response back I won't be surprised, but I thought I
> would float this question out there and see if I get any responses.

While that's an understandable conclusion based on the lack of a stable release in approximately forever, there has been a lot of work since the last release. The lack of a (declared) stable release has been in part because of a lack of feedback on the development versions. In fact I posted a release candidate for mon 1.2.0 back in September (http://www.managedandmonitored.net/mon/) but I have received almost no feedback on this version. In many cases I assume the mon users just haven't had the opportunity to replace known-working systems or set up parallel monitoring infrastructure.

-David |
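The summary-change behavior described in this reply can be sketched in a few lines of Perl. This is only an illustration under stated assumptions: the `%service` hash, its `acked` and `last_summary` fields, and the `record_failure` and `ack` helpers are hypothetical names, not mon's actual internals.

```perl
use strict;
use warnings;

# Hypothetical per-service state; mon's real internal fields may differ.
my %service = (acked => 0, last_summary => undef);

# Mark the service as acknowledged.
sub ack { $_[0]{acked} = 1 }

# Record a failure. Returns 1 if an alert should be sent, 0 if it is
# suppressed by an ack that still covers the same summary.
sub record_failure {
    my ($svc, $summary) = @_;
    my $prev = $svc->{last_summary};
    if ($svc->{acked} && defined $prev && $prev ne $summary) {
        $svc->{acked} = 0;    # the set of failing hosts changed: drop the ack
    }
    $svc->{last_summary} = $summary;
    return $svc->{acked} ? 0 : 1;
}

print record_failure(\%service, "web1"), "\n";        # 1: first failure alerts
ack(\%service);
print record_failure(\%service, "web1"), "\n";        # 0: same summary, still acked
print record_failure(\%service, "web1 web2"), "\n";   # 1: new host failed, ack removed
```

With the summary being the list of failing hosts, a second web server going down changes the summary and the alert is no longer suppressed.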
From: Augie S. <aug...@gm...> - 2007-03-07 20:03:21
|
It seems that mon's ack system is flawed at least if it is used how the examples show multiple hosts in a watch group.

Ack'ing an alert acks the service in the group, not the host in the group that is alerting, so for example if you have a host group for your web servers and you watch http; if http alerts on one of the hosts and you ack it, the rest of your web servers could go down and you would never know about it because the other host's http alerts would be suppressed.

Is this expected behavior? Am I wrong to think that this is a flaw? It seems like the only way to use mon and not be bitten by this is to only have one host per host group.

I know the mon project is pretty much not maintained anymore, so if I don't get any response back I won't be surprised, but I thought I would float this question out there and see if I get any responses.

--
Augie Schwer - Augie@Schwer.us - http://schwer.us
Key fingerprint = 9815 AE19 AFD1 1FE7 5DEE 2AC3 CB99 2784 27B0 C072 |
From: Peter W. \(MO/EMW\) <pet...@er...> - 2006-02-07 14:30:38
|
Just checked the monitors included in mon-1.1.0pre1.tar.gz

% grep "use SNMP" *
asyncreboot.monitor:use SNMP 1.8;
cpqhealth.monitor:use SNMP;
foundry-chassis.monitor:use SNMP;
hpnp.monitor:use SNMP;
na_quota.monitor:use SNMP;
netappfree.monitor:use SNMP;
process.monitor:use SNMP;
reboot.monitor:use SNMP;
silkworm.monitor:use SNMP;
snmpvar.monitor:use SNMP;
xedia-ipsec-tunnel.monitor:use SNMP;
% grep "use Net::SNMP" *

zip, zero nothing...but I can't find my netsnmp.monitor in the mon.d directory either... ;-)

So I might be better off installing the SNMP module in my network...

/Peter

> -----Original Message-----
> From: mon...@li...
> [mailto:mon...@li...]On Behalf Of Peter Wirdemo (MO/EMW)
> Sent: den 7 februari 2006 11:45
> To: mon...@li...
> Subject: RE: [Mon-devel] Re: Announcing ospf.monitor (beta-1 :-)
>
> > -----Original Message-----
> > From: Ed Ravin [mailto:er...@pa...]
> > Sent: den 6 februari 2006 19:44
> > To: Jim Trocki
> > Cc: Ed Ravin; Peter Wirdemo (MO/EMW); mon...@li...
> > Subject: Re: [Mon-devel] Re: Announcing ospf.monitor (beta-1 :-)
>
> The SNMP module requires the Net-SNMP package to be installed, which is not required for MON.
> From the README in SNMP-4.2.0 (which I got using CPAN):
>
>     SNMP module version 4.2.0 is being developed against NET-SNMP-4.2.0
>     see http://sourceforge.net/projects/net-snmp for details.
>
>     Compatibility with earlier or later versions of Net-SNMP or UCD-SNMP
>     is not guaranteed due to the dynamic nature of open software
>     development :).
>
> Net-SNMP package is now 5.3.0.1 !
>
> SNMP
>     The Perl5 'SNMP' Extension Module v3.1.0 for the UCD SNMPv3 Library
>     SNMP-4.2.0 - 12 Feb 2001 - Joe Marzot
>
> Net::SNMP
>     Object oriented interface to SNMP
>     Net-SNMP-5.2.0 - 20 Oct 2005 - David M. Town
>
> I don't know about stability, but I've never had problems using Net::SNMP
>
> > On Mon, Feb 06, 2006 at 01:29:20PM -0500, Jim Trocki wrote:
> > > The "heavy" SNMP includes the ability to parse MIBs, for one thing.
> >
> > Which I deliberately didn't use to keep the monitor a bit simpler,
> > since I only had a handful of OIDs to fetch.
>
> I think most monitors are like this, get a few OIDs, not the whole MIB
>
> > > It also has asynchronous mode operation, which is very useful
> > > for when you need to gather a bunch of tables from a number of
> > > different hosts.
>
> Net::SNMP supports blocking and non-blocking mode!
>
> > Now that's a good reason for using it - especially since one potential
> > user of this script said he had 300 routers to poll. In my network
> > it doesn't matter since (a) I have nowhere near that many routers and
> > (b) for various other reasons, I make each router a separate entry in
> > Mon.
> >
> > > Long ago I embarked
> > > on designing an OSPF monitor, and I realized that for it to be efficient I
> > > needed to take advantage of this asynch. capability, and I wrote some
> > > example code to rapidly fetch tables from dozens of snmp agents in parallel.
> > > I'll post that code here so that maybe someone could integrate it into
> > > Ed's OSPF monitor.
> >
> > It would be nice to have a sample monitor that had working async SNMP
> > fetching, since many other monitors would benefit from it. I've had
> > trouble with some of my SNMP tests - they might work better if I gave
> > them a longer timeout, but the serialization of the timeouts means the
> > monitor might take longer than its 5 minute polling interval if enough
> > hosts timed out.
> >
> > -- Ed
>
> _______________________________________________
> Mon-devel mailing list
> Mon...@li...
> https://lists.sourceforge.net/lists/listinfo/mon-devel |
From: Peter W. \(MO/EMW\) <pet...@er...> - 2006-02-07 10:46:35
|
> -----Original Message-----
> From: Ed Ravin [mailto:er...@pa...]
> Sent: den 6 februari 2006 19:44
> To: Jim Trocki
> Cc: Ed Ravin; Peter Wirdemo (MO/EMW); mon...@li...
> Subject: Re: [Mon-devel] Re: Announcing ospf.monitor (beta-1 :-)

The SNMP module requires the Net-SNMP package to be installed, which is not required for MON.
From the README in SNMP-4.2.0 (which I got using CPAN):

    SNMP module version 4.2.0 is being developed against NET-SNMP-4.2.0
    see http://sourceforge.net/projects/net-snmp for details.

    Compatibility with earlier or later versions of Net-SNMP or UCD-SNMP
    is not guaranteed due to the dynamic nature of open software
    development :).

Net-SNMP package is now 5.3.0.1 !

SNMP
    The Perl5 'SNMP' Extension Module v3.1.0 for the UCD SNMPv3 Library
    SNMP-4.2.0 - 12 Feb 2001 - Joe Marzot

Net::SNMP
    Object oriented interface to SNMP
    Net-SNMP-5.2.0 - 20 Oct 2005 - David M. Town

I don't know about stability, but I've never had problems using Net::SNMP

> On Mon, Feb 06, 2006 at 01:29:20PM -0500, Jim Trocki wrote:
> > The "heavy" SNMP includes the ability to parse MIBs, for one thing.
>
> Which I deliberately didn't use to keep the monitor a bit simpler,
> since I only had a handful of OIDs to fetch.

I think most monitors are like this, get a few OIDs, not the whole MIB

> > It also has asynchronous mode operation, which is very useful
> > for when you need to gather a bunch of tables from a number of
> > different hosts.

Net::SNMP supports blocking and non-blocking mode!

> Now that's a good reason for using it - especially since one potential
> user of this script said he had 300 routers to poll. In my network
> it doesn't matter since (a) I have nowhere near that many routers and
> (b) for various other reasons, I make each router a separate entry in
> Mon.
>
> > Long ago I embarked
> > on designing an OSPF monitor, and I realized that for it to be efficient I
> > needed to take advantage of this asynch. capability, and I wrote some
> > example code to rapidly fetch tables from dozens of snmp agents in parallel.
> > I'll post that code here so that maybe someone could integrate it into
> > Ed's OSPF monitor.
>
> It would be nice to have a sample monitor that had working async SNMP
> fetching, since many other monitors would benefit from it. I've had
> trouble with some of my SNMP tests - they might work better if I gave
> them a longer timeout, but the serialization of the timeouts means the
> monitor might take longer than its 5 minute polling interval if enough
> hosts timed out.
>
> -- Ed |
From: Jim T. <tr...@ar...> - 2006-02-06 19:17:41
|
On Mon, 6 Feb 2006, Ed Ravin wrote:
> It would be nice to have a sample monitor that had working async SNMP
> fetching

the asynch fetch code i wrote is here, and it should be easy to convert this into a sample monitor:

http://arctic.org/~trockij/async-table

and here are some old results that show you the performance of this puppy:

./async-table bd2-edge{1,2,3} bd2-agg{1,2} bd2-core{1,2,3,4}

it's *fast*! i can get the ifTable (hundreds of rows each) from all those routers in a couple of seconds if i do the query from a host on the LAN!

this is the bandwidth utilization when doing the test via my 144K DSL line (read the first row as 55kbit/s input / 51 kbit/s output, "o" means output, "I" means input, the crude ascii plot shows the lower value superimposed over the larger value, "=" means the input+output values were equal):

async table get:

2.00user 0.14system 0:11.46elapsed 18%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (444major+377minor)pagefaults 0swaps

01/08 23:13:44 ooooooooooooooooooI | 55 /51
01/08 23:13:46 oooooooooooooooooooooooooooooooooooooII | 114 /107
01/08 23:13:48 oooooooooooooooooooooooooooII | 82 /78
01/08 23:13:50 oooooooooooooooooooooooooooooI | 87 /85
01/08 23:13:52 ooooooooooooooooooooooooooooooI | 88 /85
01/08 23:13:54 oooooooooooooooooooI | 57 /55

sync table get:

2.26user 0.19system 0:35.42elapsed 6%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (440major+320minor)pagefaults 0swaps

01/08 23:13:58 ooI | 6 /5
01/08 23:14:00 ooooooooooI | 29 /29
01/08 23:14:02 ooooooooooI | 32 /29
01/08 23:14:00 ooooooooooI | 29 /29
01/08 23:14:02 ooooooooooI | 32 /29
01/08 23:14:04 =========== | 31 /30
01/08 23:14:06 =========== | 31 /30
01/08 23:14:08 =========== | 31 /29
01/08 23:14:10 ooooooooooI | 29 /29
01/08 23:14:12 =========== | 30 /29
01/08 23:14:14 =========== | 31 /30
01/08 23:14:16 =========== | 32 /29
01/08 23:14:18 =========== | 30 /29
01/08 23:14:20 ooooooooooI | 30 /29
01/08 23:14:22 =========== | 31 /30
01/08 23:14:24 =========== | 31 /30
01/08 23:14:26 ooooooooooI | 31 /29
01/08 23:14:28 oooooooooooI | 33 /29
01/08 23:14:30 ooooooooooI | 31 /27
01/08 23:14:32 ooooooooooI | 30 /27
01/08 23:14:34 | 0 /0

SIGNIFICANT improvement by doing the thing with the async API! |
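To put a number on the improvement, the two elapsed times reported by `time` in this message (0:11.46 async, 0:35.42 sync) can be compared directly. The small helper below is only a worked calculation; `elapsed_seconds` is a name made up for this sketch.

```perl
use strict;
use warnings;

# Convert a GNU time "m:ss.ss" elapsed field to seconds.
sub elapsed_seconds {
    my ($field) = @_;
    my ($min, $sec) = split /:/, $field;
    return $min * 60 + $sec;
}

my $async = elapsed_seconds("0:11.46");
my $sync  = elapsed_seconds("0:35.42");

# prints: async 11.46s, sync 35.42s, speedup 3.1x
printf "async %.2fs, sync %.2fs, speedup %.1fx\n",
    $async, $sync, $sync / $async;
```

So over the 144K DSL line the async fetch finished in roughly a third of the wall-clock time, which matches the plot: the async run keeps the link at 55 to 114 kbit/s while the sync run sits near 30 kbit/s.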
From: Ed R. <er...@pa...> - 2006-02-06 18:43:39
|
On Mon, Feb 06, 2006 at 01:29:20PM -0500, Jim Trocki wrote:
> The "heavy" SNMP includes the ability to parse MIBs, for one thing.

Which I deliberately didn't use to keep the monitor a bit simpler, since I only had a handful of OIDs to fetch.

> It also has asynchronous mode operation, which is very useful
> for when you need to gather a bunch of tables from a number of
> different hosts.

Now that's a good reason for using it - especially since one potential user of this script said he had 300 routers to poll. In my network it doesn't matter since (a) I have nowhere near that many routers and (b) for various other reasons, I make each router a separate entry in Mon.

> Long ago I embarked
> on designing an OSPF monitor, and I realized that for it to be efficient I
> needed to take advantage of this asynch. capability, and I wrote some
> example code to rapidly fetch tables from dozens of snmp agents in parallel.
> I'll post that code here so that maybe someone could integrate it into
> Ed's OSPF monitor.

It would be nice to have a sample monitor that had working async SNMP fetching, since many other monitors would benefit from it. I've had trouble with some of my SNMP tests - they might work better if I gave them a longer timeout, but the serialization of the timeouts means the monitor might take longer than its 5 minute polling interval if enough hosts timed out.

-- Ed |
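The point about serialized timeouts can be made concrete with a little arithmetic. The sketch below assumes a fixed timeout per attempt and uses illustrative values (10 second timeout, 2 retries, 300 second polling interval); none of these numbers come from the thread except the 5 minute interval, and real SNMP libraries may use a different retry schedule.

```perl
use strict;
use warnings;

# Worst-case seconds spent waiting on unreachable hosts when they are
# polled one after another. Assumes a fixed timeout per attempt.
sub serial_wait {
    my ($dead_hosts, $timeout, $retries) = @_;
    return $dead_hosts * $timeout * (1 + $retries);
}

# With parallel (async) polling the waits overlap, so the worst case is
# roughly one host's worth of timeouts no matter how many hosts are down.
sub parallel_wait {
    my ($dead_hosts, $timeout, $retries) = @_;
    return $dead_hosts > 0 ? $timeout * (1 + $retries) : 0;
}

my ($timeout, $retries, $interval) = (10, 2, 300);   # illustrative values
for my $dead (1, 5, 10, 20) {
    my $serial = serial_wait($dead, $timeout, $retries);
    printf "%2d dead hosts: serial %4ds, parallel %2ds%s\n",
        $dead, $serial, parallel_wait($dead, $timeout, $retries),
        $serial >= $interval ? "  (uses up the ${interval}s interval)" : "";
}
```

Under these assumptions ten unreachable hosts are enough for a serial monitor to consume its whole polling interval, while the parallel worst case stays at 30 seconds.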
From: Jim T. <tr...@ar...> - 2006-02-06 18:29:28
|
On Mon, 6 Feb 2006, Ed Ravin wrote:
> [cc'ing Mon developers since there's a system-wide issue here]
>
> On Mon, Feb 06, 2006 at 01:03:30PM +0100, Peter Wirdemo (MO/EMW) wrote:
>> Just wondering, why use the rather "heavy" SNMP module (or is it
>> just me that doesn't have it installed :-)
>
> Inertia - that's what bgp.monitor had been using, so I just copied it.
>
>> The Net::SNMP is more "Lightweighted" and easier to install or?
>>
>> For your convenience, I have converted your "ospf.monitor" to
>> use the "Net::SNMP" module...

The "heavy" SNMP includes the ability to parse MIBs, for one thing. It also has asynchronous mode operation, which is very useful for when you need to gather a bunch of tables from a number of different hosts.

Long ago I embarked on designing an OSPF monitor, and I realized that for it to be efficient I needed to take advantage of this asynch. capability, and I wrote some example code to rapidly fetch tables from dozens of snmp agents in parallel. I'll post that code here so that maybe someone could integrate it into Ed's OSPF monitor. |
From: Ed R. <er...@pa...> - 2006-02-06 18:18:17
|
[cc'ing Mon developers since there's a system-wide issue here]

On Mon, Feb 06, 2006 at 01:03:30PM +0100, Peter Wirdemo (MO/EMW) wrote:
> Just wondering, why use the rather "heavy" SNMP module (or is it
> just me that doesn't have it installed :-)

Inertia - that's what bgp.monitor had been using, so I just copied it.

> The Net::SNMP is more "Lightweighted" and easier to install or?
>
> For your convenience, I have converted your "ospf.monitor" to
> use the "Net::SNMP" module...

Thanks, I will try out your version and use that for the release if it works properly for me, since Net::SNMP is substantially newer than SNMP/SNMP_Session and I'll take your word for its other advantages.

> It would be a nice survey to see which modules mon users have
> installed on their systems, could be a nice thing on the mon
> homepage.

We should probably standardize the supported monitors so they use the same SNMP modules whenever possible. Or at least recommend which modules to use for new monitors.

> B.t.w good and nice monitor...

Thanks! And I appreciate the feedback...

-- Ed |
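For the survey idea, a quick way to see which of the SNMP Perl modules named in this thread are present on a given machine is to try loading each one. This is a minimal sketch; `have_module` is a helper made up here, and the module list is just the three names mentioned above.

```perl
use strict;
use warnings;

# Return 1 if the named module can be loaded on this system, else 0.
sub have_module {
    my ($name) = @_;
    (my $file = "$name.pm") =~ s{::}{/}g;   # Net::SNMP -> Net/SNMP.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('SNMP', 'Net::SNMP', 'SNMP_Session') {
    printf "%-13s %s\n", $module,
        have_module($module) ? "installed" : "missing";
}
```

A monitor could use the same check to fail with a clear message when its SNMP module is absent, rather than dying at compile time on the `use` line.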
From: Todd L. <tl...@iv...> - 2005-12-21 17:58:43
|
On Sun, Dec 18, 2005 at 01:10:05AM -0500, Ed Ravin wrote:
>the version you sent). See attached diffs - I fixed a couple of things
>that turned up with "-w", typos in the names for the old MIB, some comments,
>and a small sample of coding style things. Oh, and added an env
>var for the community, every Mon script that uses communities should
>support that (to keep the community name from turning up in the
>mon.cgi details).

Incorporated all your fixes.

>Also, though it's not in the patch, I moved the duplicated array
>declarations to the top next to the other globals, and it worked fine.
>I'm using Perl 5.6.1, don't know what you have.

<sigh> Yes, I wasn't paying attention and I had put the array declaration after the list() function call. Doesn't take a genius to figure out why *THAT* didn't work.

>You can provide a label argument to the 'last' statement that points
>to where you want to go.

Still as a TODO.

>I don't like "Rebuilding: 0%" as a status output - I first thought that
>the filer was rebuilding the RAID and it's so slow it hasn't even gotten
>to 1% yet. You should only add the "Rebuilding" tag if the status shows
>that the filer is reconstructing the volume.

Done. It shows "normal" now when the value is zero.

On Sun, Dec 18, 2005 at 11:56:58AM -0500, Ed Ravin wrote:
>filer    ONTAP    Volume Name       Vol State  Vol Status
>---------------------------------------------------------------------------
>trantor  6.1.2R3  parity disk 8.30  active     Rebuilding: 0%
>trantor  6.1.2R3  data disk 8.31    active     Rebuilding: 0%
>trantor  6.1.2R3  data disk 8.23    reconstru  Rebuilding: 82%
<snip>
>BTW, the filer is configured as two volumes, each with their own parity disk,
>so it looks like the code needs to learn a bit more. Here's the MIB dump
>that shows the different volumes:
>
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.1 = 1
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.2 = 1
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.3 = 1
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.4 = 1
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.5 = 1
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.6 = 1
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.1 = 2
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.2 = 2
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.3 = 2
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.4 = 2
>enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.5 = 2

I think that was the reason why I originally used raidVPlexName instead of raidVDiskName, but both pieces of information are uniquely useful in their own right. A bit more processing could possibly make it so that it prints out a single line for a non-critical array, and only the lines of the reconstructing/failed/offline drives when there is a critical array. That would mean that I need to query both and do a bit of processing first.

On Sun, Dec 18, 2005 at 12:38:41PM -0500, Ed Ravin wrote:
>I hurriedly put netappraidstat.monitor into production and noticed that
>it didn't support this filer:
> NetApp Release 5.2.3P1: Wed Jan 12 11:15:32 PST 2000

Yes, it stops checking below 6.0. If the raidVTable exists in that MIB, then it could be added to the regex that checks the version numbers. But reading on...

>And the filer helpfully had an empty sysDescr string, further confusing
>things. The problem is that it didn't support raidVVol - this filer is

If we can't get the ONTAP version, then we can't use it. So I'd say that we should not try to make it use that old a version (though arguably that old a version is the one most in need of monitoring).

>so old, it only has one volume, thus no MIB entry for it. No big deal,
>that filer should be in a museum and we're decommissioning it soon.
>However, I did notice this fallout from adding -w:
> $ ./netappraidstat.monitor --forceold nonexistent
> Use of uninitialized value in concatenation (.) or string at
> ./netappraidstat.monitor line 96.
> 96 push (@ERRS, "could not create session to $host: " . $SNMP::Session::ErrorStr);

Fixed.

Ed, thanks for the testing and suggestions. This version has your fixes in it. When you feel this is suitable for production, let me know and we'll submit it to Jim for inclusion into cvs--it's not currently in HEAD as far as I can see, and I can think of no reason why it would have been put in the 1.0 branch.

Please reference the attached script (not a patch, it's the full script).

--
Regards... Todd
OS X: We've been fighting the "It's a mac" syndrome with upper management for years now. Lately we've taken to just referring to new mac installations as "Unix" installations when presenting proposals and updates. For some reason, they have no problem with that. -- /.
Linux kernel 2.6.12-12mdksmp 3 users, load average: 0.13, 0.09, 0.09 |
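The output shape proposed in this message (one line for a healthy array, only the problem drives for a critical one, and a "Rebuilding" tag only while reconstructing) could look roughly like the sketch below. It is not the attached script: the row layout with `filer`, `disk`, `state` and `pct` keys and the `summarize_array` helper are hypothetical, and only the `active` and `reconstruct` state strings are taken from the output shown in the thread.

```perl
use strict;
use warnings;

# Each row is a hash with hypothetical keys: filer, disk, state, pct.
# Returns the lines to print for one array.
sub summarize_array {
    my (@rows) = @_;
    return () unless @rows;

    # Anything that is not "active" counts as a problem drive.
    my @bad = grep { $_->{state} ne 'active' } @rows;
    return sprintf("%s: all %d disks normal", $rows[0]{filer}, scalar @rows)
        unless @bad;

    return map {
        # Only a reconstructing disk gets the "Rebuilding" tag.
        my $status = $_->{state} eq 'reconstruct'
            ? "Rebuilding: $_->{pct}%"
            : $_->{state};
        "$_->{filer}: disk $_->{disk} $status";
    } @bad;
}

my @array = (
    { filer => 'toaster', disk => '8.30', state => 'active',      pct => 0  },
    { filer => 'toaster', disk => '8.23', state => 'reconstruct', pct => 82 },
);
print "$_\n" for summarize_array(@array);
# prints: toaster: disk 8.23 Rebuilding: 82%
```

Keeping the healthy case to a single stable line also avoids the worry raised elsewhere in the thread about output that changes on every poll.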
From: Ed R. <er...@pa...> - 2005-12-19 22:54:54
|
On Mon, Dec 19, 2005 at 01:55:17PM -0800, Todd Lyons wrote:
...
> do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT);
>
> The hashref $sref->{"_upalertoutput"} is not defined anywhere in the
> code that I can find. It seems that there should be something like:
>
> $sref->{"_upalertoutput"} = $output;

I reported this when I first discovered it several months ago, and again this past October 12 to the mon list when folks were talking about releasing Mon 1.1 - I guess you weren't subscribed yet. Here's my fix, unsurprisingly similar to yours:

@@ -3295,6 +3296,8 @@
 (!defined($sref->{"upalertafter"})
 || (($tmnow - $sref->{"_first_failure"}) >= $sref->{"upalertafter"}))))
 {
+ # Save the last failing monitor's output for posterity
+ $sref->{"_upalertoutput"}= $sref->{"_last_output"};
 do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT);
 }

I don't remember if this made it into CVS. I spent a bit of time reading the code and thinking about where to put the one-line fix, and it's been working flawlessly in production here ever since, so you might want to use my patch instead.

-- Ed

----------

> The following patch works for me, but without a deeper understanding of
> the code, I like to have someone with mon internals experience tell me
> if I'm doing it the right way, or if it should just be passing $output
> to the do_alert() function, or if there is something else going on that
> I'm not seeing.
>
> --- /usr/sbin/mon.orig 2005-12-19 13:51:41.000000000 -0800
> +++ /usr/sbin/mon 2005-12-19 13:47:22.000000000 -0800
> @@ -3332,6 +3332,8 @@
>  my $old_status = $sref->{"_op_status"};
>  set_op_status ($group, $service, $STAT_OK);
>
> + $sref->{"_upalertoutput"} = $output;
> +
>  if ($type eq "t")
>  {
>  $sref->{"_last_uptrap"} = $tmnow;
> @@ -3350,6 +3352,7 @@
>  || (($tmnow - $sref->{"_first_failure"}) >= $sref->{"upalertafter"}))))
>  {
>  do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT);
> + do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT);
>  }
>
>  #
>
> -- |
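The effect of saving the last failing output before the upalert goes out can be seen in a toy model. This is not mon's `process_event`: `process_result`, `do_alert_stub`, the `@sent` list and the `_failing` flag are stand-ins invented for this sketch, and only the `_last_output` and `_upalertoutput` keys come from the patches in this thread.

```perl
use strict;
use warnings;

my @sent;   # alerts "sent" by this toy model

# Stand-in for do_alert: just records what would have been sent.
sub do_alert_stub {
    my ($kind, $output) = @_;
    push @sent, "$kind: $output";
}

# Toy stand-in for the part of process_event discussed above.
sub process_result {
    my ($sref, $exitval, $output) = @_;
    if ($exitval != 0) {
        # Failure: remember the monitor's output and alert on it.
        $sref->{_last_output} = $output;
        $sref->{_failing}     = 1;
        do_alert_stub('alert', $output);
    }
    elsif ($sref->{_failing}) {
        # Recovery: save the last failing output so the upalert is not empty.
        $sref->{_upalertoutput} = $sref->{_last_output};
        $sref->{_failing}       = 0;
        do_alert_stub('upalert', $sref->{_upalertoutput});
    }
}

my %svc;
process_result(\%svc, 1, "web1");   # monitor fails, reporting web1
process_result(\%svc, 0, "");       # monitor succeeds with empty output
print "$_\n" for @sent;
# prints:
# alert: web1
# upalert: web1
```

Using the successful run's own `$output` instead would produce an upalert with nothing in it whenever the monitor prints nothing on success, which is the usual case.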
From: Todd L. <tl...@iv...> - 2005-12-19 21:55:27
|
Hi, I've been tracking down an empty upalert problem and think I've found it. I have a question though. This is using cvs 1.1.0-pre2 of mon/mon.

In sub process_event(), if the monitor script exits with any value, it sends an alert on line 3250:

do_alert ($group, $service, $output, $exitval, $FL_MONITOR);

If the script exits with an exit value of 0, then it sends an alert on line 3315:

do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT);

The hashref $sref->{"_upalertoutput"} is not defined anywhere in the code that I can find. It seems that there should be something like:

$sref->{"_upalertoutput"} = $output;

The following patch works for me, but without a deeper understanding of the code, I like to have someone with mon internals experience tell me if I'm doing it the right way, or if it should just be passing $output to the do_alert() function, or if there is something else going on that I'm not seeing.

--- /usr/sbin/mon.orig 2005-12-19 13:51:41.000000000 -0800
+++ /usr/sbin/mon 2005-12-19 13:47:22.000000000 -0800
@@ -3332,6 +3332,8 @@
 my $old_status = $sref->{"_op_status"};
 set_op_status ($group, $service, $STAT_OK);

+ $sref->{"_upalertoutput"} = $output;
+
 if ($type eq "t")
 {
 $sref->{"_last_uptrap"} = $tmnow;
@@ -3350,6 +3352,7 @@
 || (($tmnow - $sref->{"_first_failure"}) >= $sref->{"upalertafter"}))))
 {
 do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT);
+ do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT);
 }

 #

--
Regards... Todd
we're off on the usual strange tangents. next will be whether it is ethical to walk in your neighbor's open house if they're running ipv6:-). --Randy Bush
Linux kernel 2.6.12-12mdksmp 2 users, load average: 1.57, 1.42, 1.30 |
From: Ed R. <er...@pa...> - 2005-12-18 17:38:47
|
On Sun, Dec 18, 2005 at 11:56:58AM -0500, Ed Ravin wrote:
> One of our filers noticed that I was testing the RAID status and
> helpfully failed a disk and began rebuilding. I now get these
> statuses:
>
> $ ./netappraidstat.monitor --config /etc/mon/netappfree.cf toaster

I hurriedly put netappraidstat.monitor into production and noticed that it didn't support this filer:

NetApp Release 5.2.3P1: Wed Jan 12 11:15:32 PST 2000

And the filer helpfully had an empty sysDescr string, further confusing things. The problem is that it didn't support raidVVol - this filer is so old, it only has one volume, thus no MIB entry for it. No big deal, that filer should be in a museum and we're decommissioning it soon.

However, I did notice this fallout from adding -w:

$ ./netappraidstat.monitor --forceold nonexistent
Use of uninitialized value in concatenation (.) or string at ./netappraidstat.monitor line 96.
nonexistent

91 if (!defined($s = new SNMP::Session (DestHost => $host,
92 Timeout => $TIMEOUT, Community => $COMM,
93 Retries => $RETRIES, Version => $SNMPVERSION))) {
94 $RET = ($RET == 1) ? 1 : 2;
95 $HOSTS{$host} ++;
96 push (@ERRS, "could not create session to $host: " . $SNMP::Session::ErrorStr);
97 next;
98 }

Looking at the code of SNMP.pm, it looks like ErrorStr is not defined if new() fails - other mon scripts I've looked at don't even try to get ErrorStr at this stage. |
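One way to keep the error message on line 96 without the "uninitialized value" warning is to append the error string only when it is actually set. The helper below is a sketch of that idea, not the fix that went into the script; `session_error` is a name made up here, and it takes the error string as an argument so the example does not depend on the SNMP module.

```perl
use strict;
use warnings;

# Build the session-failure message without tripping "uninitialized
# value" warnings when the library leaves its error string unset.
sub session_error {
    my ($host, $errstr) = @_;
    my $detail = (defined $errstr && length $errstr) ? ": $errstr" : "";
    return "could not create session to $host$detail";
}

# In the monitor this would be called roughly as:
#   push (@ERRS, session_error($host, $SNMP::Session::ErrorStr));
print session_error("nonexistent", undef), "\n";
# prints: could not create session to nonexistent
print session_error("nonexistent", "Unknown host"), "\n";
# prints: could not create session to nonexistent: Unknown host
```

The explicit `defined` test is used rather than the `//` operator so the code also runs on older Perls such as the 5.6.1 mentioned in this thread.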
From: Ed R. <er...@pa...> - 2005-12-18 16:57:08
|
One of our filers noticed that I was testing the RAID status and helpfully failed a disk and began rebuilding. I now get these statuses:

$ ./netappraidstat.monitor --config /etc/mon/netappfree.cf toaster
toaster
toaster is reconstruct, status: 'Rebuilding: 82%'

And the --list option shows this:

filer    ONTAP    Volume Name       Vol State  Vol Status
---------------------------------------------------------------------------
trantor  6.1.2R3  parity disk 8.30  active     Rebuilding: 0%
trantor  6.1.2R3  data disk 8.31    active     Rebuilding: 0%
trantor  6.1.2R3  data disk 8.23    reconstru  Rebuilding: 82%
trantor  6.1.2R3  data disk 8.21    active     Rebuilding: 0%
trantor  6.1.2R3  data disk 8.22    active     Rebuilding: 0%
trantor  6.1.2R3  data disk 8.24    active     Rebuilding: 0%
trantor  6.1.2R3  parity disk 8.26  active     Rebuilding: 0%
trantor  6.1.2R3  data disk 8.25    active     Rebuilding: 0%
trantor  6.1.2R3  data disk 8.27    active     Rebuilding: 0%
trantor  6.1.2R3  data disk 8.28    active     Rebuilding: 0%
trantor  6.1.2R3  data disk 8.29    active     Rebuilding: 0%

BTW, the filer is configured as two volumes, each with their own parity disk, so it looks like the code needs to learn a bit more. Here's the MIB dump that shows the different volumes:

enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.1 = 1
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.2 = 1
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.3 = 1
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.4 = 1
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.5 = 1
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.1.6 = 1
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.1 = 2
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.2 = 2
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.3 = 2
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.4 = 2
enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup.1.2.5 = 2

-----------------
[output below from my previous note, when the filer was in normal status]

> Here's the output of --list on my old NetAPP:
> $ ./netappraidstat.monitor --list --config /etc/mon/netappfree.cf toaster
> filer    ONTAP    Volume Name       Vol State  Vol Status
> ---------------------------------------------------------------------------
> toaster  6.1.2R3  parity disk 8.30  active     Rebuilding: 0%
> toaster  6.1.2R3  data disk 8.31    active     Rebuilding: 0%
> toaster  6.1.2R3  data disk 8.23    active     Rebuilding: 0%
> toaster  6.1.2R3  data disk 8.21    active     Rebuilding: 0%
> toaster  6.1.2R3  data disk 8.22    active     Rebuilding: 0%
> toaster  6.1.2R3  data disk 8.24    active     Rebuilding: 0%
> toaster  6.1.2R3  parity disk 8.26  active     Rebuilding: 0%
> toaster  6.1.2R3  data disk 8.25    active     Rebuilding: 0%
> toaster  6.1.2R3  data disk 8.27    active     Rebuilding: 0%
> toaster  6.1.2R3  data disk 8.28    active     Rebuilding: 0%
> toaster  6.1.2R3  data disk 8.29    active     Rebuilding: 0%
> |
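If the monitor is to report per group rather than per drive, the first step is to bucket the table rows by their raidVGroup value. The sketch below does that for text in the "OID = value" form shown in the dump above; `rows_per_group` is a hypothetical helper, and it deliberately makes no assumption about what the index components in the OID mean.

```perl
use strict;
use warnings;

# Count table rows per raidVGroup value from "OID = value" text lines.
sub rows_per_group {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        # Capture the value after "=", ignoring lines for other columns.
        next unless $line =~ /\braidVGroup\.[\d.]+\s*=\s*(\d+)\s*$/;
        $count{$1}++;
    }
    return %count;
}

my $prefix = "enterprises.netapp.netapp1.raid.raidVTable.raidVEntry.raidVGroup";
my @dump = (
    "$prefix.1.1.1 = 1",
    "$prefix.1.1.2 = 1",
    "$prefix.1.2.1 = 2",
);
my %count = rows_per_group(@dump);
print "group $_: $count{$_} rows\n" for sort { $a <=> $b } keys %count;
# prints:
# group 1: 2 rows
# group 2: 1 rows
```

Run against the full dump in this message it would report 6 rows in group 1 and 5 in group 2, which adds up to the 11 drives in the --list output.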
From: Ed R. <er...@pa...> - 2005-12-18 06:10:08
|
On Thu, Dec 15, 2005 at 01:31:58PM -0800, Todd Lyons wrote:
> For ONTAP 6.0 and 6.1, I can generate the appropriate values from the
> raidVTable with the same Volume Status limitation, but it prints a line
> out for every drive. Someone with a real 6.0 or 6.1 ONTAP system
> (<cough> Ed <cough>) ought to be able to figure out if the fields that
> I'm using will be duplicated across each drive.

That's if he can get the script working. Which he could, but it needed
a bit of hacking (I don't see how the "list" option worked at all in the
version you sent). See attached diffs - I fixed a couple of things that
turned up with "-w", typos in the names for the old MIB, some comments,
and a small sample of coding style things. Oh, and added an env var
for the community; every Mon script that uses communities should support
that (to keep the community name from turning up in the mon.cgi details).

Also, though it's not in the patch, I moved the duplicated array
declarations to the top next to the other globals, and it worked fine.
I'm using Perl 5.6.1, don't know what you have.

> If yes, then you only
> keep one line by doing a 'next' or 'last' or set a flag to get out of
> the foreach loop. I think that there will need to be some trickery to
> get out of both the inner while and the outer foreach loops though.

You can provide a label argument to the 'last' statement that points
to where you want to go.

> The new script is attached. [...]

I don't like "Rebuilding: 0%" as a status output - I first thought that
the filer was rebuilding the RAID and it's so slow it hasn't even gotten
to 1% yet. You should only add the "Rebuilding" tag if the status shows
that the filer is reconstructing the volume.

Here's the output of --list on my old NetAPP:

$ ./netappraidstat.monitor --list --config /etc/mon/netappfree.cf toaster
filer    ONTAP    Volume Name       Vol State  Vol Status
---------------------------------------------------------------------------
toaster  6.1.2R3  parity disk 8.30  active     Rebuilding: 0%
toaster  6.1.2R3  data disk 8.31    active     Rebuilding: 0%
toaster  6.1.2R3  data disk 8.23    active     Rebuilding: 0%
toaster  6.1.2R3  data disk 8.21    active     Rebuilding: 0%
toaster  6.1.2R3  data disk 8.22    active     Rebuilding: 0%
toaster  6.1.2R3  data disk 8.24    active     Rebuilding: 0%
toaster  6.1.2R3  parity disk 8.26  active     Rebuilding: 0%
toaster  6.1.2R3  data disk 8.25    active     Rebuilding: 0%
toaster  6.1.2R3  data disk 8.27    active     Rebuilding: 0%
toaster  6.1.2R3  data disk 8.28    active     Rebuilding: 0%
toaster  6.1.2R3  data disk 8.29    active     Rebuilding: 0%
|
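A minimal sketch of the labelled 'last' suggested above, for leaving an inner while and an outer foreach in one step. The sub name and the data layout (a hash mapping each filer to a list of per-disk state strings) are assumptions made for illustration; they are not taken from netappraidstat.monitor.

```perl
use strict;
use warnings;

# Return "filer: state" for the first disk that is not 'active',
# or undef if every disk on every filer is active.
# $rows_by_filer is a hypothetical structure: { filer => [ states ] }.
sub first_problem {
    my ($rows_by_filer) = @_;
    my $found;

    FILER:
    foreach my $filer (sort keys %$rows_by_filer) {
        my @states = @{ $rows_by_filer->{$filer} };
        while (@states) {
            my $state = shift @states;
            next if $state eq 'active';   # unlabelled: next disk, same filer
            $found = "$filer: $state";
            last FILER;                   # labelled: leaves while AND foreach
        }
    }
    return $found;
}

my %rows = (
    netapp1 => [ 'active', 'active' ],
    netapp2 => [ 'active', 'failed', 'active' ],
    netapp3 => [ 'failed' ],
);
my $problem = first_problem(\%rows);
print defined $problem ? "$problem\n" : "no problems\n";
```

Using 'next FILER' in place of 'last FILER' would instead skip the rest of the current filer's disks and carry on with the next filer, which is closer to the "keep one line per filer" case.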
From: Todd L. <tl...@iv...> - 2005-12-16 17:30:37
|
Ed Ravin wanted us to know:
>> >Doesn't work for me - "volIndex" is not in my copy of the NetAPP MIB:
>> > -- Version 1.5, May 2000
>> >I have the raidVTable and the deprecated raidTable in that MIB, but nothing
>> >in the "vol" group. We're using ONTAP 6.1.2R3 (yes, I know).
>> ...a couple hours of coding and testing...
>> For ONTAP 6.2 and 6.3, I can generate the appropriate values with the
>> slight alteration of the Volume Status showing the Rebuild percent. It
>> will show 0% in normal operation.

Is that a reasonable thing to do? I could add some logic that prints
"no problems" or similar if the value is "Rebuilding: 0%". I'm unsure
whether this output changing as the rebuild percentage increments will
cause an alert for each change. I'm not a mon guru, I just started using
it last week, so I'm not real solid yet on the alert and upalert
functionality.

I'm doing testing right now to see if I can have multiple alerts and
upalerts per service. It seems like it should be able to do it, but I'm
only getting the email an alert generates, not the pages. It could be
other things though, so I have to work through that first.

>> For ONTAP 6.0 and 6.1, I can generate the appropriate values from the
>> raidVTable with the same Volume Status limitation, but it prints a line
>> out for every drive. Someone with a real 6.0 or 6.1 ONTAP system
>> (<cough> Ed <cough>) ought to be able to figure out if the fields that
>> I'm using will be duplicated across each drive.
>Todd, thanks for all the coding! I'm in the middle of a heavy operation
>at work, won't be able to get to this until late next week, will report
>back then.

/me stamps foot... Just kidding. Yes, I have a vested interest in
making this work as well.

>
>On a related note, is anyone monitoring their NetApp fans and other
>environmental info via SNMP?

I'm not, but I'd be willing to bet that another script could be modeled
after this one and do the same checks and output.

I'm also wondering if it would just be better to have a single netapp
script where you can specify which functions you want monitored. Just a
thought...
--
Regards... Todd
we're off on the usual strange tangents. next will be whether it is
ethical to walk in your neighbor's open house if they're running
ipv6:-). --Randy Bush
Linux kernel 2.6.12-12mdksmp 2 users, load average: 0.18, 0.09, 0.17 |
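One way the "Vol Status" wording discussed in this thread could be handled, as a sketch only: print the "Rebuilding" tag solely while a reconstruction is in progress, and "no problems" otherwise. The sub name, its two parameters and the /reconstruct/i test on the state label are all assumptions for illustration; the real script may derive its status differently.

```perl
use strict;
use warnings;

# Build the status column from a state label and a rebuild percentage.
# Both arguments are hypothetical; the state is assumed to be a text
# label that mentions "reconstruct" while a rebuild is running.
sub vol_status {
    my ($state, $percent) = @_;
    return "Rebuilding: $percent%" if $state =~ /reconstruct/i;
    return 'no problems';
}

print vol_status('active', 0), "\n";
print vol_status('reconstructionInProgress', 37), "\n";
```

With this shape a healthy filer never shows "Rebuilding: 0%", so the percentage only appears in output when it carries information.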
From: Ed R. <er...@pa...> - 2005-12-16 00:09:36
|
On Thu, Dec 15, 2005 at 01:31:58PM -0800, Todd Lyons wrote:
> Ed Ravin wanted us to know:
>
> >Doesn't work for me - "volIndex" is not in my copy of the NetAPP MIB:
> > -- Version 1.5, May 2000
> >I have the raidVTable and the deprecated raidTable in that MIB, but nothing
> >in the "vol" group. We're using ONTAP 6.1.2R3 (yes, I know).
...
> ...a couple hours of coding and testing...
>
> For ONTAP 6.2 and 6.3, I can generate the appropriate values with the
> slight alteration of the Volume Status showing the Rebuild percent. It
> will show 0% in normal operation.
>
> For ONTAP 6.0 and 6.1, I can generate the appropriate values from the
> raidVTable with the same Volume Status limitation, but it prints a line
> out for every drive. Someone with a real 6.0 or 6.1 ONTAP system
> (<cough> Ed <cough>) ought to be able to figure out if the fields that
> I'm using will be duplicated across each drive.

Todd, thanks for all the coding! I'm in the middle of a heavy operation
at work, won't be able to get to this until late next week, will report
back then.

On a related note, is anyone monitoring their NetApp fans and other
environmental info via SNMP?

-- Ed |
From: Todd L. <tl...@iv...> - 2005-12-15 21:32:14
|
Ed Ravin wanted us to know:
>Doesn't work for me - "volIndex" is not in my copy of the NetAPP MIB:
> -- Version 1.5, May 2000
>I have the raidVTable and the deprecated raidTable in that MIB, but nothing
>in the "vol" group. We're using ONTAP 6.1.2R3 (yes, I know).

Looking in the various NetApp MIBs, I see the following:
1) volTable exists for ONTAP 6.4 through 7.0, which is what this script
   was written for.
2) plexTable exists for ONTAP 6.2 through 7.0, which seems to give
   similarly useful information.
3) raidVTable exists for ONTAP 6.0 through 7.0; the output could be
   mostly generated from that as well.

...a couple hours of coding and testing...

For ONTAP 6.2 and 6.3, I can generate the appropriate values with the
slight alteration of the Volume Status showing the Rebuild percent. It
will show 0% in normal operation.

For ONTAP 6.0 and 6.1, I can generate the appropriate values from the
raidVTable with the same Volume Status limitation, but it prints a line
out for every drive. Someone with a real 6.0 or 6.1 ONTAP system
(<cough> Ed <cough>) ought to be able to figure out if the fields that
I'm using will be duplicated across each drive. If yes, then you only
keep one line by doing a 'next' or 'last' or setting a flag to get out of
the foreach loop. I think that there will need to be some trickery to
get out of both the inner while and the outer foreach loops though.

The new script is attached. There are some new commandline options
available. The --forceold forces it to use the old raidVTable MIB. The
--forceplex forces it to use the plexTable MIB. Not using either option
allows it to auto-detect which version of ONTAP it's connecting to and
use the appropriate one (volTable, plexTable, and raidVTable, preference
is in that order). There is also a --debug option which will print out
the same info as if there were a problem detected.
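The table preference described above (volTable, then plexTable, then raidVTable, with the two force options overriding it) can be sketched roughly as follows. This is not the attached script: the sub name, the way the version string and options are passed in, the precedence of --forceold over --forceplex, and the treatment of releases newer than 7.0 are all assumptions. Only the version ranges come from the list above.

```perl
use strict;
use warnings;

# Choose which SNMP table to walk for a given ONTAP version string.
# %opt carries hypothetical 'forceold' / 'forceplex' flags.
sub pick_table {
    my ($version, %opt) = @_;
    return 'raidVTable' if $opt{forceold};    # assumed to win over forceplex
    return 'plexTable'  if $opt{forceplex};

    my ($major, $minor) = $version =~ /^(\d+)\.(\d+)/
        or die "cannot parse ONTAP version '$version'\n";

    # volTable: 6.4 and later (anything past 7.0 is an assumption)
    return 'volTable'  if $major > 6 || ($major == 6 && $minor >= 4);
    # plexTable: 6.2 and 6.3
    return 'plexTable' if $major == 6 && $minor >= 2;
    # raidVTable: 6.0 and 6.1
    return 'raidVTable';
}

print "$_ -> ", pick_table($_), "\n" for qw(6.1.2R3 6.2 6.5 7.0);
```

Keeping the force options as early returns means they work the same way in normal and list mode, since both modes would call the same selection routine.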

The force* commandline options are intended more for troubleshooting
than normal operation, but they are available in both modes (normal and
list). If need be, some documentation can be added in the comment
section at the top of the script to explain them.

Here's the output of the latest incarnation of the script for various
methods of being called:

admin51 mon.d # ./netappraidstat.monitor netapp1 netapp2
admin51 mon.d # ./netappraidstat.monitor --forceplex netapp1 netapp2
admin51 mon.d # ./netappraidstat.monitor --forceold netapp1 netapp2
admin51 mon.d # ./netappraidstat.monitor netapp1 netapp2 --debug
netapp1 netapp2
netapp1 is online, status: 'raid4'
netapp2 is online, status: 'raid4'
admin51 mon.d # ./netappraidstat.monitor netapp1 netapp2 --debug --forceplex
netapp1 netapp2
netapp1 is active, status: 'Rebuilding: 0%'
netapp2 is active, status: 'Rebuilding: 0%'
admin51 mon.d # ./netappraidstat.monitor --list netapp1 netapp2
filer    ONTAP  Volume Name  Vol State  Vol Status
---------------------------------------------------------------------------
netapp1  6.5    vol0         online     raid4
netapp2  6.5    vol0         online     raid4
admin51 mon.d # ./netappraidstat.monitor --list --forceplex netapp1 netapp2
filer    ONTAP  Volume Name  Vol State  Vol Status
---------------------------------------------------------------------------
netapp1  6.5    vol0         active     Rebuilding: 0%
netapp2  6.5    vol0         active     Rebuilding: 0%
admin51 mon.d # ./netappraidstat.monitor --list --forceold netapp1 netapp2
filer    ONTAP  Volume Name  Vol State  Vol Status
---------------------------------------------------------------------------
netapp1  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp1  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp1  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp1  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp1  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp1  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%
netapp2  6.5    /vol0/plex0  active     Rebuilding: 0%

>I can hack it to use the raidVTable MIB, and it gives some useful info,
>but I don't think I'm able to properly test it.

For reference, here's the volTable output, useful for ONTAP 6.4, 6.5 and 7.0:

[todd@tlyons ~]$ /usr/bin/snmpwalk -c webBuilder -v 1 -m /home/todd/netapp/mib_6.5/netapp.mib.txt netapp1 volTable
NETWORK-APPLIANCE-MIB::volIndex.1 = INTEGER: 1
NETWORK-APPLIANCE-MIB::volName.1 = STRING: "vol0"
NETWORK-APPLIANCE-MIB::volFSID.1 = STRING: "3120846801"
NETWORK-APPLIANCE-MIB::volOwningHost.1 = INTEGER: local(1)
NETWORK-APPLIANCE-MIB::volState.1 = STRING: "online"
NETWORK-APPLIANCE-MIB::volStatus.1 = STRING: "raid4"
NETWORK-APPLIANCE-MIB::volOptions.1 = STRING: "root, diskroot, nosnap=off, nosnapdir=off, minra=off, no_atime_update=off, raidtype=raid4, raidsize=8, nvfail=off, snapmirrored=off, resyncsnaptime=60, create_ucode=on, convert_ucode=off, maxdirsize=10470, fs_size_fixed=off, create_reserved=off"
NETWORK-APPLIANCE-MIB::volUUID.1 = STRING: "censored"

And here's the plexTable output, useful for ONTAP 6.2 and 6.3:

[todd@tlyons ~/netapp]$ /usr/bin/snmpwalk -c webBuilder -v 1 -m /home/todd/netapp/mib_6.5/netapp.mib.txt netapp1 plexTable
NETWORK-APPLIANCE-MIB::plexIndex.1 = INTEGER: 1
NETWORK-APPLIANCE-MIB::plexName.1 = STRING: "/vol0/plex0"
NETWORK-APPLIANCE-MIB::plexVolName.1 = STRING: "vol0"
NETWORK-APPLIANCE-MIB::plexStatus.1 = INTEGER: online(3)
NETWORK-APPLIANCE-MIB::plexPercentResyncing.1 = INTEGER: 0

For the name, I didn't know if I should use plexName or plexVolName, so
I used plexVolName. Feel free to advise me one way or the other on that.

And here's the raidVTable output, useful for ONTAP 6.0 and 6.1:

[todd@tlyons ~/netapp]$ /usr/bin/snmpwalk -c webBuilder -v 1 -m /home/todd/netapp/mib_6.5/netapp.mib.txt netapp0 raidVTable
NETWORK-APPLIANCE-MIB::raidVIndex.1.1.1 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVIndex.1.1.2 = INTEGER: 2
NETWORK-APPLIANCE-MIB::raidVIndex.1.1.3 = INTEGER: 3
NETWORK-APPLIANCE-MIB::raidVIndex.1.1.4 = INTEGER: 4
NETWORK-APPLIANCE-MIB::raidVIndex.1.1.5 = INTEGER: 5
NETWORK-APPLIANCE-MIB::raidVIndex.1.1.6 = INTEGER: 6
NETWORK-APPLIANCE-MIB::raidVDiskName.1.1.1 = STRING: "data disk 7.4"
NETWORK-APPLIANCE-MIB::raidVDiskName.1.1.2 = STRING: "data disk 7.1"
NETWORK-APPLIANCE-MIB::raidVDiskName.1.1.3 = STRING: "parity disk 7.0"
NETWORK-APPLIANCE-MIB::raidVDiskName.1.1.4 = STRING: "data disk 7.2"
NETWORK-APPLIANCE-MIB::raidVDiskName.1.1.5 = STRING: "data disk 7.5"
NETWORK-APPLIANCE-MIB::raidVDiskName.1.1.6 = STRING: "dparity disk 7.6"
NETWORK-APPLIANCE-MIB::raidVStatus.1.1.1 = INTEGER: active(1)
NETWORK-APPLIANCE-MIB::raidVStatus.1.1.2 = INTEGER: active(1)
NETWORK-APPLIANCE-MIB::raidVStatus.1.1.3 = INTEGER: active(1)
NETWORK-APPLIANCE-MIB::raidVStatus.1.1.4 = INTEGER: active(1)
NETWORK-APPLIANCE-MIB::raidVStatus.1.1.5 = INTEGER: active(1)
NETWORK-APPLIANCE-MIB::raidVStatus.1.1.6 = INTEGER: active(1)
NETWORK-APPLIANCE-MIB::raidVDiskId.1.1.1 = INTEGER: 393217
NETWORK-APPLIANCE-MIB::raidVDiskId.1.1.2 = INTEGER: 327681
NETWORK-APPLIANCE-MIB::raidVDiskId.1.1.3 = INTEGER: 262145
NETWORK-APPLIANCE-MIB::raidVDiskId.1.1.4 = INTEGER: 196609
NETWORK-APPLIANCE-MIB::raidVDiskId.1.1.5 = INTEGER: 65537
NETWORK-APPLIANCE-MIB::raidVDiskId.1.1.6 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVScsiAdapter.1.1.1 = STRING: "7"
NETWORK-APPLIANCE-MIB::raidVScsiAdapter.1.1.2 = STRING: "7"
NETWORK-APPLIANCE-MIB::raidVScsiAdapter.1.1.3 = STRING: "7"
NETWORK-APPLIANCE-MIB::raidVScsiAdapter.1.1.4 = STRING: "7"
NETWORK-APPLIANCE-MIB::raidVScsiAdapter.1.1.5 = STRING: "7"
NETWORK-APPLIANCE-MIB::raidVScsiAdapter.1.1.6 = STRING: "7"
NETWORK-APPLIANCE-MIB::raidVScsiId.1.1.1 = INTEGER: 4
NETWORK-APPLIANCE-MIB::raidVScsiId.1.1.2 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVScsiId.1.1.3 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVScsiId.1.1.4 = INTEGER: 2
NETWORK-APPLIANCE-MIB::raidVScsiId.1.1.5 = INTEGER: 5
NETWORK-APPLIANCE-MIB::raidVScsiId.1.1.6 = INTEGER: 6
NETWORK-APPLIANCE-MIB::raidVUsedMb.1.1.1 = INTEGER: 16979
NETWORK-APPLIANCE-MIB::raidVUsedMb.1.1.2 = INTEGER: 16979
NETWORK-APPLIANCE-MIB::raidVUsedMb.1.1.3 = INTEGER: 16979
NETWORK-APPLIANCE-MIB::raidVUsedMb.1.1.4 = INTEGER: 16979
NETWORK-APPLIANCE-MIB::raidVUsedMb.1.1.5 = INTEGER: 16979
NETWORK-APPLIANCE-MIB::raidVUsedMb.1.1.6 = INTEGER: 16979
NETWORK-APPLIANCE-MIB::raidVUsedBlocks.1.1.1 = INTEGER: 34774016
NETWORK-APPLIANCE-MIB::raidVUsedBlocks.1.1.2 = INTEGER: 34774016
NETWORK-APPLIANCE-MIB::raidVUsedBlocks.1.1.3 = INTEGER: 34774016
NETWORK-APPLIANCE-MIB::raidVUsedBlocks.1.1.4 = INTEGER: 34774016
NETWORK-APPLIANCE-MIB::raidVUsedBlocks.1.1.5 = INTEGER: 34774016
NETWORK-APPLIANCE-MIB::raidVUsedBlocks.1.1.6 = INTEGER: 34774016
NETWORK-APPLIANCE-MIB::raidVTotalMb.1.1.1 = INTEGER: 17366
NETWORK-APPLIANCE-MIB::raidVTotalMb.1.1.2 = INTEGER: 17366
NETWORK-APPLIANCE-MIB::raidVTotalMb.1.1.3 = INTEGER: 17366
NETWORK-APPLIANCE-MIB::raidVTotalMb.1.1.4 = INTEGER: 17366
NETWORK-APPLIANCE-MIB::raidVTotalMb.1.1.5 = INTEGER: 17366
NETWORK-APPLIANCE-MIB::raidVTotalMb.1.1.6 = INTEGER: 17366
NETWORK-APPLIANCE-MIB::raidVTotalBlocks.1.1.1 = INTEGER: 35566480
NETWORK-APPLIANCE-MIB::raidVTotalBlocks.1.1.2 = INTEGER: 35566480
NETWORK-APPLIANCE-MIB::raidVTotalBlocks.1.1.3 = INTEGER: 35566480
NETWORK-APPLIANCE-MIB::raidVTotalBlocks.1.1.4 = INTEGER: 35566480
NETWORK-APPLIANCE-MIB::raidVTotalBlocks.1.1.5 = INTEGER: 35566480
NETWORK-APPLIANCE-MIB::raidVTotalBlocks.1.1.6 = INTEGER: 35566480
NETWORK-APPLIANCE-MIB::raidVCompletionPerCent.1.1.1 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVCompletionPerCent.1.1.2 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVCompletionPerCent.1.1.3 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVCompletionPerCent.1.1.4 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVCompletionPerCent.1.1.5 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVCompletionPerCent.1.1.6 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVVol.1.1.1 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVVol.1.1.2 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVVol.1.1.3 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVVol.1.1.4 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVVol.1.1.5 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVVol.1.1.6 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroup.1.1.1 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroup.1.1.2 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroup.1.1.3 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroup.1.1.4 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroup.1.1.5 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroup.1.1.6 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVDiskNumber.1.1.1 = INTEGER: 6
NETWORK-APPLIANCE-MIB::raidVDiskNumber.1.1.2 = INTEGER: 6
NETWORK-APPLIANCE-MIB::raidVDiskNumber.1.1.3 = INTEGER: 6
NETWORK-APPLIANCE-MIB::raidVDiskNumber.1.1.4 = INTEGER: 6
NETWORK-APPLIANCE-MIB::raidVDiskNumber.1.1.5 = INTEGER: 6
NETWORK-APPLIANCE-MIB::raidVDiskNumber.1.1.6 = INTEGER: 6
NETWORK-APPLIANCE-MIB::raidVGroupNumber.1.1.1 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroupNumber.1.1.2 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroupNumber.1.1.3 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroupNumber.1.1.4 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroupNumber.1.1.5 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVGroupNumber.1.1.6 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVDiskPort.1.1.1 = INTEGER: portA(1)
NETWORK-APPLIANCE-MIB::raidVDiskPort.1.1.2 = INTEGER: portA(1)
NETWORK-APPLIANCE-MIB::raidVDiskPort.1.1.3 = INTEGER: portA(1)
NETWORK-APPLIANCE-MIB::raidVDiskPort.1.1.4 = INTEGER: portA(1)
NETWORK-APPLIANCE-MIB::raidVDiskPort.1.1.5 = INTEGER: portA(1)
NETWORK-APPLIANCE-MIB::raidVDiskPort.1.1.6 = INTEGER: portA(1)
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskName.1.1.1 = ""
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskName.1.1.2 = ""
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskName.1.1.3 = ""
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskName.1.1.4 = ""
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskName.1.1.5 = ""
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskName.1.1.6 = ""
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskPort.1.1.1 = INTEGER: portNone(4)
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskPort.1.1.2 = INTEGER: portNone(4)
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskPort.1.1.3 = INTEGER: portNone(4)
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskPort.1.1.4 = INTEGER: portNone(4)
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskPort.1.1.5 = INTEGER: portNone(4)
NETWORK-APPLIANCE-MIB::raidVSecondaryDiskPort.1.1.6 = INTEGER: portNone(4)
NETWORK-APPLIANCE-MIB::raidVShelf.1.1.1 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVShelf.1.1.2 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVShelf.1.1.3 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVShelf.1.1.4 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVShelf.1.1.5 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVShelf.1.1.6 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVBay.1.1.1 = INTEGER: 4
NETWORK-APPLIANCE-MIB::raidVBay.1.1.2 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVBay.1.1.3 = INTEGER: 0
NETWORK-APPLIANCE-MIB::raidVBay.1.1.4 = INTEGER: 2
NETWORK-APPLIANCE-MIB::raidVBay.1.1.5 = INTEGER: 5
NETWORK-APPLIANCE-MIB::raidVBay.1.1.6 = INTEGER: 6
NETWORK-APPLIANCE-MIB::raidVPlex.1.1.1 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlex.1.1.2 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlex.1.1.3 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlex.1.1.4 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlex.1.1.5 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlex.1.1.6 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexGroup.1.1.1 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexGroup.1.1.2 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexGroup.1.1.3 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexGroup.1.1.4 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexGroup.1.1.5 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexGroup.1.1.6 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexNumber.1.1.1 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexNumber.1.1.2 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexNumber.1.1.3 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexNumber.1.1.4 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexNumber.1.1.5 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexNumber.1.1.6 = INTEGER: 1
NETWORK-APPLIANCE-MIB::raidVPlexName.1.1.1 = STRING: "/vol0/plex0"
NETWORK-APPLIANCE-MIB::raidVPlexName.1.1.2 = STRING: "/vol0/plex0"
NETWORK-APPLIANCE-MIB::raidVPlexName.1.1.3 = STRING: "/vol0/plex0"
NETWORK-APPLIANCE-MIB::raidVPlexName.1.1.4 = STRING: "/vol0/plex0"
NETWORK-APPLIANCE-MIB::raidVPlexName.1.1.5 = STRING: "/vol0/plex0"
NETWORK-APPLIANCE-MIB::raidVPlexName.1.1.6 = STRING: "/vol0/plex0"
NETWORK-APPLIANCE-MIB::raidVSectorSize.1.1.1 = INTEGER: 512
NETWORK-APPLIANCE-MIB::raidVSectorSize.1.1.2 = INTEGER: 512
NETWORK-APPLIANCE-MIB::raidVSectorSize.1.1.3 = INTEGER: 512
NETWORK-APPLIANCE-MIB::raidVSectorSize.1.1.4 = INTEGER: 512
NETWORK-APPLIANCE-MIB::raidVSectorSize.1.1.5 = INTEGER: 512
NETWORK-APPLIANCE-MIB::raidVSectorSize.1.1.6 = INTEGER: 512
NETWORK-APPLIANCE-MIB::raidVEntry.26.1.1.1 = STRING: "LKE378300000101639R9"
NETWORK-APPLIANCE-MIB::raidVEntry.26.1.1.2 = STRING: "LKJ68162000010162L6L"
NETWORK-APPLIANCE-MIB::raidVEntry.26.1.1.3 = STRING: "LKJ79075000010200C1S"
NETWORK-APPLIANCE-MIB::raidVEntry.26.1.1.4 = STRING: "LKJ780870000101937T6"
NETWORK-APPLIANCE-MIB::raidVEntry.26.1.1.5 = STRING: "LK53810700002923H8E4"
NETWORK-APPLIANCE-MIB::raidVEntry.26.1.1.6 = STRING: "LKJ793720000101937R7"
NETWORK-APPLIANCE-MIB::raidVEntry.27.1.1.1 = STRING: "SEAGATE "
NETWORK-APPLIANCE-MIB::raidVEntry.27.1.1.2 = STRING: "SEAGATE "
NETWORK-APPLIANCE-MIB::raidVEntry.27.1.1.3 = STRING: "SEAGATE "
NETWORK-APPLIANCE-MIB::raidVEntry.27.1.1.4 = STRING: "SEAGATE "
NETWORK-APPLIANCE-MIB::raidVEntry.27.1.1.5 = STRING: "SEAGATE "
NETWORK-APPLIANCE-MIB::raidVEntry.27.1.1.6 = STRING: "SEAGATE "
NETWORK-APPLIANCE-MIB::raidVEntry.28.1.1.1 = STRING: "ST118202FC "
NETWORK-APPLIANCE-MIB::raidVEntry.28.1.1.2 = STRING: "ST118202FC "
NETWORK-APPLIANCE-MIB::raidVEntry.28.1.1.3 = STRING: "ST118202FC "
NETWORK-APPLIANCE-MIB::raidVEntry.28.1.1.4 = STRING: "ST118202FC "
NETWORK-APPLIANCE-MIB::raidVEntry.28.1.1.5 = STRING: "ST118202FC "
NETWORK-APPLIANCE-MIB::raidVEntry.28.1.1.6 = STRING: "ST118202FC "
NETWORK-APPLIANCE-MIB::raidVEntry.29.1.1.1 = STRING: "NA27"
NETWORK-APPLIANCE-MIB::raidVEntry.29.1.1.2 = STRING: "NA27"
NETWORK-APPLIANCE-MIB::raidVEntry.29.1.1.3 = STRING: "NA27"
NETWORK-APPLIANCE-MIB::raidVEntry.29.1.1.4 = STRING: "NA27"
NETWORK-APPLIANCE-MIB::raidVEntry.29.1.1.5 = STRING: "NA27"
NETWORK-APPLIANCE-MIB::raidVEntry.29.1.1.6 = STRING: "NA27"
--
Regards... Todd
We should not be building surveillance technology into standards.
Law enforcement was not supposed to be easy. Where it is easy,
it's called a police state. -- Jeff Schiller on NANOG
Linux kernel 2.6.12-12mdksmp 2 users, load average: 0.00, 0.07, 0.06 |
From: Todd L. <tl...@iv...> - 2005-12-09 15:26:13
|
Ed Ravin wanted us to know:
>Doesn't work for me - "volIndex" is not in my copy of the NetAPP MIB:
> -- Version 1.5, May 2000
>I have the raidVTable and the deprecated raidTable in that MIB, but nothing
>in the "vol" group. We're using ONTAP 6.1.2R3 (yes, I know).

:-)

>I can hack it to use the raidVTable MIB, and it gives some useful info,
>but I don't think I'm able to properly test it.

Our ONTAP is 6.5. It looks like the script will have to do some version
checking and compensating before it can be accepted for general release.
I'll grab the schema from NOW for the various ONTAP versions and dig
through them to see
1) which ones support volIndex (or the volTable in general)
2) if anything else could be used in its stead

Thanks for testing this for me. My hardware selection is very limited :-)
--
Regards... Todd
I've visited conferences where the wireless LAN was deemed "secure" by
the organisation because they had outlawed sniffers. --Neils Bakker
Linux kernel 2.6.12-12mdksmp 2 users, load average: 0.01, 0.01, 0.00 |
From: Ed R. <er...@pa...> - 2005-12-09 05:02:32
|
On Thu, Dec 08, 2005 at 04:00:15PM -0800, Todd Lyons wrote:
> 1) It looks for netappraidstat.cf in one of two places:
> /etc/mon/netappraidstat.cf or /usr/lib/mon/etc/netappraidstat.cf
> The configfile format is one hostname per line.
> 2) It's designed so that if you already have a netappfree.cf file, you
> can just symlink netappraidstat.cf to it and it will work (it ignores
> everything else on the line after the hostname).

Doesn't work for me - "volIndex" is not in my copy of the NetAPP MIB:

-- Version 1.5, May 2000

I have the raidVTable and the deprecated raidTable in that MIB, but nothing
in the "vol" group. We're using ONTAP 6.1.2R3 (yes, I know).

I can hack it to use the raidVTable MIB, and it gives some useful info,
but I don't think I'm able to properly test it. |
From: Todd L. <tl...@iv...> - 2005-12-09 00:00:35
|
Ed Ravin wanted us to know:
>> >Could the following monitor be examined and tested for inclusion in the
>> >monitor repository?
>> sure, we can add it there.
>I will be testing it out in the next couple of days on the NetApps in
>my shop. Do you have any documentation for it, in particular the
>format of the config file?

It's pretty flexible.
1) It looks for netappraidstat.cf in one of two places:
   /etc/mon/netappraidstat.cf or /usr/lib/mon/etc/netappraidstat.cf
   The configfile format is one hostname per line.
2) It's designed so that if you already have a netappfree.cf file, you
   can just symlink netappraidstat.cf to it and it will work (it ignores
   everything else on the line after the hostname).

If more places need to be searched for the configfile, it would be
trivial to add (but I lose my cool little oneliner and turn it into
if {} elsif {} ... not as pretty, but ok :-)
--
Regards... Todd
I've visited conferences where the wireless LAN was deemed "secure" by
the organisation because they had outlawed sniffers. --Neils Bakker
Linux kernel 2.6.12-12mdksmp 2 users, load average: 0.10, 0.09, 0.08 |
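The config handling described in this message (two search locations, one hostname per line, the rest of each line ignored) might look roughly like the sketch below. The sub names, the optional directory-list argument, and the skipping of blank lines and "#" comments are assumptions for illustration, not a copy of the monitor's code.

```perl
use strict;
use warnings;

# Return the first netappraidstat.cf found, or undef. With no
# arguments it searches the two directories named above; passing a
# list of directories is a hypothetical extension point.
sub find_config {
    my @dirs = @_ ? @_ : ('/etc/mon', '/usr/lib/mon/etc');
    my ($file) = grep { -f $_ } map { "$_/netappraidstat.cf" } @dirs;
    return $file;
}

# Read hostnames from an open filehandle: first whitespace-separated
# field of each line, everything after it ignored. Skipping blank
# lines and comments is an assumption.
sub read_hosts {
    my ($fh) = @_;
    my @hosts;
    while (my $line = <$fh>) {
        next if $line =~ /^\s*(?:#|$)/;
        my ($host) = split ' ', $line;
        push @hosts, $host;
    }
    return @hosts;
}

# Demo on an in-memory file; the extra fields are made up.
my $sample = "netapp1 some other fields\nnetapp2\n\n# a comment\n";
open my $fh, '<', \$sample or die "cannot open sample: $!";
print join(',', read_hosts($fh)), "\n";
close $fh;
```

Because only the first field is used, the same reader would accept a file written for another monitor, which is what makes the symlink trick mentioned above possible.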
From: Ed R. <er...@pa...> - 2005-12-08 20:53:18
|
On Thu, Dec 08, 2005 at 03:37:33PM -0500, Jim Trocki wrote:
> On Thu, 8 Dec 2005, Todd Lyons wrote:
>
> >Could the following monitor be examined and tested for inclusion in the
> >monitor repository?
>
> sure, we can add it there.

I will be testing it out in the next couple of days on the NetApps in
my shop. Do you have any documentation for it, in particular the
format of the config file? |
From: Jim T. <tr...@ar...> - 2005-12-08 20:37:45
|
On Thu, 8 Dec 2005, Todd Lyons wrote:
> Could the following monitor be examined and tested for inclusion in the
> monitor repository?

sure, we can add it there. |
From: Todd L. <tl...@iv...> - 2005-12-08 17:51:31
|
Could the following monitor be examined and tested for inclusion in the
monitor repository? I'm using it in production at the moment and it
works for me <tm>. With some feedback from one kind soul, I've made a
few adjustments from what I posted on the Mon users list a couple of
days ago (longer vol name space).

The following comment should answer most questions:

# Borrowed heavily from framework of netappfree.monitor.

The only real structural alteration made is that I added 'use strict'.
It's just a preference on my part. :-)
--
Regards... Todd
when you shoot yourself in the foot, just because you are so neurally
broken that the signal takes years to register in your brain, it does
not mean that your foot does not have a hole in it. --Randy Bush
Linux kernel 2.6.12-12mdksmp 2 users, load average: 0.22, 0.25, 0.16 |
From: Ed R. <er...@pa...> - 2005-11-04 04:27:12
|
As previously mentioned, mon-1.1 has some new alert types. Attached is "mymail.alert", a modified version of "mail.alert", that we use at my shop. It supports the new alert types and has a couple of other niceties, like putting the description of the service in the down email. I'm also attaching my version of "snpp.alert" - the original version assumed that all command-line options would be used at invocation time, which caused trouble for me though I forget the details. Happy CVS updating, -- Ed |
From: Ed R. <er...@pa...> - 2005-07-22 05:12:16
|
On Thu, Jul 21, 2005 at 07:50:19PM -0700, Konstantin 'Kastus' Shchuka wrote:
> I had to make the following changes to make snpp.alert work:
>
> --- cvsroot/mon/alert.d/snpp.alert 2004-06-08 22:18:07.000000000 -0700
> +++ cvsroot-mine/mon/alert.d/snpp.alert 2005-07-21 19:15:31.000000000 -0700
> @@ -25,7 +25,7 @@
>  # $Id: snpp.alert,v 1.1.1.1 2004/06/09 05:18:07 trockij Exp $
>  #
>  use strict;
> -use vars qw /$opt_g $opt_q $opt_s $opt_t/;
> +use vars qw /$opt_g $opt_q $opt_s $opt_t $opt_h $opt_l $opt_u/;
>  use Getopt::Std;
>  use Net::SNPP;
>
> @@ -52,7 +52,7 @@
>
>  my $snpp = Net::SNPP->new ($opt_q) or die;
>
> -$ALERT = $opt_u ? "UPALERT" : "ALERT";
> +my $ALERT = $opt_u ? "UPALERT" : "ALERT";
>
>  $snpp->send ( Pager => [ @ARGV ], Message => "$ALERT $opt_g/$opt_s: $summary ($wday $mon $day $tm)" );

Yeah, me too. I guess not too many people were using it. Here's my
version, which adds a few more of the features that are supposed to be
in all monitor scripts:

@@ -47,13 +47,15 @@
 my $summary = <STDIN>;
 chomp $summary;

-my $t = localtime ($opt_t);
+my $t = localtime ($opt_t || time);
 my ($wday,$mon,$day,$tm) = split (/\s+/, $t);

 my $snpp = Net::SNPP->new ($opt_q) or die;

-$ALERT = $opt_u ? "UPALERT" : "ALERT";
+my $ALERT= $opt_u ? "UPALERT" : "ALERT";
+my $GROUP= $opt_g || $ENV{MON_GROUP};
+my $SERVICE= $opt_s || $ENV{MON_SERVICE};

-$snpp->send ( Pager => [ @ARGV ], Message => "$ALERT $opt_g/$opt_s: $summary ($wday $mon $day $tm)" );
+$snpp->send ( Pager => [ @ARGV ], Message => "$ALERT $GROUP/$SERVICE: $summary ($wday $mon $day $tm)" );
 $snpp->quit;
|
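The fallback pattern in the second patch (a command-line option if given, otherwise the MON_GROUP / MON_SERVICE environment variable, and the current time when no timestamp is passed) can be tried on its own with a sketch like this. The sub, its hash-style arguments and the final 'unknown' default are hypothetical repackaging for illustration; the patch itself works on the $opt_* globals and sends the text through Net::SNPP.

```perl
use strict;
use warnings;

# Build the page text. %opt stands in for the script's $opt_g, $opt_s,
# $opt_t and $opt_u; the 'unknown' default is an added assumption so
# the sketch never interpolates undef.
sub alert_message {
    my ($summary, %opt) = @_;
    my $kind    = $opt{u} ? 'UPALERT' : 'ALERT';
    my $group   = $opt{g} || $ENV{MON_GROUP}   || 'unknown';
    my $service = $opt{s} || $ENV{MON_SERVICE} || 'unknown';

    # scalar localtime gives e.g. "Thu Jul 21 19:15:31 2005"
    my $t = scalar localtime($opt{t} || time);
    my ($wday, $mon, $day, $tm) = split /\s+/, $t;

    return "$kind $group/$service: $summary ($wday $mon $day $tm)";
}

print alert_message('host1 host2', g => 'web', s => 'http'), "\n";
```

Letting the environment fill in the group and service means the alert line in the mon configuration can stay short, while explicit options still take priority when the script is run by hand.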