From: Schwimmer, E. E *H. <EE...@hs...> - 2006-03-07 19:14:59
> I was doing some general thinking the other day (something I normally
> try to avoid). Anyways, let us say you configure a device to be
> monitored via devmon. Now the device is up and monitored just
> fine. But what happens if for some reason the device is no longer
> capable of responding to SNMP. Now it isn't 'down', IP is still pingable,
> just SNMP service is locked/hung/whatever.
>
> First question, is this situation rare ?

Not really, it happens frequently on our network. Some devices
deprioritize SNMP requests, so if they are very busy or there is a lot
of traffic, they just won't answer an SNMP query.

Also, since SNMP runs over UDP, delivery is "best effort": i.e. the
sending host has no idea whether the remote host has received the packet
or not, unlike TCP (although theoretically some device upstream,
possibly even the remote host, should send some sort of ICMP unreachable
message, but that's another matter). That's why Devmon tries fairly
aggressively to SNMP query a remote host (a timeout of 2 seconds with
5 retries being the default options).

Now, combine this with the fact that your bb-test binary is most likely
not doing conn tests in sync with Devmon's SNMP tests, and it increases
the likelihood of green ICMP and red SNMP all the more.

> If not, do you think there should be some internal way for devmon to
> handle this ?

Currently, Devmon does a little bit of event-smoothing and remediation
for hosts that have troublesome SNMP. Devmon should (in theory) throw a
clear message if it is unable to reach a remote host (if you find it
throwing yellows or reds, please let me know, it means I've screwed up
a color transition somewhere).

However, it won't throw a clear until a device has missed SNMP queries
for the length of the "CLEARTIME" timer; this prevents hosts that have
a momentary freak-out from cluttering the nk2 page.
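
To make the CLEARTIME behaviour above a bit more concrete, here is a
minimal Python sketch. It is not Devmon's actual code: the function
name, the argument names, and the choice to keep reporting the last
known color until the timer runs out are all assumptions made for
illustration.

```python
def smoothed_status(last_color: str, last_success: float,
                    now: float, cleartime: float) -> str:
    """Pick the color to report for a host whose latest SNMP poll failed.

    Hypothetical helper, not part of Devmon. All times are in seconds.
    """
    # How long the host has gone without answering an SNMP query.
    silent_for = now - last_success

    # Only after the host has been silent for the whole CLEARTIME
    # window do we switch to clear.
    if silent_for >= cleartime:
        return "clear"

    # Assumption: before that, keep showing the last known color so a
    # momentary hiccup does not change what the display shows.
    return last_color
```

With a hypothetical CLEARTIME of 180 seconds, a host that was green and
has been silent for 60 seconds stays green, and goes clear once it has
been silent for 180 seconds or more.
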

Also, if a host has missed a certain number of consecutive poll periods,
Devmon will begin to throttle down the number of SNMP retries for this
host, until it is just sending out a single SNMP request. This prevents
hosts that are down hard from eating up too much of the SNMP poller's
time. A single successful SNMP query will reset the retries to their
default value.

> Now, I could set up a remote SNMP network test from the
> BB/Hobbit server to catch this. But in X number of minutes your SNMP
> reports will be turning purple (I hate purple). I was thinking that it
> would be great if devmon performed an SNMP test on each device 1st. All
> SNMP devices would have this one common test, call it whatever you like
> 'snmp'. If devmon can not communicate with the device via SNMP, this
> test would go red, and all other tests for that device would get a clear
> status. Similar to how BB/Hobbit perform remote network tests; if conn is
> red, all other remote net tests go clear for the device.

Typically none of our hosts drop enough SNMP requests to trigger a
clear message, at least not without also having conn problems. If there
is a need for this (i.e. devices that have naughty SNMP but answer fine
to ICMP) I might consider implementing it; but right now my plate is
pretty full.

> If you wanted to get fancy, you could have the 'snmp' conn test
> configurable for each device to pull basic device information, i.e.
> Model #, Software/Hardware revision, etc.
>
> Of course this would add runtime overhead, something I know Eric is
> against. Just figured I would bring it up.

Eric hate runtime overhead! ERIC SMASH!

-Eric