From: Chris W. <ch...@su...> - 2008-04-18 14:17:43
|
Hello,

Chiming in with some info on Devmon. While primarily targeted at the Devmon list, it may be useful to hobbit/devmon users who don't subscribe to that list.

The cisco-7206 template works perfectly fine on a Cisco 7500. I'm sure it works on a 7200 as well. I also have an old 7000 here, but I don't want to boot it up to test. Anyway, it may be best to rename 7206 to 7200 and just copy its templates to a 7500 folder, or generically rename the whole thing cisco-7000.

Also, there is a typo in the USING doc:

http://devmon.svn.sourceforge.net/viewvc/devmon/trunk/docs/USING?revision=3&view=markup

This line is listed:

DEVMON:tests(cpu),thresh(cpu;CPUTotal5Min;y=50;r=90)

But it should be:

DEVMON:tests(cpu),thresh(cpu;CPUTotal5Min;y:50;r:90)

It's correct in the details further down the page, but the equals signs should be colons near the top, where thresh() is first mentioned.

Lastly, and this is very minor, Devmon doesn't properly detect administratively down interfaces in all cases. On one router, I am using subinterfaces as follows:

GigabitEthernet0/2
GigabitEthernet0/2.1
GigabitEthernet0/2.2
GigabitEthernet0/2.3
..etc..

If I shut down Gi0/2, 'sh ip int br' shows its subinterfaces as administratively down, but Devmon doesn't detect that; one has to go into each subinterface and shut it down as well. The OID that checks admin status (.1.3.6.1.2.1.2.2.1.7) does indeed report up, which is why it's showing red:

ifAdminStatus.89 = INTEGER: up(1)

I couldn't find any alternate OID to report ifAdminStatus, so short of putting in code to check parent interface status, it probably couldn't be considered a bug, but I thought I'd mention it.

--Chris
|
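For illustration only, the "check parent interface status" idea mentioned above could look roughly like the Python sketch below. This is not Devmon code. The function name and the dict input are hypothetical, and it assumes that a subinterface is always named as its parent plus ".N" (as in GigabitEthernet0/2.1) and that the ifAdminStatus values have already been fetched and mapped to "up"/"down".

```python
def effective_admin_status(if_admin):
    """Return a copy of if_admin in which a subinterface is treated as
    'down' whenever its parent interface is administratively down.

    if_admin: dict mapping interface name -> 'up' or 'down', as reported
    by ifAdminStatus (hypothetical input format, not Devmon's own).
    """
    result = {}
    for name, status in if_admin.items():
        # Assumption: "GigabitEthernet0/2.1" has parent "GigabitEthernet0/2".
        parent = name.split(".", 1)[0]
        if parent != name and if_admin.get(parent) == "down":
            status = "down"
        result[name] = status
    return result


reported = {
    "GigabitEthernet0/1": "up",
    "GigabitEthernet0/2": "down",    # parent was shut down
    "GigabitEthernet0/2.1": "up",    # SNMP still reports up
    "GigabitEthernet0/2.2": "up",
}
effective = effective_admin_status(reported)
```

With the sample data, the two Gi0/2 subinterfaces come out as "down" while GigabitEthernet0/1 stays "up".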
From: Robert H. <rob...@gm...> - 2008-04-18 15:47:30
|
I have noticed quite a bit of (unnecessary) redundancy when it comes to the cisco templates. I have been able to reduce nearly all the cisco devices down to two templates: cisco-switch and cisco-common.

I still have a few minor issues to deal with, but should have something to post to the group in about a week's time. The biggest of these issues is finding something in the specs "model" that is common to the cisco-switch devices (2811, 4003, 5500, & 6506) and that is not found in all the other devices. Similarly, I would like to find something in the specs "model" that is common to all other cisco devices (cisco-common).

Note: many switches are still able to use cisco-common (2900, 3500, 3550, etc.), so I will probably have to come up with a better name for cisco-switch.

I will see what I can find on your subinterfaces issue.

I am also working on an idea (a change to devmon) to allow for "default" templates depending on vendor.

Robert Holden
|
From: Buchan M. <bg...@st...> - 2008-04-22 08:16:25
|
On Friday 18 April 2008 17:47:21 Robert Holden wrote:
> I have noticed quite a bit of (unnecessary) redundancy when it comes to the cisco templates.

Why do you think it is specific to cisco templates? E.g., the if_load template works just as well with any device that supports the RFC-standard IF-MIB (e.g. the linux-openwrt template has the if_load test taken almost directly from a cisco device). The only real differences are how devices are named, and thus maybe the default device patterns that should be ignored.

> I have been able to reduce nearly all the cisco devices down to two templates: cisco-switch and cisco-common
> I still have a few minor issues to deal with, but should have something to post to the group in about a week's time. The biggest of these issues is finding something in the specs "model" that is common to the cisco-switch (2811, 4003, 5500, & 6506), that is not found in all the other devices. Similarly, I would like to find something in the specs "model" that is common to all other cisco devices (cisco-common).
>
> note: Many switches are still able to use cisco-common (2900, 3500, 3550, etc), so I probably have to come up with a better name for cisco-switch.

Well, the issue is that you shouldn't really distinguish features on a device based on the hardware model in the first place.

If we stick to the Cisco topic, is a 6500 a switch? Is a 7600 a router? What if I put a better supervisor in the 6500? If I put a CSM blade into a 6500, or into a 7600, is one a load balancer and the other not?

Moving on, if I run a RADIUS server (which supports the RADIUS MIB) on an HP ProLiant, is a Dell PowerEdge *not* a RADIUS server?

So, yes, I think we need a new approach to:
1) Which tests are done on a specific device
2) Which tests are done by default on a device of a specific kind of hardware

> I will see what I can find on your subinterfaces issue.

IMHO, if the device lies over SNMP, you should report it to the vendor, rather than work around the problem in an SNMP manager.

> I am also working on an idea (change to devmon) to allow for "default" templates depending on vendor.

I would prefer that you discuss any design issues on the development list ...
|
From: Robert H. <rob...@gm...> - 2008-04-22 19:10:49
|
On Tue, Apr 22, 2008 at 1:09 AM, Buchan Milne <bg...@st...> wrote:
> On Friday 18 April 2008 17:47:21 Robert Holden wrote:
> > I have noticed quite a bit of (unnecessary) redundancy when it comes to the cisco templates.
>
> Why do you think it is specific to cisco templates? E.g., the if_load template works just as well with any device that supports the RFC-standard IFMIB (e.g. the linux-openwrt template has the if_load test taken almost directly from a cisco device). The only differences are really how devices are named, and thus maybe default device patterns that should be ignored.

Most of the equipment we are monitoring is Cisco, as hobbit is used to monitor all our servers. As a result, I do not have enough experience with SNMP as it relates to servers to answer your question.

As for the RFC-standard IF-MIB, you are right: all cisco devices should follow these standards, but they relate only to the interfaces on the devices. Even so, using one static OID for all interfaces will not always work:

-------------------------------------------------------------------------------------------------------
ifSpeed [ifBps] (1.3.6.1.2.1.2.2.1.5) vs ifHighSpeed (1.3.6.1.2.1.31.1.1.1.15)

The range of ifSpeed is limited to reporting a maximum speed of (2**31)-1 bits/second, or approximately 2.2Gbs. SONET defines an OC-48 interface, which is defined as operating at 48 times 51 Mbs, which is a speed in excess of 2.4Gbs. Thus, ifSpeed is insufficient for the future, and this memo defines an additional object: ifHighSpeed.

The ifHighSpeed object reports the speed of the interface in 1,000,000 (1 million) bits/second units. Thus, the true speed of the interface will be the value reported by this object, plus or minus 500,000 bits/second.
[RFC 2233 <http://www1.tools.ietf.org/html/rfc2233>, 3.1.7]
-------------------------------------------------------------------------------------------------------
ifInOctets (1.3.6.1.2.1.2.2.1.10) vs ifHCInOctets (1.3.6.1.2.1.31.1.1.1.6)
ifOutOctets (1.3.6.1.2.1.2.2.1.16) vs ifHCOutOctets (1.3.6.1.2.1.31.1.1.1.10)

As the speed of network media increase, the minimum time in which a 32 bit counter will wrap decreases. For example, a 10Mbs stream of back-to-back, full-size packets causes ifInOctets to wrap in just over 57 minutes; at 100Mbs, the minimum wrap time is 5.7 minutes, and at 1Gbs, the minimum is 34 seconds. Requiring that interfaces be polled frequently enough not to miss a counter wrap is increasingly problematic.
[RFC 2233 <http://www1.tools.ietf.org/html/rfc2233>, 3.1.6]

As devmon polls data every 5 minutes, it probably should use the HC versions of the counters when needed (Gb+ speeds). Is there a transform for performing an IF statement/substitution? Example: IF ifSpeed > 20Mb, use ifHCInOctets instead of ifInOctets.

For interfaces that operate at 20,000,000 (20 million) bits per second or less, 32-bit byte and packet counters MUST be used. For interfaces that operate faster than 20,000,000 bits/second, and slower than 650,000,000 bits/second, 32-bit packet counters MUST be used and 64-bit octet counters MUST be used. For interfaces that operate at 650,000,000 bits/second or faster, 64-bit packet counters AND 64-bit octet counters MUST be used.
[RFC 2233 <http://www1.tools.ietf.org/html/rfc2233>, 3.1.6]

Some tests, such as serial, fans & power, differ from device to device. At times an OID is not available (power/fans); at other times the information is only available under a different OID (serial). This creates some differences between templates (hence cisco-common vs cisco-switch in my previous email).
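The wrap times and the counter-width rule quoted from RFC 2233 above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not a Devmon transform: the function names are hypothetical, the 300-second poll interval is taken from the "every 5 minutes" remark above, and the speed is assumed to be known in bits/second (for fast links it would presumably have to come from ifHighSpeed rather than ifSpeed).

```python
POLL_INTERVAL_S = 300  # devmon polls every 5 minutes, per the message above


def wrap_seconds(bits_per_second, counter_bits=32):
    """Minimum time in seconds before an octet counter wraps at line rate."""
    return (2 ** counter_bits) * 8 / bits_per_second


def may_wrap_between_polls(bits_per_second, poll_interval_s=POLL_INTERVAL_S):
    """True if a 32-bit octet counter can wrap within one poll interval."""
    return wrap_seconds(bits_per_second) < poll_interval_s


def octet_counter_for(bits_per_second):
    """Choose the inbound octet counter per the RFC 2233 3.1.6 rule quoted above."""
    if bits_per_second <= 20_000_000:
        return "ifInOctets"      # 32-bit octet counter
    return "ifHCInOctets"        # 64-bit octet counter


def packet_counter_bits(bits_per_second):
    """Packet counter width per the same rule: 64-bit at 650 Mb/s or faster."""
    return 64 if bits_per_second >= 650_000_000 else 32


for speed in (10_000_000, 100_000_000, 1_000_000_000):
    print(speed, round(wrap_seconds(speed), 1), octet_counter_for(speed))
```

At 10 Mb/s the 32-bit counter needs about 3436 seconds (just over 57 minutes) to wrap, at 100 Mb/s about 344 seconds, and at 1 Gb/s about 34 seconds, so with a 300-second poll only the gigabit case can wrap between two polls at full line rate.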
> > I have been able to reduce nearly all the cisco devices down to two templates: cisco-switch and cisco-common
> > I still have a few minor issues to deal with, but should have something to post to the group in about a week's time. The biggest of these issues is finding something in the specs "model" that is common to the cisco-switch (2811, 4003, 5500, & 6506), that is not found in all the other devices. Similarly, I would like to find something in the specs "model" that is common to all other cisco devices (cisco-common).
> >
> > note: Many switches are still able to use cisco-common (2900, 3500, 3550, etc), so I probably have to come up with a better name for cisco-switch.
>
> Well, the issue is that you shouldn't really distinguish features on a device based on the hardware model in the first place.
>
> If we stick to the Cisco topic, is a 6500 a switch? Is a 7600 a router? What if I put a better supervisor in the 6500 ? If I put a CSM blade into a 6500, or into a 7600, is one a load balancer and the other not?
>
> Moving on, if I run a RADIUS server (which supports the RADIUS MIB) on a HP ProLiant, is a Dell PowerEdge *not* a RADIUS server?
>
> So, yes, I think we need a new approach to:
> 1)Which tests are done on a specific device
> 2)Which tests are done by default on a device of a specific kind of hardware

What about IOS vs CatOS, or differences between versions of IOS?

I have yet to come up with a better way to do this, but I am thinking it will be along the lines of:
1. SNMP Get manufacturer
2. SNMP Get hardware model
3. SNMP Get OS & OS Version
4. SNMP Get Software & Version ??
5. Run appropriate tests

Unfortunately, this can mess up the nice & clean template layout that devmon has now.

> > I will see what I can find on your subinterfaces issue.
>
> IMHO, if the device lies over SNMP, you should report it to the vendor, rather than workaround the problem in an SNMP manager.
>
> > I am also working on an idea (change to devmon) to allow for "default" templates depending on vendor.
>
> I would prefer that you discuss any design issues on the development list ...

I just signed up for the devmon-devel list.
https://lists.sourceforge.net/lists/listinfo/devmon-devel

I will post my ideas for changes & templates to that list.

Robert
|
From: Buchan M. <bg...@st...> - 2008-04-22 09:36:06
|
On Friday 18 April 2008 16:17:35 Chris Wopat wrote:
> Hello,
>
> Chiming in on some info on Devmon. While primarily targeted to the Devmon list, it may be useful to hobbit/devmon users who don't subscribe to that list.
>
> The cisco-7206 template works perfectly fine on a Cisco 7500. I'm sure it works on a 7200 as well. I also have an old 7000 here, but I don't want to boot it up to test. Anyway, it may be in the best interest to rename 7206 to 7200, and just copy its templates to a 7500 folder, or generically rename the whole thing cisco-7000.
>
> Also, there is a typo in the USING doc:
>
> http://devmon.svn.sourceforge.net/viewvc/devmon/trunk/docs/USING?revision=3&view=markup
>
> This line is listed:
> DEVMON:tests(cpu),thresh(cpu;CPUTotal5Min;y=50;r=90)
>
> But it should be:
> DEVMON:tests(cpu),thresh(cpu;CPUTotal5Min;y:50;r:90)

I've fixed this locally (I ran into it myself earlier but was too busy to fix it). I'll commit it later.

> It's correct in the details further down the page, but the equal symbols should be colons near the top when it first mentions thresh().
>
> Lastly, and this is very minor, Devmon doesn't properly detect administratively down interfaces in all cases. On one router, I am using subinterfaces as follows:
>
> GigabitEthernet0/2
> GigabitEthernet0/2.1
> GigabitEthernet0/2.2
> GigabitEthernet0/2.3
> ..etc..
>
> If I shut down Gi0/2, 'sh ip int br' shows its subinterfaces administratively down, but devmon doesn't detect that- one has to go into each subinterface and shut them down as well. It does appear that the OID that checks admin status (.1.3.6.1.2.1.2.2.1.7) does indeed say up, which is why it's showing red:
>
> ifAdminStatus.89 = INTEGER: up(1)

Right, so the router is lying to you. I would prefer not to work around device bugs in devmon itself. If you can, you should log a TAC case regarding this (e.g. "Interface status reported via SNMP does not match the configured status").

In the meantime, you can work around it with exceptions in the bb-hosts file, such as:

DEVMON:except(if_stat;ifName;na:Gi\d+/\d+\.\d+)

(which would ignore the if_status for all GigabitEthernet sub-interfaces; you could make it more specific if you want).

Regards,
Buchan
|
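To see which interface names the pattern in the except() line above would catch, it can be tried out in Python first. This is a rough check only: it assumes the pattern has to match the whole ifName value (how Devmon anchors exception patterns is not stated in this thread), it relies on this particular pattern behaving the same in Python's and Perl's regex syntax, and the helper name is made up for the example.

```python
import re

# Pattern copied from the except() example above.
SUBINTERFACE = re.compile(r"Gi\d+/\d+\.\d+")


def is_excepted(if_name):
    """True if the whole interface name matches the sub-interface pattern."""
    return SUBINTERFACE.fullmatch(if_name) is not None


names = ["Gi0/2", "Gi0/2.1", "Gi0/2.10", "Fa0/1.1"]
ignored = [n for n in names if is_excepted(n)]
print(ignored)
```

Under that assumption, the parent Gi0/2 is still monitored, the Gi0/2.N sub-interfaces are ignored, and a FastEthernet sub-interface such as Fa0/1.1 is not affected.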