From: Asif I. <va...@gm...> - 2007-07-30 20:21:48
|
Once in a while the DEVMON column turns purple. I have to send a HUP signal to the devmon process to get new data. I have done that twice in the last month.

-- 
Asif Iqbal
PGP Key: 0xE62693C5
KeyServer: pgp.mit.edu
|
[Devmon] Core file of devmon, snmp version discovery problem and Strange Behaviour of cisco CPU data
From: Francesco D. <fdu...@q8...> - 2007-07-30 21:43:38
|
Hi all,

I searched the archive but found nothing on this, and it may interest anyone trying to understand why copied templates don't work.

Today I started moving all the objects I had monitored with bb-xsnmp.pl to devmon, and some objects ended up with a clear status and cannot get any data. My device list includes many old devices, and devices that probably run older versions of IOS, because they are externally managed and we cannot upgrade them.

First, it seems that if a device is discovered with SNMPv1 (and is not capable of SNMPv2) and the template has snmpver = 2, devmon tries to get the data using SNMPv2 instead of the discovered version (probably because this information is not saved in the devmon database on discovery).

Second, regarding Cisco, it seems that the method of gathering CPU time differs from one IOS version to another, and if the OID is wrong, the error also completely blocks the other tests on that device. The OIDs for getting CPU are these:

Version   IOS 12.2(3.5) or later             IOS from 12.0(3)T to 12.2(3.5)     IOS prior to 12.0(3)T
MIB       CISCO-PROCESS MIB                  CISCO-PROCESS MIB                  OLD-CISCO-CPU MIB
Objects   cpmCPUTotal5minRev                 cpmCPUTotal5min                    avgBusy5
          (.1.3.6.1.4.1.9.9.109.1.1.1.1.8)   (.1.3.6.1.4.1.9.9.109.1.1.1.1.5)   (.1.3.6.1.4.1.9.2.1.58)
          cpmCPUTotal1minRev                 cpmCPUTotal1min                    avgBusy1
          (.1.3.6.1.4.1.9.9.109.1.1.1.1.7)   (.1.3.6.1.4.1.9.9.109.1.1.1.1.4)   (.1.3.6.1.4.1.9.2.1.57)
          cpmCPUTotal5secRev                 cpmCPUTotal5sec                    busyPer
          (.1.3.6.1.4.1.9.9.109.1.1.1.1.6)   (.1.3.6.1.4.1.9.9.109.1.1.1.1.3)   (.1.3.6.1.4.1.9.2.1.56)

The oldest ones (busyPer) seem to work on newer devices too, while the newest don't work on the oldest devices.
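The per-version OID choice described above can be sketched as a simple fallback: try the newest 5-minute CPU object first and fall back to the older ones. This is only an illustration in Python, not how devmon or its templates work. `pick_cpu_oid` and the `probe` callable are hypothetical names; `probe` stands in for whatever SNMP query tells you whether a device answers for a given OID.

```python
from typing import Callable, Optional, Tuple

# 5-minute CPU objects from the table above, ordered newest IOS first.
CPU_5MIN_CANDIDATES = [
    ("cpmCPUTotal5minRev", ".1.3.6.1.4.1.9.9.109.1.1.1.1.8"),
    ("cpmCPUTotal5min", ".1.3.6.1.4.1.9.9.109.1.1.1.1.5"),
    ("avgBusy5", ".1.3.6.1.4.1.9.2.1.58"),
]


def pick_cpu_oid(probe: Callable[[str], bool]) -> Optional[Tuple[str, str]]:
    """Return the first (name, oid) pair the device answers for, or None.

    `probe` is a hypothetical callable: it takes an OID string and returns
    True if the device returned data for it.
    """
    for name, oid in CPU_5MIN_CANDIDATES:
        if probe(oid):
            return name, oid
    return None


if __name__ == "__main__":
    # Fake probe imitating an old device that only knows OLD-CISCO-CPU MIB.
    def old_device(oid: str) -> bool:
        return oid.startswith(".1.3.6.1.4.1.9.2.1.")

    print(pick_cpu_oid(old_device))
```

Probing newest-first keeps the revised objects on devices that support them, while a device that answers none of the candidates yields `None` instead of an error that could block its other tests.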
This is a link to the web page that talks about CPU monitoring:
http://www.cisco.com/warp/public/477/SNMP/collect_cpu_util_snmp.html

A third issue, though I'm not sure how and when it happens, is that devmon seems to generate a good number of core files on each run. For example, I now have 13 in the last 25 minutes:

-rw------- 1 bb bb 12759040 Jul 30 23:35 core.15095
-rw------- 1 bb bb 12759040 Jul 30 23:35 core.15026
-rw------- 1 bb bb 12759040 Jul 30 23:34 core.14951
-rw------- 1 bb bb 12759040 Jul 30 23:18 core.25031
-rw------- 1 bb bb 12759040 Jul 30 23:17 core.21863
-rw------- 1 bb bb 12759040 Jul 30 23:16 core.21847
-rw------- 1 bb bb 12759040 Jul 30 23:16 core.21840
-rw------- 1 bb bb 12759040 Jul 30 23:16 core.21812
-rw------- 1 bb bb 12759040 Jul 30 23:10 core.14769
-rw------- 1 bb bb 12759040 Jul 30 23:08 core.10720
-rw------- 1 bb bb 12759040 Jul 30 23:07 core.10233
-rw------- 1 bb bb 12759040 Jul 30 23:03 core.4657
-rw------- 1 bb bb 12759040 Jul 30 23:02 core.1331
-rw------- 1 bb bb 12759040 Jul 30 23:01 core.1326
-rw------- 1 bb bb 12759040 Jul 30 23:00 core.1217
-rw------- 1 bb bb 12759040 Jul 30 23:00 core.1209

After some checking, it seems to be caused by the kill of processes that exceed the polling time, as in this example, although some cores seem not to appear (I don't have a core for PID 15005, but I have some for 15026 and 15095):

[07-07-30@23:36:23] Fork 12 (15005) exceeded poll time polling ciscona
[07-07-30@23:36:23] Fork 12 (15005) exceeded poll time polling ciscona

bb 14991     1 4 23:35 ? 00:00:06 devmon[master]
bb 14993 14991 0 23:35 ? 00:00:00 devmon
bb 14994 14991 0 23:35 ? 00:00:00 devmon
bb 14995 14991 0 23:35 ? 00:00:00 devmon
bb 14996 14991 0 23:35 ? 00:00:00 devmon
bb 14997 14991 0 23:35 ? 00:00:00 devmon
bb 14998 14991 0 23:35 ? 00:00:00 devmon
bb 14999 14991 0 23:35 ? 00:00:00 devmon
bb 15001 14991 0 23:35 ? 00:00:00 devmon
bb 15002 14991 0 23:35 ? 00:00:01 devmon
bb 15003 14991 0 23:35 ? 00:00:00 devmon
bb 15004 14991 0 23:35 ? 00:00:00 devmon
bb 15006 14991 0 23:35 ? 00:00:00 devmon
bb 15007 14991 0 23:35 ? 00:00:00 devmon
bb 15008 14991 0 23:35 ? 00:00:00 devmon
bb 15009 14991 0 23:35 ? 00:00:01 devmon
bb 15010 14991 0 23:35 ? 00:00:01 devmon
bb 15011 14991 0 23:35 ? 00:00:00 devmon
bb 15012 14991 0 23:35 ? 00:00:00 devmon
bb 15013 14991 0 23:35 ? 00:00:00 devmon
bb 15113 14991 0 23:36 ? 00:00:00 devmon

I'm using devmon 0.3.0-beta2, running on RHEL5 x86-64 with perl v5.8.8 built for x86_64-linux-thread-multi (the one shipped with RHEL5).

Thanks for the really good program
|
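The manual check described in the message (comparing the PIDs in the "exceeded poll time" log lines with the `core.<pid>` file names) could be automated roughly as follows. This is a minimal sketch, assuming the log line format and core file naming shown in the message; `killed_pids` and `core_pids` are hypothetical helper names, not part of devmon.

```python
import re

# Matches log lines like:
# [07-07-30@23:36:23] Fork 12 (15005) exceeded poll time polling ciscona
LOG_RE = re.compile(r"Fork (\d+) \((\d+)\) exceeded poll time polling (\S+)")
CORE_RE = re.compile(r"core\.(\d+)")


def killed_pids(log_lines):
    """Collect the PIDs of forks reported as exceeding the poll time."""
    pids = set()
    for line in log_lines:
        match = LOG_RE.search(line)
        if match:
            pids.add(int(match.group(2)))
    return pids


def core_pids(filenames):
    """Collect the PIDs encoded in core file names of the form core.<pid>."""
    pids = set()
    for name in filenames:
        match = CORE_RE.fullmatch(name)
        if match:
            pids.add(int(match.group(1)))
    return pids


if __name__ == "__main__":
    # Sample data copied from the message above.
    log = ["[07-07-30@23:36:23] Fork 12 (15005) exceeded poll time polling ciscona"]
    cores = ["core.15095", "core.15026", "core.14951"]

    killed = killed_pids(log)
    dumped = core_pids(cores)
    print("killed forks without a core:", sorted(killed - dumped))
    print("cores without a kill message:", sorted(dumped - killed))
```

Running it over the full devmon log and a directory listing would show whether every core file corresponds to a killed fork, or whether some crashes have another cause.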