From: Colin C. <col...@gm...> - 2010-12-09 01:03:01
On Tue, Dec 7, 2010 at 10:26 PM, Buchan Milne <bg...@st...> wrote:
> On Tuesday, 7 December 2010 07:01:03 Colin Coe wrote:
>> Hi all
>>
>> I'm having a problem with devmon and some HP Proliant DL380's. The
>> problem is simply that devmon is able to extract info (raid, power,
>> temp, etc) from some but not others.
>
> This can occur when a test has some oids that may not be populated at all
> (e.g. using compaq-server on HP ProLiants with no hot spare configured end up
> with a failure polling sprDrvCntIndex, devmon gives up and doesn't poll - say
> - fans).
>
> I think I fixed this in svn with this commit:
>
> http://devmon.svn.sf.net/viewvc/devmon?view=revision&revision=156
>
> (see the explanation there)
>
> If this is not the issue, can you give more information (which tests are
> clear, details from verbose or debug logging indicating what occurs)?
>
>> To make matters less clear, on a
>> couple of the hosts where devmon works, I *cannot* snmpwalk from the
>> devmon/xymon server and on hosts where devmon doesn't work I *can*
>> snmpwalk.
>
> This makes no sense ... please confirm you are using the correct details
> devmon uses (e.g. 'cat -v hosts.db' will show records separated by '^['
> sequences). You may also want to do some packet tracing (with tcpdump or
> wireshark) if you really can't sort this out.
>
>> I've looked at the oids file and snmpwalked (for example "snmpwalk -v
>> 2c -c COMMUNITY_STR server 1.3.6.1.4.1.232.6.2.6.8.1.1") on all the
>> hosts where the "compaq;server" tests are clear, and found that I
>> could get "proper" responses from most of them.
>
> Check all the oids from all tests, or apply the patch from svn, or test with
> svn trunk.
>
>> The mix of working machines includes Linux (RHEL4 & 5) and Windows.
>> The non-working machines are all Linux (RHEL4 & 5).
>>
>> All Linux nodes have the HP PSP RPMs installed (specifically
>> hp-snmp-agents) and include "dlmod cmaX /usr/lib64/libcmaX64.so" at
>> the top of /etc/snmp/snmpd.conf.
>>
>> /etc/snmp/snmpd.conf on all Linux hosts has the line: "rocommunity
>> COMMUNITY_STR monhost.company.com" where monhost.company.com is the
>> devmon/xymon server.
>>
>> The SNMP service has been restarted.
>>
>> This has me stumped.
>
>
> Regards,
> Buchan
>

Hi

I grabbed the code out of the SVN on SF and rebuilt. It looked a lot
better, but I found that on one of the Proliants with a failed drive but
no spare, the test came back clear, not red.

I made this change:

# diff -u dm_tests.pm.orig dm_tests.pm
----
--- dm_tests.pm.orig 2010-12-09 08:07:04.000000000 +0800
+++ dm_tests.pm 2010-12-09 08:18:14.000000000 +0800
@@ -1820,9 +1820,9 @@
 # Make sure we have leaf data for our primary oid
 if(!defined $oids->{$pri}{'val'}) {
-    do_log("Missing repeater data for $pri for $test msg", 0);
-    $msg .= "&clear Missing repeater data for primary OID $pri\n";
-    $worst_color = 'clear';
+    do_log("Warning: missing repeater data for $pri for $test msg", 0);
+#    $msg .= "&clear Missing repeater data for primary OID $pri\n";
+#    $worst_color = 'clear';
     next;
 }
----

and it now comes back red, although I don't know what the potential
badness of this change is (not knowing the code).

As for what I can snmpwalk and what I can't, I'm going to have to plead
insanity. After some sleep and looking at it again, I see where I went
wrong on those hosts.

Thanks

CC

-- 
RHCE#805007969328369
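
For anyone following the suggestion above to check all the oids from all tests, a rough shell sketch of that check might look like the following. It is only an illustration, not part of devmon: the function name check_oids and the SNMPWALK override variable are made up here, the list of OIDs has to be supplied by hand (e.g. copied from the template's oids file), and the "No Such ..." strings are what net-snmp's snmpwalk normally prints for an unpopulated OID, which may differ on other tools.

```shell
#!/bin/sh
# Sketch only: report which OIDs return no usable data from a host.
# SNMPWALK is a hypothetical override hook (handy for testing);
# it defaults to net-snmp's snmpwalk.
SNMPWALK=${SNMPWALK:-snmpwalk}

# check_oids HOST COMMUNITY OID...
# Prints "ok <oid>" or "MISSING <oid>" for each OID given.
check_oids() {
    host=$1
    community=$2
    shift 2
    for oid in "$@"; do
        # Same -v 2c / -c options as the snmpwalk example in the thread.
        # An OID counts as missing if the walk fails, prints nothing,
        # or prints one of net-snmp's "no data" messages.
        if out=$($SNMPWALK -v 2c -c "$community" "$host" "$oid" 2>&1) \
           && [ -n "$out" ] \
           && ! printf '%s\n' "$out" | grep -Eq 'No Such (Object|Instance)|No more variables'; then
            echo "ok $oid"
        else
            echo "MISSING $oid"
        fi
    done
}
```

Used with the values from the thread, this would be run as: check_oids server COMMUNITY_STR 1.3.6.1.4.1.232.6.2.6.8.1.1 (plus any further OIDs from the tests that show clear). An OID reported as MISSING on a host is a candidate for the unpopulated-oid problem described above.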