From: Buchan M. <bg...@st...> - 2008-03-25 13:24:58
On Tuesday 25 March 2008 14:24:03 Bill Richardson wrote:
> I'm using devmon to monitor some bigip F5 devices. When devmon is first
> started (the first poll), everything works. On every poll after that, on
> one of my F5s, "PoolMemberStatusReason" under the pool is not being
> populated with data. (Again, on the first poll it is populated.)
>
> Looking in devmon.log, this message appears over and over, every minute:
> "No SNMP data found for PoolMemberStatusReason on bigip1500"
>
> Starting devmon with "-f", I see the following starting after the 1st
> poll (the first poll is fine):
> ------------------------------------------------------------------------
> ./devmon -f
>
> SNMP Error:
> Error decoding response PDU:
> Expected length 9855, got 7996
> %{%i%s%*{%i%i%i%{%@
> ^
> SNMPv2c_Session (remote host: "192.168.1.10" [192.168.1.10].161)
> community: "MyPublic"
> request ID: -1368482924
> PDU bufsize: 8000 bytes
> timeout: 5s
> retries: 3
> backoff: 1)
> at /usr/local/devmon-0.3.0-rc1/modules/dm_snmp.pm line 540
> Use of uninitialized value in string ne at
> /usr/lib/perl5/site_perl/5.8.8/SNMP_Session.pm line 871, <$__ANONIO__>
> line 16.
> ------------------------------------------------------------------------
>
> Based on this I upgraded from SNMP_Session 1.08 to 1.12 and still see
> the same issue.
>
> I did some tcpdumps and was thinking it may have been the size of the
> response....
>
> Looking at the dumps, I noticed that the first poll response in the
> trace has a total length of only 988, while the second poll response has
> a length of 1514.
>
> Additionally, I noticed that the second poll packet has the fragment bit
> set, meaning there should be more packets to follow. I wonder if the
> perl script is having trouble handling a fragmented packet. This might
> have something to do with the message I'm seeing that complains about
> the expected length of the packet.
>
> So, thinking it has something to do with the size of the response, as a
> test I went in to the bigip box (the standby box) and deleted a bunch of
> the pools - like half of them or more. Once I did that, I went in and
> checked devmon again. Once it refreshed, it was able to load the pool
> info for this box. This shows that the error is definitely related to
> the size of the response that the SNMP bulkget is receiving.
>
> My guess is that the first request is working because of that
> MAX REPETITION setting. If you look at that first response packet, it
> returns 12 SNMP items. At some point between that first request and the
> second one, devmon must be learning that there are really 129 items
> here, not 12. Then, when it tries on the second attempt to pull in that
> many items, the responses cause the error. Whether it is the packet
> fragmenting that is the cause or something else related to the size of
> the response, I'm not sure.
>
> Any help with this one would be appreciated. I have looked and looked,
> and don't see this reported on the list in the past.

I'm currently running perl-SNMP_Session 1.08 and monitoring some Cisco
devices with more than 180 items per test.

It would be best if you could send the capture files from tcpdump or
Wireshark. However, this looks more like an issue with SNMP_Session than
with devmon. I'll try to investigate.

Regards,
Buchan