From: W.J.M. N. <Wim...@nl...> - 2017-01-16 14:42:39
Hello Peter,

> There is one network object that is queried by two vm's (xymon-dev,
> xymon-acc) and one hw box.
> The two vm's are preps for a new xymon server with the latest and greatest;
> the hw box is our current xymon prod box.
>
> Only the hw box works as expected. The two vm's are similar build and
> produces spikes at different times.

So the spikes seem to result from a timing problem. Do you see messages containing "duplicate RRD data with same timestamp" in /var/log/xymon/rrd-status.log?

> What the three hosts have in common is the Xymon-html-output for this
> network device. Which shows normal output. The hidden devmon code is also
> normal. A batch job that queries the network device also produces normal
> output (your previous suggestion, as I recall).

You've shown the list of interface names, retrieved using snmpwalk. Some interface names appear twice. This implies that, for some interfaces, two updates are sent to RRD at the same time. From what I've seen, the value at the second occurrence of an interface name is always zero. If that one is handled before the other one, you'll get a spike.

> So I have rrd files that when I look into these files show absurd values.
> But that could be perhaps an error in the processing of the hidden devmon
> code? Or an rrd anomaly? Or just my own 'unknown error'?

The problem does not occur with Xymon running on bare metal. My guess is that if you retrieve the list of interfaces from that machine, there are no duplicate interface names in it. Thus, your problem will probably be gone if you move from a VM to bare metal.

> What I just did:
> I tried to minimize the shown devices. So I managed to do so, by re-using a
> field shown (Errors in) from the If_Err-test.

This I do not understand.

> But underneath all the 64-bits interfaces are queried, so devmon output is
> still shown for all the devices.
>
> <!--DEVMON RRD: if_load 0 0
> DS:ds0:COUNTER:600:0:U DS:ds1:COUNTER:600:0:U
> mgmt0 8879632404:16047447934
> loopback0 15094564810:145416650
> Ethernet3_1 5145707654:5008405830
> Ethernet3_2 275179849444513:598883108402034
> Ethernet3_3 1100302946652053:951076190448071
> Ethernet3_4 51407929508676:69081629766292
> Ethernet3_5 823099309481:5352820011173
> Ethernet3_6 199285632:199285632
> Ethernet3_7 199285632:199285632
> Ethernet3_8 199285632:199285632
> Ethernet4_1 3118632864693475:491943590772051
> Ethernet4_2 414769440784827:649329384941863
> Ethernet4_3 103430757126278:415216974406768
> Ethernet4_4 156480646041793:681504646556060
> Ethernet4_5 299330487390116:1361606088437009
> Ethernet4_6 45595170267863:313429801053492
> Ethernet4_7 199285632:199285632
> Ethernet4_8 199285632:199285632
> Ethernet3_1 0:0
> Ethernet3_2 0:0
> Ethernet3_3 0:0
> Ethernet4_1 0:0
> Ethernet4_2 0:0
> Ethernet4_3 0:0
> Ethernet4_4 0:0
> Ethernet4_5 0:0
> Ethernet4_6 0:0
> Ethernet3_4 0:0
> Ethernet3_5 0:0
> -->
>
> And now I will try your suggestion using exceptions file:
>
> [root@uhu-o if_load]# cat exceptions
> #ifName : alarm : .+
> ifName : ignore : Nu.+|Vl.+|VLAN.+
> ifHCInOctets : only : 0
>
> Unfortunately the output remains the same as above (for all counters).
> Perhaps I'm missing something here...?

Sorry, I made an error in this configuration snippet. The idea is to ignore those interfaces which show an octet counter with value 0, as those are (almost always) the duplicate interfaces. The snippet should read:

    ifHCInOctets : ignore : 0

Which version of devmon are you using? In the version at SourceForge, devmon-0.3.1-beta1.tar.gz, a parameter 'all' is used in the rrd directive of the TABLE command in the file message. In a later patch, available via URL https://sourceforge.net/p/devmon/code/HEAD/tree/trunk/, this has been changed a bit. What does your message file look like?

Regards,
Wim Nelis.
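
P.S. To check a devmon message for the duplicate interface names described above, something like the following might help. It is only a sketch: the function names are made up for this example and are not part of devmon or Xymon, and it assumes the data rows of the DEVMON RRD block have already been split into one "name in:out" string per interface.

```python
from collections import defaultdict


def find_duplicate_interfaces(rows):
    """Group counter pairs by interface name; return names seen more than once.

    rows: iterable of strings such as "Ethernet3_1 5145707654:5008405830"
    (hypothetical pre-split input, one interface per string).
    """
    seen = defaultdict(list)
    for row in rows:
        name, values = row.split(None, 1)
        inbound, outbound = (int(v) for v in values.split(":"))
        seen[name].append((inbound, outbound))
    return {name: vals for name, vals in seen.items() if len(vals) > 1}


def zero_duplicates(rows):
    """Names that occur more than once with at least one all-zero entry."""
    dups = find_duplicate_interfaces(rows)
    return sorted(name for name, vals in dups.items() if (0, 0) in vals)


# A few rows taken from the output quoted above.
sample = [
    "mgmt0 8879632404:16047447934",
    "Ethernet3_1 5145707654:5008405830",
    "Ethernet3_6 199285632:199285632",
    "Ethernet3_1 0:0",
]
print(zero_duplicates(sample))
```

Any name it reports is an interface for which a zero-valued update can reach RRD alongside the real one, which is the situation the 'ignore' line in the exceptions file is meant to suppress.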