From: Mario V. <mar...@gm...> - 2012-01-27 09:28:47
I'm working with standard, unmodified templates, in particular for 3750, 4948 and 2801 devices.

Thanks for the advice on adding '--multigraphs=disk,if_load,if_err,temp' to CGI_SVC_OPTS in hobbitcgi.cfg or equivalent. It's already a step forward.

Regarding the thresholds: in my case, rather than checking error packets as a percentage of bandwidth, I want to be alerted whenever an error is detected at all. Perhaps a threshold check on "{ifInErrors}" and "{ifOutErrors}" will give me what I need. I'll play around with it.

Thanks again,
Mario

On 25 January 2012 15:15, Buchan Milne <bg...@st...> wrote:
> On Tuesday, 24 January 2012 13:26:01 Mario Valetti wrote:
> > Hi,
> >
> > When selecting 'network device | <if_err>' column, the resultant table
> > shows total interface errors since the device start / reset.
>
> Can you provide the name of the template you are using for the device in
> question (if it is a standard, unmodified template), or the template itself
> you are using (if not standard unmodified)?
>
> > I'm only interested in checking if errors occurred within the previous
> > 24 or 48hrs (for example).
>
> The transforms file on most of the if_err tests includes:
>
> # Get bit speed delta (so we dont have to provide custom delta limit)
> ifInOps : DELTA : {ifInOctets}
> ifOutOps : DELTA : {ifOutOctets}
> # Convert our octets delta into bits per second
> ifInBps : MATH : {ifInOps} x 8
> ifOutBps : MATH : {ifOutOps} x 8
> # Do delta transform on all error counters
> ifInEps : DELTA : {ifInErrors}
> ifOutEps : DELTA : {ifOutErrors}
> # Perform error to traffic percentage calculations
> ifInErrPct : MATH : ({ifInEps} / {ifInBps}) x 100
> ifOutErrPct : MATH : ({ifOutEps} / {ifOutBps}) x 100
> # Create an alias in a bracketed box, or nothing if alias is blank
> ifAliasBox : REGSUB : {ifAlias} /(\S+.*)/ [$1]/
>
> And the thresholds:
>
> # Create thresholds for all the error rate counters
> # oid name : color : limit : Error message
>
> ifInErrPct : yellow : 5 : {ifName}{ifAliasBox} - High input error rate ({ifInErrPct}%)
> ifInErrPct : red : 10 : {ifName}{ifAliasBox} - Very high input error rate ({ifInErrPct}%)
> ifOutErrPct : yellow : 5 : {ifName}{ifAliasBox} - High output error rate ({ifOutErrPct}%)
> ifOutErrPct : red : 10 : {ifName}{ifAliasBox} - Very high output error rate ({ifOutErrPct}%)
>
> So, all alarming is for Error packets per second / Bits per second (so,
> unfortunately, not a real percentage) averaged over the poll period.
>
> It may instead be better to make it 'Error packets per second' / ('Unicast
> pps' + 'Multicast pps' + 'Broadcast pps') to give a real percentage.
>
> However, your assertion that the table holds the total errors since counter
> reset is false (at least for most if_err tests).
>
> Feel free to send a patch ...
>
> > I know the if_err graphs are available under the "trends" column, but is it
> > possible to view this period for errors directly in the table within the
> > if_err column check?
>
> You can get the graphs on the if_err page as well by adding e.g.
> '--multigraphs=disk,if_load,if_err,temp' to CGI_SVC_OPTS in hobbitcgi.cfg or
> equivalent.
>
> Regards,
> Buchan
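
For anyone following the arithmetic, here is a rough Python sketch of what the quoted transforms compute, next to the "alert on any error" check discussed in this thread. It is only an illustration and not Devmon's implementation: the function names, the single-wrap handling and the example readings are all assumptions of mine. One thing it makes visible is that the raw {ifInErrors} counter is cumulative, so a "did anything change since the last poll" check has to compare two readings (as the DELTA transform does) rather than look at the raw value.

```python
from typing import Optional

# Assumption: the counters polled here are 32-bit SNMP counters (Counter32).
COUNTER32_MAX = 2 ** 32


def delta_per_second(previous: int, current: int, interval_s: float) -> float:
    """Per-second change of a counter between two polls (roughly a DELTA transform)."""
    if interval_s <= 0:
        raise ValueError("poll interval must be positive")
    diff = current - previous
    if diff < 0:
        # Assumption: a lower reading means the counter wrapped exactly once.
        # A device reset would look the same and would be miscounted here.
        diff += COUNTER32_MAX
    return diff / interval_s


def error_ratio_pct(err_per_s: float, bits_per_s: float) -> Optional[float]:
    """Errors/s divided by bits/s, times 100 (the quoted ifInErrPct calculation)."""
    if bits_per_s == 0:
        return None  # no traffic in the interval, so the ratio is undefined
    return err_per_s / bits_per_s * 100


def any_new_errors(previous: int, current: int) -> bool:
    """True if the error counter moved at all between two polls."""
    return current != previous


if __name__ == "__main__":
    # Hypothetical readings taken 300 seconds apart.
    in_eps = delta_per_second(1200, 1203, 300)             # error counter
    in_bps = delta_per_second(5_000_000, 8_000_000, 300) * 8  # octets -> bits
    print(in_eps, in_bps, error_ratio_pct(in_eps, in_bps))
    print(any_new_errors(1200, 1203))
```

With readings like these the ratio is tiny, nowhere near the 5 and 10 limits in the stock thresholds, while the simple "counter moved" check already fires. That is the difference between the two approaches being discussed.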