I have my Internet connection running through a tunnel over the @Home
network (i.e. a cable modem at my place and another cable modem at the
location where my network is routed to an ISDN termination). It has
been OK, and I guess quite good given the price one pays for such
service compared to what equivalent bandwidth on a telco local loop
would cost (probably over $2500/month in my case). However some days,
and perhaps on rainy days especially, it's really bad. The latency
stays quite low (under 100ms average if you first toss the exceptions
that appear just after the connection recovers), but the packet loss
goes up well over 15% and every time you go to type something you face
major hesitation to the extent that it feels as if a 28.8kbps modem
would be faster again! Here's an example from a normal one-packet-
per-second ping running overnight (to the gateway on the other remote
cable modem):
57683 packets transmitted, 46032 packets received, +8 duplicates, 20.2% packet loss
round-trip min/avg/max/stddev = 13.888/283.061/182622.441/4992.750 ms
(I'm not sure I trust the max & stddev, or even the packet counters for
that matter because more than 64000 seconds (but not more than 65536)
have elapsed since that ping was started.)
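As a quick sanity check on those counters, the loss percentage can be recomputed from the transmitted and received counts. This is only a sketch: loss_percent is a hypothetical helper name, and it assumes ping derives the figure as (transmitted - received) / transmitted with duplicates left out.

```shell
#!/bin/sh
# loss_percent TRANSMITTED RECEIVED
# Prints the percentage of packets lost, to one decimal place.
# Assumes TRANSMITTED is greater than zero (hypothetical helper).
loss_percent() {
    awk -v tx="$1" -v rx="$2" 'BEGIN { printf("%.1f\n", (tx - rx) * 100 / tx) }'
}

# Counts taken from the overnight ping summary above.
loss_percent 57683 46032    # prints 20.2
```

The result agrees with the 20.2% that ping reported, so at least the two packet counters are consistent with the loss figure.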
So I did some experimentation with various versions of ping and found
that I could get a measurement that was a reasonable approximation of a
continuous ping, but at a much lower cost, by sending about 10 packets
in a quick burst every minute (i.e. in Cisco ping style where you send
another packet immediately after the previous reply has been received).
Now I'm not a statistician by trade (and in fact know relatively little
about statistics in detail) but it seems to me that what I've observed,
at least in the packet loss graphs over the past 16 hours or so of data
collecting, matches my end user experiences. The RTT graphs may not be
quite so useful or accurate unless you can be 100% sure that in-transit
ICMP packets are given equal priority by all intermediate routers and
that your ping target is either extremely lightly loaded or can give
some assurance that it'll give ICMP echo requests equal priority to
other traffic and internal tasks (i.e. don't ping a Cisco unless you
want to mix your network measurements with measuring its load level).
Anyway I thought I'd post this in hopes of getting any feedback. I
consult to another cable company that's using the same Terayon modems
I've been given by my local provider and I think they'd be interested in
seeing reliable packet loss measurements to various parts of their cable
plant (though I don't know that they can afford to deploy devices to be
used strictly for measuring upper-level traffic integrity).
I'm also wondering if anyone knows any tricks that would allow me to
display new finer-grained graphs to represent the 60-second samples,
and also whether or not the 5-minute intervals in the hourly graph are
accurately representing the averages for the one minute samples. Please
post any comments or suggestions for discussion.
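To make the averaging question concrete, here is a small sketch of what a plain mean over five one-minute loss samples does to a short burst. It assumes the 5-minute points are simple arithmetic means of five complete one-minute samples (which is my understanding of an AVERAGE consolidation, not something I have verified against the graphs); the sample values are made up and average is a hypothetical helper.

```shell
#!/bin/sh
# average VALUE...
# Prints the arithmetic mean of its arguments to one decimal place
# (hypothetical helper, for illustration only).
average() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf("%.1f\n", sum / NR) }'
}

# Five made-up one-minute loss samples: one bad minute among four clean ones.
average 0 0 50 0 0    # prints 10.0
```

Under that assumption a single minute at 50% loss shows up as only 10% in the 5-minute point, which is why the coarser graph can look much milder than the link feels.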
You need to put this config file (and another listing a set of targets
naming the devices you want to ping) in a separate config tree that you
run the collector over once per minute (as opposed to once every 5
minutes). I've done this by defining a separate set of directories in
the subtree-sets file, called "fine", and invoking collect-subtrees
separately from cron like this:
#minute hour mday month wday command
#
*/5 * * * * /usr/cricket/bin/collect-subtrees normal
* * * * * /usr/cricket/bin/collect-subtrees fine
Note that you have to explicitly name the "normal" set else it will
collect all sets by default (at least in 1.0.2).
Note also that this config adds new "rra" entries and will produce
larger RRD files.
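For anyone adjusting the "rra" entries, the resolution and time span of each one follow from the poll interval, the step count and the row count. The sketch below assumes the base step equals the 60-second rrd-poll-interval; rra_span is a hypothetical helper, not part of Cricket or RRDtool.

```shell
#!/bin/sh
# rra_span STEP STEPS ROWS
# STEP  = base poll interval in seconds
# STEPS = number of base intervals consolidated into one stored point
# ROWS  = number of points kept
# Prints seconds per stored point and the total span in hours.
rra_span() {
    awk -v step="$1" -v steps="$2" -v rows="$3" 'BEGIN {
        per_point = step * steps
        printf("%d s/point, %.1f hours\n", per_point, per_point * rows / 3600)
    }'
}

rra_span 60 1 3000    # prints 60 s/point, 50.0 hours
rra_span 60 5 600     # prints 300 s/point, 50.0 hours
rra_span 60 30 600    # prints 1800 s/point, 300.0 hours
```

Note that the step count is a multiple of the poll interval, so a value copied from a 5-minute config covers only one fifth of the time when the interval is 60 seconds.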
The script following this file goes in util/rtt and you need to
carefully read the comments about various versions of ping in that
script. I strongly recommend you get Eric Wassenaar's version with the
'-F' option, especially if you want to measure more than one device.
# net-rtt sub-tree
#
# This is where we collect stats on the reachability of various remote hosts
Target --default--
    directory-desc = "Miscellaneous Network Statistics"
    short-desc = "Network Response"
    long-desc = "Network RTT and packet loss statistics."
    # WARNING: make sure you collect this sub-tree every 60 seconds!
    rrd-poll-interval = 60
    # you'll probably want to change this... unless you have
    # Cricket in ~/cricket and your config tree in ~/cricket-config
    util-dir = %auto-base%/../util
    target-type = net-monitor
    host = %auto-target-name%
    # try tromping on these:
    snmp-host = ""
    snmp-uptime = ""

datasource minRTT
    ds-source = "exec:0:%util-dir%/rtt %host%"
    desc = "The minimum round trip time for packets (normally 10
            packets sent per monitoring interval of
            %rrd-poll-interval% seconds)"
    rrd-ds-type = GAUGE

datasource avgRTT
    ds-source = "exec:1:%util-dir%/rtt %host%"
    desc = "The average round trip time for packets (normally 10
            packets sent per monitoring interval of
            %rrd-poll-interval% seconds)"
    rrd-ds-type = GAUGE

datasource maxRTT
    ds-source = "exec:2:%util-dir%/rtt %host%"
    desc = "The maximum round trip time for packets (normally 10
            packets sent per monitoring interval of
            %rrd-poll-interval% seconds)"
    rrd-ds-type = GAUGE

datasource percentLoss
    ds-source = "exec:3:%util-dir%/rtt %host%"
    desc = "The percent of packet loss (normally 10 packets sent
            per monitoring interval of %rrd-poll-interval%
            seconds)"
    rrd-ds-type = GAUGE

# Round Robin Array definitions for RRDtool...
#
# The RRA dictionary specifies the config of the datafiles on disk.
# The second field (0.5, below) is NEW as of RRD 0.99. It
# is the xfiles-factor, which used to be in the target
# dictionary.
# one point per 1 minute, spanning 50 hours
rra fine1minAve AVERAGE:0.5:1:3000
# one point per 5 minutes, spanning 50 hours
rra fine5minAve AVERAGE:0.5:5:600
# (step counts are multiples of the 60-second poll interval)
# one point per 30 minutes, spanning 12.5 days
rra fine30minAve AVERAGE:0.5:30:600
# one point every 2 hours, spanning 50 days
rra fine2hrAve AVERAGE:0.5:120:600
rra fine2hrMax MAX:0.5:120:600
# one point every day, spanning 600 days
rra fine1dayAve AVERAGE:0.5:1440:600
rra fine1dayMax MAX:0.5:1440:600
# we will add datasources to each specific target-type later
targetType --default--
    rra = "fine1minAve, fine5minAve, fine30minAve, fine2hrAve, fine2hrMax, fine1dayAve, fine1dayMax"

targetType net-monitor
    ds = "percentLoss, minRTT, avgRTT, maxRTT"
    view = "PercentLoss: percentLoss,
            RTT: minRTT avgRTT maxRTT"

graph minRTT
    units = "ms"
    draw-as = LINE1
    y-axis = "milliseconds (ms)"
    y-min = 0
    legend = "Minimum packet Round Trip Time"

graph avgRTT
    units = "ms"
    draw-as = LINE1
    y-axis = "milliseconds (ms)"
    y-min = 0
    legend = "Average packet Round Trip Time"

graph maxRTT
    units = "ms"
    draw-as = LINE1
    y-axis = "milliseconds (ms)"
    y-min = 0
    legend = "Maximum packet Round Trip Time"

graph percentLoss
    units = "%"
    color = red
    draw-as = AREA
    legend = "Percent Packet Loss"
    y-axis = "Percent (%)"
    # fixed y-axis, since this is a percentage
    y-min = 0
    y-max = 100

====================
#! /bin/sh
#
# rtt.sh - collect round-trip-time stats
#
#ident "@(#)cricket:$Name$:$Id$"
# Usage:
#
# rtt host|IP#
#
# This script munges the output of ping
# into a format suitable for Cricket's EXEC function.
#
# In order to measure the latency and loss of a network connection it
# is important to choose a remote host that does not itself suffer
# lack of resources and which will not artificially demonstrate high
# latency or packet loss when it is under load. Cisco (and probably
# most other sophisticated) routers are very bad choices for this
# reason (unless they are indeed almost 100% idle).
#
# the output consists of:
#
# line value description
#
# 0 1.679 minimum rtt in ms
# 1 1.679 average rtt in ms
# 2 1.679 maximum rtt in ms
# 3 0.0 percent packet loss
# Select an appropriate version of ping and give any default options
# Example for Eric Wassenaar's ping
# (available from ftp://ftp.nikhef.nl/pub/network/ping.tar.Z)
#
# This is the most preferred version of ping because of the '-F' option.
#
# Take care to set the timeout in ping to the lowest value that makes
# sense. If you're on a fast, stably routed network, and the routers
# don't have much buffering and there aren't many hops then one second
# is more than long enough. If you're on a busy PPP link you might
# need to increase it to as much as 5 seconds.
#
# (FIXME: the ping timeout should be settable on the command line.)
#
# Don't do DNS lookups if they'll take too much time.
#
# Example stats display:
#
# 10 packets transmitted, 10 packets received, 0% packet loss
# round-trip (ms) min/avg/max = 226/257/329 (std = 27.1)
#
if [ "$(hostname)" = "becoming" ] ; then
    #
    # because some of the networks we ping are not routed through
    # the tunnel, but rather are NAT'ed directly onto the provider
    # network we must be careful to only generate ICMP from the
    # same host all the time else various monitoring systems will
    # clash in the NAT table (ICMP can only be mapped between one
    # inside host and one outside host at a time)....
    #
    # I'll worry about the potential security issues later! ;-)
    #
    PING="rsh -l woods proven /usr/local/sbin/ping -n -F -t 1"
else
    PING="/usr/local/sbin/ping -n -F -t 1"
fi
PINGRTTMINFIELD=3
PINGRTTAVGFIELD=4
PINGRTTMAXFIELD=5
PINGPACKETOPTS="END"
# Example for BSD (and Linux since most distribs borrow the BSD one)
#
# Most versions of BSD ping don't have a "fast" option so be careful
# how many hosts you graph with this (or reduce the PKCOUNT below and
# suffer from lower-resolution loss values)! If you are in a highly
# trusted environment and you can run cricket as root then you could
# add the '-i 0.01' option to simulate flood ping speeds....
#
# Example of the new NetBSD version stats display:
#
# 5 packets transmitted, 5 packets received, 0.0% packet loss
# round-trip min/avg/max/stddev = 14.757/45.524/795.846/29.792 ms
#
#PING="/sbin/ping -n -w 1"
#PINGRTTMINFIELD=4
#PINGRTTAVGFIELD=5
#PINGRTTMAXFIELD=6
#PINGPACKETOPTS="FLAGS"
#
# Example of the old standard BSD version stats display:
#
# 11 packets transmitted, 10 packets received, 9% packet loss
# round-trip min/avg/max = 80.919/101.680/142.769 ms
#
#PING="/sbin/ping -n -w 1"
#PINGRTTMINFIELD=3
#PINGRTTAVGFIELD=4
#PINGRTTMAXFIELD=5
#PINGPACKETOPTS="FLAGS"
# Example for Solaris
#
# Solaris ping doesn't have a "fast" option so be careful how many
# hosts you graph with this! Get Eric Wassenaar's ping and use it!
#
# Example stats display:
#
# 5 packets transmitted, 5 packets received, 0% packet loss
# round-trip (ms) min/avg/max = 68/72/80
#
#PING="/usr/sbin/ping -n"
#PINGRTTMINFIELD=3
#PINGRTTAVGFIELD=4
#PINGRTTMAXFIELD=5
#PINGPACKETOPTS="END"
# pick your fastest, smallest, awk:
#
#AWK=awk
#AWK=nawk
#AWK=gawk
AWK=mawk
# FIXME: these should be settable on the command line...
#
# Note: 10 is a good number for %loss calc, and 56-byte data segments
# match the default ping packet (for a total of 64 bytes). We have to
# specify both because ancient versions of ping, such as the one
# Solaris still uses, don't have '-c' and '-s', but rather have
# parameters [size [count]] after the hostname....
#
PKCOUNT=10
PKSIZE=56
argv0=`basename $0`

if [ $# -ne 1 ] ; then
    echo "Usage: $argv0 host|IP"
    exit 2
fi
HOST=$1
#
case $PINGPACKETOPTS in
FLAGS)
    PING_CMD="$PING -c $PKCOUNT -s $PKSIZE $HOST"
    ;;
END)
    PING_CMD="$PING $HOST $PKSIZE $PKCOUNT"
    ;;
*)
    echo "$argv0: internal configuration error!" 1>&2
    exit 1
    ;;
esac

$PING_CMD 2>/dev/null |
    $AWK '
        BEGIN {
            FS = "/";
            minrtt = "";
            avgrtt = "";
            maxrtt = "";
            prcntloss = "";
        }
        /packet loss/ {
            # the loss figure is the last comma-separated item on the line
            narr = split($0, lossarr, ",");
            prcntloss = lossarr[narr];
            sub(/% packet loss.*$/, "", prcntloss);
            gsub(/[ \t]/, "", prcntloss);
        }
        NF > 1 && $1 ~ /round-trip/ {
            minrtt = $'${PINGRTTMINFIELD}';
            sub(/[^0-9.]* = /, "", minrtt);
            avgrtt = $'${PINGRTTAVGFIELD}';
            maxrtt = $'${PINGRTTMAXFIELD}';
            sub(/ .*$/, "", maxrtt);
        }
        END {
            # report "U" (unknown) for anything ping did not tell us
            if (minrtt == "") {
                minrtt = "U";
            }
            if (avgrtt == "") {
                avgrtt = "U";
            }
            if (maxrtt == "") {
                maxrtt = "U";
            }
            if (prcntloss == "") {
                prcntloss = "U";
            }
            printf("%s\t%s\n", minrtt, "minimum rtt in ms");
            printf("%s\t%s\n", avgrtt, "average rtt in ms");
            printf("%s\t%s\n", maxrtt, "maximum rtt in ms");
            printf("%s\t%s\n", prcntloss, "percent packet loss");
        }
    '
exit 0
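
The awk stage can be tried without touching the network by feeding it a canned summary. The following is a stand-alone sketch rather than part of the script: parse_ping_summary is a hypothetical function name, the field numbers are passed as arguments instead of being spliced in from shell variables, and the sample text is the Wassenaar-style stats display quoted in the comments.

```shell
#!/bin/sh
# parse_ping_summary MINFIELD AVGFIELD MAXFIELD
# Reads ping output on stdin and prints min, avg and max RTT and the
# percent loss, one value per line, with "U" for anything missing.
parse_ping_summary() {
    awk -F/ -v minf="$1" -v avgf="$2" -v maxf="$3" '
    /packet loss/ {
        n = split($0, part, ",")
        loss = part[n]
        sub(/% packet loss.*$/, "", loss)
        gsub(/[ \t]/, "", loss)
    }
    NF > 1 && $1 ~ /round-trip/ {
        min = $minf; sub(/[^0-9.]* = /, "", min)
        avg = $avgf
        max = $maxf; sub(/ .*$/, "", max)
    }
    END {
        print (min == "" ? "U" : min)
        print (avg == "" ? "U" : avg)
        print (max == "" ? "U" : max)
        print (loss == "" ? "U" : loss)
    }'
}

# Sample in the Wassenaar layout (fields 3, 4 and 5 when split on "/").
printf '%s\n' \
    '10 packets transmitted, 10 packets received, 0% packet loss' \
    'round-trip (ms) min/avg/max = 226/257/329 (std = 27.1)' |
    parse_ping_summary 3 4 5
```

This prints 226, 257, 329 and 0 on separate lines, which is the line order the exec:0 through exec:3 datasources expect.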
--
Greg A. Woods
+1 416 218-0098 VE3TCP <gwoods@...> <robohack!woods>
Planix, Inc. <woods@...>; Secrets of the Weird <woods@...>