I've learned a bit more about the problem.
First of all, it's not mbufs. I created a test version of tbd that
duplicates each packet many times, and recreated the error with just
a single user. The peak mbuf usage statistic didn't go up at all, much
less approach the system limit. In fact, it turns out that a single
burst of datagrams sent quickly is enough to trigger the problem. How
many datagrams it takes depends on the datagram size. If the
datagrams are very short, just over 128 of them can be queued in a
row before a send fails with ENOBUFS.
If not mbufs, it must be a limit on the outgoing queue. It would be
reasonable to have such a limit, since UDP doesn't otherwise have any
flow control. However, on amsat.org that limit seems to be annoyingly
low.
It seems to be a FreeBSD limitation, and as far as I can tell there's
no way to tune it. There is only so much buffer space available for
outgoing UDP datagrams.
setsockopt(2) can tune the receive buffer size using SO_RCVBUF.
There's even a corresponding-looking option called SO_SNDBUF.
However, that option doesn't tune the total buffer size at all; it
tunes the maximum size of a single outgoing datagram. I verified this
by experiment.
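
For reference, setting SO_SNDBUF and reading it back looks roughly
like the sketch below. The helper name probe_sndbuf() is made up for
illustration. This only shows how the option is set and queried; it
does not show what the kernel does with the value, and the number
read back may differ from the one requested on some systems.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a UDP socket, request a send buffer of
 * `requested` bytes, and return the value the kernel reports
 * afterwards. Returns -1 if any call fails.
 */
int probe_sndbuf(int requested)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    int actual = -1;
    socklen_t len = sizeof(actual);

    /* Set the option, then read it back to see what was accepted. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
                   &requested, sizeof(requested)) < 0 ||
        getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) < 0)
        actual = -1;

    close(fd);
    return actual;
}
```

Calling probe_sndbuf(32768) and printing the result is enough to see
whether the request was accepted as given.
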
sysctl(8) can tweak hundreds of system parameters, including one
called net.inet.udp.recvspace, which again controls receive
buffering. But despite some mentions on net forums, there does not
seem to be a parameter called net.inet.udp.sendspace, at least in
FreeBSD 4.8-RELEASE. One hypothesis I saw mentioned is that UDP
datagrams are queued directly into the hardware, so there's no
software buffer limit to tune.
A little test program that does nothing but set up a socket and send
a single burst of packets through it shows exactly the same behavior,
so it's nothing specific to tbd.
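
A burst test of that kind could look roughly like the sketch below.
This is not the actual test program, just an illustration of the
idea: send `count` datagrams back to back and tally how many fail
with ENOBUFS. The names send_burst() and struct burst_result, and the
1500-byte payload cap, are assumptions made for the example.

```c
#include <errno.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Tally of what happened to each datagram in the burst. */
struct burst_result {
    int sent;          /* sendto() succeeded */
    int enobufs;       /* sendto() failed with ENOBUFS */
    int other_errors;  /* sendto() failed with anything else */
};

/*
 * Hypothetical helper: send `count` datagrams of `payload_len` bytes
 * to ip:port with no pause between them. Returns 0 if the burst was
 * attempted, -1 if the socket or address could not be set up.
 */
int send_burst(const char *ip, unsigned short port, int count,
               size_t payload_len, struct burst_result *res)
{
    static char payload[1500];          /* zero-filled dummy data */
    struct sockaddr_in dst;

    memset(res, 0, sizeof(*res));
    if (payload_len > sizeof(payload))
        payload_len = sizeof(payload);

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    for (int i = 0; i < count; i++) {
        ssize_t n = sendto(fd, payload, payload_len, 0,
                           (struct sockaddr *)&dst, sizeof(dst));
        if (n >= 0)
            res->sent++;
        else if (errno == ENOBUFS)
            res->enobufs++;
        else
            res->other_errors++;
    }

    close(fd);
    return 0;
}
```

Whether any ENOBUFS errors show up at all will depend on the system;
some kernels block or silently drop instead of returning an error.
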
I think there's a possible solution. Send2() could simply retry
datagrams that get ENOBUFS. I'm not sure whether a full-speed spin
loop or a loop that calls nanosleep(2) between attempts would be
best. And I'm
worried that bigger architectural changes will be needed to recover
from the case where network bandwidth really is the limit (a new
voice frame arrives before the old one has been completely sent out).
I'm not sure what currently happens under those conditions.
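
The retry idea could be sketched as below. This is not Send2()
itself, whose real signature isn't shown here; sendto_retry() and its
max_retries and sleep_ns parameters are hypothetical. Passing
sleep_ns = 0 gives the full-speed spin variant, and a positive value
(below one second) gives the nanosleep(2) variant. The retry cap is
an assumption added so the loop can't hang forever when bandwidth
really is the limit.

```c
#include <errno.h>
#include <time.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical wrapper around sendto(): retry while the send fails
 * with ENOBUFS, up to max_retries extra attempts. Any other error,
 * or running out of retries, returns -1 with errno left as set by
 * the last sendto(). sleep_ns must be in the range 0..999999999.
 */
ssize_t sendto_retry(int fd, const void *buf, size_t len,
                     const struct sockaddr *to, socklen_t tolen,
                     int max_retries, long sleep_ns)
{
    struct timespec pause = { 0, sleep_ns };

    for (int attempt = 0; ; attempt++) {
        ssize_t n = sendto(fd, buf, len, 0, to, tolen);

        /* Done on success, on a non-ENOBUFS error, or when the
         * retry budget is used up. */
        if (n >= 0 || errno != ENOBUFS || attempt >= max_retries)
            return n;

        /* Optionally back off before the next attempt. */
        if (sleep_ns > 0)
            nanosleep(&pause, NULL);
    }
}
```

Note that the actual sleep may be much longer than sleep_ns, since
the kernel rounds up to its timer resolution, so on a 20 ms frame
budget even a single short sleep could be significant.
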
If I set my test program to send a burst of 380 datagrams (42-byte
payload, which is about what 16 kbps works out to) without any
waiting, about 2/3 of them succeed and it takes about 20 ms (that's a
voice frame time if I'm not mistaken). So around 250 is going to be an upper
bound on how many connections can be supported on this system before
packets start to drop. I could live with that.
Sending those test packets through the net via my cable modem and
Linksys router to my Powermac G5, only about 150 actually arrive as
logged by Ethereal. Not sure what happened to the others. That part
isn't a very realistic test, of course.
I'm out of time for playing with this until probably Monday.
Any better ideas would be welcome.
73 -Paul
kb5mu@...