Re: [Planetlab-users] sysctl variables
From: Neil S. <ns...@cs...> - 2003-11-15 22:42:02
On Nov 14, 2003, at 6:12 AM, Larry Peterson wrote:

> In fact, as a generalization of this issue, some people would
> like to run alternative versions of TCP that are tuned for
> high-speed networks (e.g., FAST). One way of doing this, which
> you do as well, is run a user-level version of TCP. Then you
> can tweak it as much as you want. Of course, there's still going
> to be a limit on how much shared bandwidth is available, but if
> you selectively use nodes connected by fat pipes (e.g., the
> Abilene nodes) you might be able to do what you want.

Larry, (or anyone well-versed in the raw sockets code)

As Manpreet's last mail suggests he's trying this approach, do you think a user-level TCP is sufficient to inflate buffer sizes? I understand that keeping retransmissions in user memory removes the need for a send buffer, but I'm worried about the fake receive buffer. If the raw socket kernel->application buffer is limited to the same size as the TCP buffer, then advertising a large receive window (now possible with the user-level TCP) without being able to provide it would lead to packet loss when the machine is busy, exactly what flow control is designed to avoid.

Assuming I'm right about how raw socket buffers work, I'm wondering if a hybrid approach, inspired a bit by Semke, Mahdavi, and Mathis's SIGCOMM '98 paper on buffer autotuning, would work well -- raise the receive buffer size through the roof, and let the user TCP manipulate the send window. (In their paper, they focus on tuning the send buffer so as many connections as possible make good progress, but they leave the receive buffers as large as possible, expecting that congestion control will limit actual buffer consumption.)

I don't know if a large kernel->user buffer is easy to configure for the raw socket code (or if it is already there and my concern is unwarranted).

-neil
musing that HZ=1000 might help user-level TCPs stay responsive on java-cpu-bound machines.
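To make the receive-buffer concern above concrete, here is a minimal Python sketch of how a user-level TCP could check what the kernel actually grants before choosing a window to advertise. The helper names (request_rcvbuf, bdp_bytes) and the 100 Mbit/s, 100 ms path are made up for illustration. A UDP socket stands in for the raw socket because SOCK_RAW needs root, and whether the PlanetLab raw socket code honors SO_RCVBUF the same way is an assumption, not something established in this thread.

```python
import socket


def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes: the window needed to fill the path."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def request_rcvbuf(sock, nbytes):
    """Ask the kernel for an nbytes receive buffer and return what it reports.

    On Linux the request is capped by the net.core.rmem_max sysctl, and the
    value read back is doubled to cover kernel bookkeeping overhead.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    # Hypothetical path: 100 Mbit/s with a 100 ms round-trip time.
    wanted = bdp_bytes(100_000_000, 100)

    # Stand-in for the raw socket a user-level TCP would read from.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        granted = request_rcvbuf(sock, wanted)
    finally:
        sock.close()

    print("window needed to fill the path:", wanted, "bytes")
    print("receive buffer reported by kernel:", granted, "bytes")
    if granted < wanted:
        print("kernel buffer is smaller than the window; advertising the "
              "full window risks drops when the process is not scheduled")
```

If the reported buffer is well below the bandwidth-delay product, the user-level TCP either has to advertise a smaller window or the system limit has to be raised first.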