From: Ivan B. <iv...@cs...> - 2006-05-02 05:15:23
Hi Markus,

Thanks for trying the protocol and for the good discussion. My replies are interleaved below.

On 5/1/06, Markus Koetter <koe...@gm...> wrote:
> Hi,
>
> I compiled vfer current cvs on a debian & gentoo host in my lan, and hit
> some strange borders.
>
> both boxes are single cpu boxes, so smp/ht here.
>
> I was able to transfer a 1mb file at something like 22mbit, larger files
> like 100mb or 300mb would either deadlock the server, the client,
> or interrupt with an 'error', the server claims the client is guilty,
> the client disconnects with an error.
>
> I already profiled the code, and checked where to find the deadlocks,
> (not how to defeat them), so if these problems are not known, drop me a
> line and I'll followup with a more complete bugreport.

I'm working on getting rid of these before the next release, which will come out this week. It's a known issue, but it has less to do with deadlocks and more to do with the logic of the protocol. I've observed deadlock scenarios in which, for example, RTT estimation fails to approximate a reasonable number, and the two endpoints fall out of sync and take an effectively infinite time to complete. Packets are still streaming (visible if you use the -v option), but data packets aren't being sent. If you turn off the congestion control, transfers complete but with intermediate congestion collapse.

> From my point of view the whole threading used for the socketio is far
> to complex for the task itself, incomplete and therefore pretty error
> prone, for example

Yes, this is definitely a handicap of the implementation, but it also has some benefits which I won't go into. We're hoping to find a student for Google's Summer of Code 2006 to experiment with non-threaded alternatives and take performance measurements.

>
> grep pthread_mutex_trylock * -R | wc
> 0 0 0

The impl only uses blocking mutex acquisitions.
The short story is that there are two sets of mutexes: those that control the bins that receive packets, and those that control access to the sockets array. Receiving packets into and reading from the same bin makes more sense with a blocking mutex, and socket array access likewise doesn't benefit from a non-blocking trylock call. The sockets array mutexes haven't been tested in full, since one socket suffices to find the important bugs right now. The bin mutexes have been thoroughly tested.

>
> no checking for *possible* deadlocks
>
> grep pthread_cond_ * -R
> src/api.c: pthread_cond_init(&skt->cond_readable, NULL);
> src/globals.h: pthread_cond_t cond_readable;
> src/globals.h: pthread_cond_t cond_writeable;
> src/globals.h: pthread_cond_t cond_exception;

None of these are actually being used. They are there for the vfer_select() call, which is not fully implemented right now. Condition variables are not used anywhere in the active code.

> src/packet.h:// pthread_cond_t cond;

This one is commented out and is part of an older 'vision'.

> my approach was to rewrite the socket io threadless and nonblocking, as
> the whole protocol implementation is useless (as it can neither be
> measured nor improved), if the socketio is unreliable due to threading
> problems.

As I said above, hopefully we can experiment this summer.

Thanks very much!

ivan.
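
For readers following the thread, here is a minimal sketch of the bin locking described above: one thread deposits packets into a bin while another drains it, and both sides take the bin's mutex with a blocking pthread_mutex_lock() rather than pthread_mutex_trylock(). This is not the vfer code. The names pkt_bin, bin_put, bin_get, receiver and run_demo, the ring-buffer layout, and the capacity are all assumptions made for illustration. No condition variables are used, in line with the message above, so a side that finds the bin full or empty simply yields and retries.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define BIN_CAP   64    /* hypothetical bin capacity */
#define N_PACKETS 1000  /* packets pushed through by the demo */

/* Hypothetical packet bin: a ring buffer guarded by a single mutex. */
struct pkt_bin {
    pthread_mutex_t lock;
    int slots[BIN_CAP];
    size_t head, count;
};

/* Blocking acquire: the caller sleeps until the bin's mutex is free.
 * Returns 1 if the packet was stored, 0 if the bin was full. */
static int bin_put(struct pkt_bin *b, int pkt)
{
    int ok = 0;
    pthread_mutex_lock(&b->lock);
    if (b->count < BIN_CAP) {
        b->slots[(b->head + b->count) % BIN_CAP] = pkt;
        b->count++;
        ok = 1;
    }
    pthread_mutex_unlock(&b->lock);
    return ok;
}

/* Returns 1 and writes the oldest packet to *pkt, or 0 if the bin was empty. */
static int bin_get(struct pkt_bin *b, int *pkt)
{
    int ok = 0;
    pthread_mutex_lock(&b->lock);
    if (b->count > 0) {
        *pkt = b->slots[b->head];
        b->head = (b->head + 1) % BIN_CAP;
        b->count--;
        ok = 1;
    }
    pthread_mutex_unlock(&b->lock);
    return ok;
}

/* Stand-in for the thread that receives packets off the wire. */
static void *receiver(void *arg)
{
    struct pkt_bin *b = arg;
    for (int i = 0; i < N_PACKETS; i++)
        while (!bin_put(b, i))
            sched_yield();  /* bin full: let the reader drain it */
    return NULL;
}

/* Runs one receiver thread against a reader on the calling thread.
 * Returns the number of packets read, or -1 if the thread could not start. */
static int run_demo(void)
{
    struct pkt_bin b = { .head = 0, .count = 0 };
    pthread_t t;
    int got = 0, pkt;

    pthread_mutex_init(&b.lock, NULL);
    if (pthread_create(&t, NULL, receiver, &b) != 0)
        return -1;
    while (got < N_PACKETS) {
        if (bin_get(&b, &pkt)) {
            assert(pkt == got);  /* one producer, FIFO bin: order is preserved */
            got++;
        } else {
            sched_yield();       /* bin empty: let the receiver fill it */
        }
    }
    pthread_join(t, NULL);
    pthread_mutex_destroy(&b.lock);
    return got;
}
```

Calling run_demo() from a main() and building with -pthread should return N_PACKETS. Each critical section is a few array operations, so a blocked caller waits only briefly; a trylock here would just turn the wait into a retry loop, which is presumably the point being made about the bins.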