From: SourceForge.net <no...@so...> - 2006-01-10 21:19:50
Bugs item #1329754, was opened at 2005-10-18 14:01
Message generated for change (Comment added) made by dgp
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=1329754&group_id=10894

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: 27. Channel Types
Group: obsolete: 8.4.11
Status: Open
Resolution: None
Priority: 5
Submitted By: Don Porter (dgp)
Assigned to: Andreas Kupries (andreas_kupries)
>Summary: sockets lose data on Windows

Initial Comment:
Attached script is demo of the problem.

Start it in one shell window:

    tclsh sdtest.tcl server

to start a server running. Start it in a second window:

    tclsh sdtest.tcl client

to start a client of that server.

The server shell window should print a sequence of messages it
received from the client, starting with message 10 count down to 1.

This works fine on linux and solaris. On windows, the output does
not reach message 1, but craps out about 4 or 5 messages short of
all the data. I did Windows testing with the ActiveTcl 8.4.11.2 tclsh.

A minor change to the client part of the demo script, so that the
socket is explicitly made blocking *before* the final call to [flush],
and the bug is worked around and all data passes through on Windows.

This should not be required. Sockets should not lose data on any
platform.

----------------------------------------------------------------------

>Comment By: Don Porter (dgp)
Date: 2006-01-10 16:19

Message:
Logged In: YES
user_id=80530

did some more testing, and even in the case where the socket is
never made non-blocking, data can still be lost if the client side
does not perform an explicit [close].

Revised summary to reflect that non-blocking is not essential to
demo of the bug.

So why would an explicit [close] differ from the Tcl_Close() that
ought to be implicit in finalization?
----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2006-01-10 16:11

Message:
Logged In: YES
user_id=80530

looking at this again, it appears that what is required on the
client side to avoid data loss is *both* an [fconfigure -blocking 1]
and an explicit [close].

If the client side is left non-blocking, data is lost. If the [close]
command is not explicitly done, then the implicit close that should
happen during [exit] loses data too.

Note that it's all changes on the client side of the connection that
make a difference. Configuring the server side doesn't seem to play
a role at all, which suggests to me the problem is not with the read
side of things.

----------------------------------------------------------------------

Comment By: David Gravereaux (davygrvy)
Date: 2005-10-19 05:44

Message:
Logged In: YES
user_id=7549

There is an odd situation with the generic layer where, if the
read() operations caused by a given [gets] call consume EOF into
the generic layer, it ends up being the responsibility of the channel
driver to continue firing readable operations on the channel until it
is closed.

IMO, EOF had already been read into the generic layer, and given its
knowledge of EOF, shouldn't the channel driver's job be done regarding
notification? And shouldn't it be the generic layer's responsibility
to fire off readable instead? Honestly, this is quite inefficient when
the channel driver will never expect any more system notifications for
that socket and needs to manufacture them just for this situation.

I'm not sure if this relates, though.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2005-10-18 14:54

Message:
Logged In: YES
user_id=80530

same problem in the oldest ActiveTcl I found, 8.3.3 from April 2001.
Looks like flushing non-blocking sockets on Windows has just been
broken for a long, long time.
----------------------------------------------------------------------

Comment By: David Gravereaux (davygrvy)
Date: 2005-10-18 14:50

Message:
Logged In: YES
user_id=7549

I do not have any development tools to work on this today.
reassigning to another.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2005-10-18 14:49

Message:
Logged In: YES
user_id=80530

Same problem present in the Oct. 2002 ActiveTcl 8.4.0 release.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2005-10-18 14:43

Message:
Logged In: YES
user_id=80530

speculation appears to be false. ActiveTcl 8.4.7 has the same
problem, and that's before the 847693 changes happened.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2005-10-18 14:32

Message:
Logged In: YES
user_id=80530

Speculation this may be related to 947693 ?

----------------------------------------------------------------------
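
For readers without the attachment: the client-side workaround described in the thread (make the socket blocking before the final [flush], then [close] it explicitly instead of relying on [exit]) can be sketched roughly as follows. This is not the attached sdtest.tcl, which is not reproduced in this message; the proc name sendAll and its arguments are hypothetical, chosen only for illustration.

```tcl
# Hypothetical helper illustrating the workaround from the thread.
# It is NOT the attached sdtest.tcl; names and arguments are assumptions.
proc sendAll {host port messages} {
    set sock [socket $host $port]

    # Queue the messages on a non-blocking, fully buffered socket,
    # as a client in the reported scenario might do.
    fconfigure $sock -blocking 0 -buffering full
    foreach msg $messages {
        puts $sock $msg
    }

    # Workaround reported in the thread: switch to blocking mode
    # *before* the final flush, so flush returns only after the
    # buffered data has been handed to the operating system ...
    fconfigure $sock -blocking 1
    flush $sock

    # ... and close explicitly rather than relying on the implicit
    # close during [exit], which the thread reports also loses data.
    close $sock
}

# Example call (host and port are placeholders):
#   sendAll localhost 9999 {10 9 8 7 6 5 4 3 2 1}
```

Per the report, neither step should be necessary; the sketch only shows what currently avoids the data loss on Windows.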