From: <no...@so...> - 2001-09-28 17:55:31
|
Bugs item #465765, was opened at 2001-09-27 12:10 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=110894&aid=465765&group_id=10894 Category: 24. Channel System Group: 8.3.3 Status: Open Resolution: None Priority: 7 Submitted By: Dale Talcott (daletalcott) Assigned to: Andreas Kupries (andreas_kupries) Summary: zero-length write considered harmful Initial Comment: Tcl 8.3.3, OS Platform AIX 4.3.3 There is an unfortunate interaction between tcl and Unix systems that implement the Single Unix specification. In particular, the Single Unix specification for writing to/reading from a STREAM device requires that a zero-length write on one side show up as a zero-length read on the other. That is, a zero-length write gets interpreted as EOF when read. http://www.opengroup.org/onlinepubs/007908799/ xsh/write.html http://www.opengroup.org/onlinepubs/007908799/ xsh/read.html This is a problem for tcl because when tcl flushes its channel buffers, if there is nothing in the buffer, tcl ends up performing a zero-length write. If the channel happens to be connected to a STREAMS file, this appears as EOF to the other side of the STREAM. This causes grief for several programs, including script, sshd, and expect. It has the interesting effect that, if you are logged on via SSH and run the tcl test suite, you get logged off! during io.test. Fortunately, there are not that many OS's that implement pseudo-tty's on STREAMS, but IBM's AIX 4.3.3 is one. One can argue that the spec is wrong for changing what used to be a harmless action (writing nothing) into something that now has a meaning. However, this is probably a losing battle. I suggest that tcl be modified to avoid zero-length writes. The following patch changes the Unix FileOutputProc to avoid zero-length writes: --- unix/DRT/tclUnixChan.c Tue Apr 3 17:54:39 2001 +++ unix/tclUnixChan.c Thu Sep 27 14:04:58 2001 @@ -423,6 +423,8 @@ int written; *errorCodePtr = 0; + if (toWrite == 0) + return 0; /* avoid zero-length write */ written = write(fsPtr->fd, buf, (size_t) toWrite); if (written > -1) { return written; This has been tested only to the extent that the tcl tests now run on AIX without logging me off. ---------------------------------------------------------------------- >Comment By: Andreas Kupries (andreas_kupries) Date: 2001-09-28 10:55 Message: Logged In: YES user_id=75003 Regarding Don's question: Correct, any patch to avoid zero- length writes (either on all channels or possibly streams- based ones = pipes, sockets) will change the given statement into a full no-op. ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2001-09-28 10:53 Message: Logged In: YES user_id=75003 with regard to my own patch the handling of errorCode is this: In FlushChannel the errorCode is initialized to zero, but not in Tcl_WriteRaw. In both cases it won't be used for 'written == 0'. Question: Is it possible to query the OS if a there is a STREAM behind a file descriptor ? If yes avoidance of zero- length writes could be restricted to this specific situation. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2001-09-28 10:51 Message: Logged In: YES user_id=80530 I sometimes use [catch {puts -nonewline $socket ""}] as a way to test that a socket connection is valid without actually writing to it. It sounds like this patch will turn that command into a no-op? ---------------------------------------------------------------------- Comment By: Jeffrey Hobbs (hobbs) Date: 2001-09-28 10:34 Message: Logged In: YES user_id=72656 The enclosed patch is definitely not correct. If you look at the TcpOutputProc in tclUnixChan.c, you'll see that it is possible for errorCode to be set even before a write is attempted. Also, errorCode will be used when written == 0, but it was never set. Thus, at least errorCode must also be set, but perhaps we are still overlooking possible valid side-effects (like acceptable errors returned with write 0). ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2001-09-28 10:28 Message: Logged In: YES user_id=75003 Enclosed is the patch I made. Jeff pointed out that there are possibly subtle problems with the patch as is: When writing to an OS channel an error like EPIPE can be returned. The specification and the manpage are not clear if this can happen for a zero-length write too. If that is so the patch changes the tcl level semantics of something like [puts - nonewline $chan ""]. Before the patch this command can fail, afterwards it won't (assuming a valid $chan). This is something any patch to avoid zero-length writes can run into. Opinions ? ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2001-09-28 10:21 Message: Logged In: YES user_id=75003 The seg.fault goes away with the fix for bug [#465494] "I/O segfaults in test suite". IOW, it was the same bug, as I suspected. ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2001-09-27 17:03 Message: Logged In: YES user_id=75003 Don, after doing a simple patch to avoid 0-write's I get segfaults on the AS linux box too. I will see later if the same happens on sparky and my system at home. Stacktrace: 0 0x40095e8b in chunk_alloc (ar_ptr=0x4012ad40, nb=32) at malloc.c:2875 #1 0x400955ae in __libc_malloc (bytes=21) at malloc.c:2696 #2 0x80f50c9 in TclpAlloc (nbytes=21) at ../tcl/unix/../generic/tclAlloc.c:672 #3 0x80686dc in Tcl_Alloc (size=21) at ../tcl/unix/../generic/tclCkalloc.c:983 #4 0x80a33ec in AllocStringEntry (tablePtr=0x8160958, keyPtr=0x8158460) at ../tcl/unix/../generic/tclHash.c:927 #5 0x80a2d28 in Tcl_CreateHashEntry (tablePtr=0x8160958, key=0x8158460 "http", newPtr=0xbfffaf4c) at ../tcl/unix/../generic/tclHash.c:450 #6 0x80d7dde in TclSetElementOfIndexedArray (interp=0x810fa78, localIndex=3, elemPtr=0x818e848, newValuePtr=0x819ca98, flags=512) at ../tcl/unix/../generic/tclVar.c:1746 #7 0x808f68a in TclExecuteByteCode (interp=0x810fa78, codePtr=0x816cde8) at ../tcl/unix/../generic/tclExecute.c:1251 #8 0x806479a in Tcl_EvalObjEx (interp=0x810fa78, objPtr=0x81952c8, flags=0) at ../tcl/unix/../generic/tclBasic.c:2951 #9 0x80cb139 in TclObjInterpProc (clientData=0x8163eb0, interp=0x810fa78, objc=2, objv=0x8112080) at ../tcl/unix/../generic/tclProc.c:1075 #10 0x808d968 in TclExecuteByteCode (interp=0x810fa78, codePtr=0x81990a8) at ../tcl/unix/../generic/tclExecute.c:869 #11 0x806479a in Tcl_EvalObjEx (interp=0x810fa78, objPtr=0x81a2f08, flags=131072) at ../tcl/unix/../generic/tclBasic.c:2951 #12 0x80ad988 in TclChannelEventScriptInvoker (clientData=0x81a96c0, mask=2) at ../tcl/unix/../generic/tclIO.c:7084 #13 0x80ad319 in Tcl_NotifyChannel (channel=0x819a3e8, mask=2) at ../tcl/unix/../generic/tclIO.c:6620 #14 0x80e4452 in FileHandlerEventProc (evPtr=0x815abd0, flags=-3) at ../tcl/unix/../unix/tclUnixNotfy.c:624 ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2001-09-27 14:02 Message: Logged In: YES user_id=80530 Ah. So that explains it! And here I was blaming a bug in SSH. I've also been bitten by this, trying to work with Tcl through SSH on an AIX machine. I defer to AK's judgment on the solution. ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2001-09-27 13:26 Message: Logged In: YES user_id=75003 I believe that this is something the drivers should not need to be aware about. In other words, I believe that this should be fixed in the generic part of the I/O system. I'll try this and report back. ---------------------------------------------------------------------- Comment By: Vince Darley (vincentdarley) Date: 2001-09-27 12:18 Message: Logged In: YES user_id=32170 This is really a channels problem, and not in my area of understanding or expertise. Assigned to Andreas. ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=110894&aid=465765&group_id=10894 |