From: SourceForge.net <no...@so...> - 2010-07-30 17:29:48
|
Bugs item #3034840, was opened at 2010-07-26 11:28 Message generated for change (Comment added) made by andreas_kupries You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3034840&group_id=10894 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: 27. Channel Types Group: development: 8.6b1.1 Status: Open Resolution: None Priority: 9 Private: No Submitted By: Emiliano (egavilan) Assigned to: Andreas Kupries (andreas_kupries) Summary: Crash in UpdateStringOfList() Initial Comment: When trying to run the code on http://wiki.tcl.tk/26735 with Tcl HEAD (updated today), using the wish8.6 interpreter, I can crash the program by closing the wish window with the mouse. $ gdb wish8.6 GNU gdb 6.8 Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i486-slackware-linux"... (gdb) run memchan_2.tcl Starting program: /usr/local/bin/wish8.6 memchan_2.tcl [Thread debugging using libthread_db enabled] [New Thread 0xb7a25ab0 (LWP 555)] [New Thread 0xb79eeb90 (LWP 558)] Writing data into rc0 Reading data from rc1 **Hello World** Writing data into rc1 Reading data from rc0 **Hello There !!** <=== Close the "." wish window here Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0xb7a25ab0 (LWP 555)] 0xb7eacbed in UpdateStringOfList (listPtr=0x8072ab8) at /home/emiliano/src/tcl/generic/tclListObj.c:1865 1865 { (gdb) bt #0 0xb7eacbed in UpdateStringOfList (listPtr=0x8072ab8) at /home/emiliano/src/tcl/generic/tclListObj.c:1865 #1 0xb7eb784f in Tcl_GetString (objPtr=0x8072ab8) at /home/emiliano/src/tcl/generic/tclObj.c:1636 #2 0xb7eb78d1 in Tcl_GetStringFromObj (objPtr=0x8072ab8, lengthPtr=0xbf2930d4) at /home/emiliano/src/tcl/generic/tclObj.c:1676 #3 0xb7eacc9e in UpdateStringOfList (listPtr=0x8072ab8) at /home/emiliano/src/tcl/generic/tclListObj.c:1893 #4 0xb7eb784f in Tcl_GetString (objPtr=0x8072ab8) at /home/emiliano/src/tcl/generic/tclObj.c:1636 #5 0xb7eb78d1 in Tcl_GetStringFromObj (objPtr=0x8072ab8, lengthPtr=0xbf2931c4) at /home/emiliano/src/tcl/generic/tclObj.c:1676 #6 0xb7eacc9e in UpdateStringOfList (listPtr=0x8072ab8) at /home/emiliano/src/tcl/generic/tclListObj.c:1893 #7 0xb7eb784f in Tcl_GetString (objPtr=0x8072ab8) at /home/emiliano/src/tcl/generic/tclObj.c:1636 #8 0xb7eb78d1 in Tcl_GetStringFromObj (objPtr=0x8072ab8, lengthPtr=0xbf2932b4) at /home/emiliano/src/tcl/generic/tclObj.c:1676 #9 0xb7eacc9e in UpdateStringOfList (listPtr=0x8072ab8) at /home/emiliano/src/tcl/generic/tclListObj.c:1893 Note that group #1, #2 and #3 repeats in #4, #5 and #6, and then the backtrace output repeats endlessly with these three lines. The problem doesn't appear when using the tclsh intepreter (I can interrupt the program with Ctrl-C, but it finalize without a crash. The system is Linux 2.6.27.7 ---------------------------------------------------------------------- >Comment By: Andreas Kupries (andreas_kupries) Date: 2010-07-30 10:29 Message: > It is not safe to call InputTclMethod() with > zero ref count arguments, but that's what > ReflectWatch() does. Hm. Going through this I cannot see how that (maskObj with RC == 0) happens. The maskObj is created by DecodeEventMask() and this function runs Tcl_IncrRefCount immediately on it, in line 1975, causing RC == 1. I am looking at revision 1.49 of tclIORChan.c That being said, I do see that ReflectInput/Output and SeekWide invoke ITM with newly minted objects which have RC == 0. (Still have to look at Block and (Set/Get)Option). ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2010-07-30 07:53 Message: dgp - This is just a note that I have seen this thread. Thank you for your analysis. I'll do a review as quickly as possible. Note: tclIORTrans.c is structurally similar to tclIORChan.c which this bug is apparently about. As such it may have the same problem. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 16:55 Message: Bottom line is that InvokeTclMethod and its callers need a complete refCount management analysis and fixup. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 16:49 Message: Come up on the chat that just adding a Tcl_IncrRefCount() isn't good enough to fix things when the rcPtr->interp == NULL case is in effect. Then we still would get two Decrs after one Incr and the same brokenness ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 16:39 Message: Anyhow, just to confirm that is the root of the problem, I added the Tcl_IncrRefCount() call and the bug goes away. Leaving with aku to choose what fix is preferred. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 16:35 Message: It is not safe to call InputTclMethod() with zero ref count arguments, but that's what ReflectWatch() does. The "design" of this code isn't fully clear to me, so I don't know whether the right fix is to add a Tcl_IncrRefCount() before that call (and check all the similar calls in the same source code file), or to revise InvokeTclMethod() itself. ---------------------------------------------------------------------- Comment By: Alexandre Ferrieux (ferrieux) Date: 2010-07-29 16:35 Message: in InvokeTclMethod, dead channel case, why are argOne and argTwo decr-ref'ed ? sinner candidate ? ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 16:13 Message: Much worse than that. Somehow the same Tcl_Obj is being returned to the tclFreeObjList twice in a row, which leaves it pointing to itself as the next Tcl_Obj memory chunk to allocate. Hilarity ensues. The code which is making the bad push back onto the free list is the Tcl_DecrRefCount(maskObj); in ReflectWatch(), but it's not yet clear whether it's the sinner or the victim of even earlier corruption. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 14:36 Message: The bug is almost certainly in the list surgery done in FixLevelCode(). ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 12:05 Message: The invalid value traces back to the Tcl_SetReturnOptions() call in UnmarshallErrorResult(). Since I don't see anything obviously wrong there, I suspect the error goes back further and a corrupt value is getting marshalled in the first place. Still tracking... ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 11:22 Message: That doesn't seem to be it. If I disable the optimized dict->list conversion code, the bug remains. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 11:07 Message: Quick scan of the changes file fingers this as the most likely suspect: 2008-07-20 (enhancement)[2008248] dict->list preserve item intreps (pasadyn) dkf? Any chance a bug in that change could cause internal rep corruption where a list tries to make an element of itself? And then any hints what could trigger that bug? ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 11:01 Message: Issue appears between the 8.5.3 and 8.5.4 releases of Tcl/Tk. ---------------------------------------------------------------------- Comment By: Don Porter (dgp) Date: 2010-07-29 06:53 Message: This is an infinite loop created because a tclListType value somehow has itself as its own second element! A list value is not supposed to be an element of itself, and it's clearly an invalid state for a multi-element list to claim to me an element of itself. The failure here is not in the tclListType machinery, but it whatever code corrupted this value into an invalid state. Just based on the players involved, I'd point fingers in the general direction of how (reflected?) channels interact with exit cleanup. Another approach might be to `git bisect` to discover when this failure first showed up. ---------------------------------------------------------------------- Comment By: Emiliano (egavilan) Date: 2010-07-27 08:05 Message: The problem doesn't happen when Tk is loaded dinamically into the tclsh interpreter via [package require Tk], only using the wish interpreter. Tried thread-enabled and thread-disabled builds, both crash. ---------------------------------------------------------------------- Comment By: Jan Nijtmans (nijtmans) Date: 2010-07-27 05:39 Message: I can reproduce this problem on Linux (Ubuntu), both with the 8.5 head and 8.6 head! ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2010-07-27 05:36 Message: Works for me with tip of 8.5 branch too. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2010-07-27 05:29 Message: Works for me with HEAD. Crashes in 8.5.2. (All on OSX/Intel.) ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3034840&group_id=10894 |