From: <no...@so...> - 2002-01-03 11:03:59
|
Bugs item #494348, was opened at 2001-12-17 14:06 You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=110894&aid=494348&group_id=10894 Category: 15. Commands A-H Group: = 8.3 Status: Open Resolution: Works For Me >Priority: 7 Submitted By: Nobody/Anonymous (nobody) Assigned to: Donal K. Fellows (dkf) Summary: foreach crashes in recursion Initial Comment: - Tcl Version: 8.x+; - OS: Solaris 5.x (Any); - Compiler: Sun CC 6.0 Upd 1; - Problem behavior: Tcl may freely convert the type of objects. Foreach command stores the pointers to objects of loop variables at the begin of the loop: for (i = 0; i < numLists; i++) { result = Tcl_ListObjGetElements(interp, argObjv[1+i*2], &varcList[i], &varvList[i]); -> Var list is in varvList. Later it only checks if the variable list object has the list type: if (argObjv[1+i*2]->typePtr != &tclListType) result = Tcl_ListObjGetElements(interp, argObjv[1+i*2], -> conditionally recreates the list and if the type is "list", it considers the list correct. Now suppose the loop body calls the function with the same name as the loop variable. This will convert the var list object to "cmdName" type and lose the pointers to all loop variable objects. Then, in recursion, we wil recreate the var list conversting "cmdName" to "list". After return from recursion to the first loop we will have the correct list but with another objects. After checking the type of var list object (it is "list" now) we'll continue to use stored pointers (varvlist) and will crash on the second iteration. You will not crash if nobody used that freed memory, but it does not change anything. It's a real critical bug for us. To fix it I made the second list parsing (Tcl_ListObjGetElements) unconditional. Short testcase (I know it might be shorter): -------------- proc inst {args} { } proc Something {} { inst } proc Test {level} { incr level puts "Test: level = $level" #inst "a" Something ;# you may call inst directly if {$level == 1} { set instlist {1 2} } else { set instlist {} } puts "instlist = /$instlist/" foreach inst $instlist { puts "inst = /$inst/" Test $level } } Test 0 -------------------- Regards, Yevgen Ryazanov ---------------------------------------------------------------------- >Comment By: Donal K. Fellows (dkf) Date: 2002-01-03 03:03 Message: Logged In: YES user_id=79902 Have found a way to duplicate the problem; with the tcltest binary (i.e. using the standard 'runtest' makefile target) replace the 'Test 0' at the end of the above script with: testcmdtrace tracetest {Test 0} This is now a reproducable core fault; we can hunt it down properly! ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2001-12-20 12:26 Message: Logged In: YES user_id=79902 Ah, that explains why a normal script couldn't activate the problem (the Tcl_CreateTrace API is not used by the core.) Sorting this may take some time... :^( ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-12-19 18:49 Message: Logged In: NO Donal, you are right. To crash the application we need one more condition: a trace function. To do it, just add empty trace function into Tcl_AppInit: Tcl_CreateTrace (interp, 10, EmptyProc, NULL); This will change the execution flow (tclExecute.c): DECACHE_STACK_INFO(); CallTraceProcedure(interp, tracePtr, cmdPtr, cmd, numChars, objc, objv); CACHE_STACK_INFO(); Regards, Yevgen ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2001-12-19 08:04 Message: Logged In: YES user_id=79902 I'd still love to see a way to reproduce this in a standard shell (I *can't* cause it to crash with the above code, even if I force it to use the code in tclCmdAH.c instead of the bytecodes) if for no other reason than to form the basis of a test so we can make sure that this fault can't recur. Still, a minimal attempt at fixing it would involve doing Tcl_IncrRefCount() on all the arguments to foreach before the iteration, and Tcl_DecrRefCount() on them all prior to exit from the function. The primary alternative is a heavyweight copy, and that would make the performance suck for what is, you must admit, a very rare fault (if it was common, I'd be able to dup it without problems... :^) Key questions for you, Yevgen: 1) Does the above code cause a failure for you in Tcl 8.3.4 without any extensions? (It might have been due to a bug fixed between 8.3.0 and 8.3.4, if my understanding of what you're using is correct.) 2) Does the above code cause a failure for you in Tcl 8.4a4 (CVS) without any extensions? (This is optional, but helpful since this is the version I've normally got built with debugging. :^) 3) The shell you are using; does it include any extensions at all? 4) Can you try it out with Purify? (Purify is wonderful, or maybe even better than that, but my department let our license expire. Silly people!) 5) Bear in mind that when you've got memory corruption or overzealous releasing of object references, the way it can show up in a running Tcl program can be *extremely* odd. (I should know... :^) ---------------------------------------------------------------------- Comment By: Jeffrey Hobbs (hobbs) Date: 2001-12-18 16:20 Message: Logged In: YES user_id=72656 I couldn't repro this with 8.3.4 on Sol2.8/gcc, AIX-4.3- RS6000/xlc, HP-11-pa2/HPcc or Win32/VC++. On Solaris I even tried the LD_PRELOAD (although I compiled with gcc). The logic sounds OK, but I didn't have the time to look at the code. ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2001-12-18 15:59 Message: Logged In: NO Hi Donal, it's not a guess. It's a fact. I found it after 2 days of debugging. I was a real crash. Only after that I created a testcase in pure tcl. It's quite simple to see what happens if you use a debugger. Unfortunately, our application is built with Sun compiler, and I use Sun workshop/dbx for debugging, and I am not familiar with gdb. But it might be simple to see in an debugger that after the first cycle the objects stored in varvList are freed, but it will be used in the loop. We had the simular crash in tk level. I did not debug that case (the crash was exacly in the same place), but it disappered after my fix, and you could see the crash are simular. In GUI the crash happened when we forgot to add break in bind command, and tk hot-key handler executed foreach loop for all bindings. I found a suspicious place in tk/library, they use 'foreach trace ...' and tcl has trace command (it's not a problem now). I will try to find a better testcase, but it's not the most urgent problem for our project. By the way, to make crashes more predictable I did setenv LD_PRELOAD watchmalloc.so.1 Please, do not close the bug, I hope you will see it's real. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2001-12-18 01:39 Message: Logged In: YES user_id=79902 Can't reproduce any problems with above script on either 8.3.2 or 8.4a4(CVS HEAD). I think I can *sort of* see how there might be a problem (if the object can somehow become a different list) but haven't been able to discover a way to trigger it. Without any way to actually duplicate the problem, I can't justify going further as there would be a significant performance downside to the things I might try like regenerating the whole list each time. Or I could just use Tcl_{Incr,Decr}RefCount on the lists, which would work quite well too as nothing is allowed to mess around with anything with refcount>1. Submit a script that reliably demonstrates the problem in either 8.3.4 or any 8.4 version and I'll do what I can... If you're linking any additional code, it is possible that the real problem lies with that not respecting the refcount rules correctly. If that's the case, this isn't a Tcl bug at all. :^) ---------------------------------------------------------------------- You can respond by visiting: http://sourceforge.net/tracker/?func=detail&atid=110894&aid=494348&group_id=10894 |