Learn how easy it is to sync an existing GitHub or Google Code repo to a SourceForge project! See Demo

Close

#1707 foreach crashes in recursion

obsolete: 8.3
closed-fixed
miguel sofer
7
2002-01-04
2001-12-17
Anonymous
No

- Tcl Version: 8.x+;
- OS: Solaris 5.x (Any);
- Compiler: Sun CC 6.0 Upd 1;
- Problem behavior:
Tcl may freely convert the type of objects.
Foreach command stores the pointers to objects of loop
variables at the begin of the loop:
for (i = 0; i < numLists; i++) {
result = Tcl_ListObjGetElements(interp,
argObjv[1+i*2],
&varcList[i], &varvList[i]);
-> Var list is in varvList.
Later it only checks if the variable list object has
the list type:
if (argObjv[1+i*2]->typePtr != &tclListType)
result = Tcl_ListObjGetElements(interp,
argObjv[1+i*2],
-> conditionally recreates the list
and if the type is "list", it considers the list
correct. Now suppose the loop body calls the
function with the same name as the loop variable.
This will convert the var list
object to "cmdName" type and lose the pointers to all
loop variable objects. Then, in recursion, we wil
recreate the var list conversting "cmdName" to "list".
After return from recursion to the first loop we will
have the correct list but with another objects.
After checking the type of var list object (it is
"list"
now) we'll continue
to use stored pointers (varvlist) and will crash on the
second iteration. You will not crash if nobody used
that freed memory, but it does not change anything.
It's a real critical bug for us. To fix it I made
the second list parsing (Tcl_ListObjGetElements)
unconditional.
Short testcase (I know it might be shorter):
--------------
proc inst {args} {
}

proc Something {} {
inst
}

proc Test {level} {
incr level

puts "Test: level = $level"

#inst "a"
Something ;# you may call inst directly

if {$level == 1} {
set instlist {1 2}
} else {
set instlist {}
}
puts "instlist = /$instlist/"

foreach inst $instlist {
puts "inst = /$inst/"
Test $level
}
}

Test 0
--------------------
Regards,
Yevgen Ryazanov

Discussion

  • Logged In: YES
    user_id=79902

    Can't reproduce any problems with above script on either
    8.3.2 or 8.4a4(CVS HEAD). I think I can *sort of* see how
    there might be a problem (if the object can somehow become a
    different list) but haven't been able to discover a way to
    trigger it. Without any way to actually duplicate the
    problem, I can't justify going further as there would be a
    significant performance downside to the things I might try
    like regenerating the whole list each time. Or I could just
    use Tcl_{Incr,Decr}RefCount on the lists, which would work
    quite well too as nothing is allowed to mess around with
    anything with refcount>1.

    Submit a script that reliably demonstrates the problem in
    either 8.3.4 or any 8.4 version and I'll do what I can...

    If you're linking any additional code, it is possible that
    the real problem lies with that not respecting the refcount
    rules correctly. If that's the case, this isn't a Tcl bug
    at all. :^)

     
    • status: open --> pending-works-for-me
     
  • Logged In: NO

    Hi Donal, it's not a guess. It's a fact. I found it after 2
    days of debugging. I was a real crash. Only after that
    I created a testcase in pure tcl. It's quite simple to see
    what happens if you use a debugger.
    Unfortunately, our application is built with Sun compiler,
    and I use Sun workshop/dbx for debugging, and I am not
    familiar with gdb. But it might be simple to see in an
    debugger that after the first cycle the objects stored
    in varvList are freed, but it will be used in the loop.
    We had the simular crash in tk level. I did not debug
    that
    case (the crash was exacly in the same place), but it
    disappered after my fix, and you could see the crash
    are simular. In GUI the crash happened when we forgot
    to add break in bind command, and tk hot-key handler
    executed foreach loop for all bindings. I found a suspicious
    place in tk/library, they use 'foreach trace ...' and
    tcl has trace command (it's not a problem now).
    I will try to find a better testcase, but it's not the
    most urgent problem for our project.
    By the way, to make crashes more predictable I did
    setenv LD_PRELOAD watchmalloc.so.1
    Please, do not close the bug, I hope you will see it's real.

     
  • Jeffrey Hobbs
    Jeffrey Hobbs
    2001-12-19

    • status: pending-works-for-me --> open-works-for-me
     
  • Jeffrey Hobbs
    Jeffrey Hobbs
    2001-12-19

    Logged In: YES
    user_id=72656

    I couldn't repro this with 8.3.4 on Sol2.8/gcc, AIX-4.3-
    RS6000/xlc, HP-11-pa2/HPcc or Win32/VC++. On Solaris I
    even tried the LD_PRELOAD (although I compiled with gcc).
    The logic sounds OK, but I didn't have the time to look at
    the code.

     
  • Logged In: YES
    user_id=79902

    I'd still love to see a way to reproduce this in a standard shell (I *can't* cause it to crash with the above code, even if I
    force it to use the code in tclCmdAH.c instead of the bytecodes) if for no other reason than to form the basis of a test so
    we can make sure that this fault can't recur. Still, a minimal attempt at fixing it would involve doing Tcl_IncrRefCount() on
    all the arguments to foreach before the iteration, and Tcl_DecrRefCount() on them all prior to exit from the function. The
    primary alternative is a heavyweight copy, and that would make the performance suck for what is, you must admit, a very
    rare fault (if it was common, I'd be able to dup it without problems... :^)

    Key questions for you, Yevgen:
    1) Does the above code cause a failure for you in Tcl 8.3.4 without any extensions? (It might have been due to a bug fixed
    between 8.3.0 and 8.3.4, if my understanding of what you're using is correct.)
    2) Does the above code cause a failure for you in Tcl 8.4a4 (CVS) without any extensions? (This is optional, but helpful
    since this is the version I've normally got built with debugging. :^)
    3) The shell you are using; does it include any extensions at all?
    4) Can you try it out with Purify? (Purify is wonderful, or maybe even better than that, but my department let our license
    expire. Silly people!)
    5) Bear in mind that when you've got memory corruption or overzealous releasing of object references, the way it can
    show up in a running Tcl program can be *extremely* odd. (I should know... :^)

     
  • Logged In: NO

    Donal, you are right. To crash the application we need one
    more condition: a trace function.
    To do it, just add empty trace function into Tcl_AppInit:
    Tcl_CreateTrace (interp, 10, EmptyProc, NULL);
    This will change the execution flow (tclExecute.c):
    DECACHE_STACK_INFO();
    CallTraceProcedure(interp, tracePtr, cmdPtr,
    cmd, numChars, objc, objv);
    CACHE_STACK_INFO();
    Regards,
    Yevgen

     
  • Logged In: YES
    user_id=79902

    Ah, that explains why a normal script couldn't activate the problem
    (the Tcl_CreateTrace API is not used by the core.) Sorting this may
    take some time... :^(

     
    • priority: 5 --> 3
     
  • Logged In: YES
    user_id=79902

    Have found a way to duplicate the problem; with the tcltest binary (i.e. using the standard 'runtest' makefile target)
    replace the 'Test 0' at the end of the above script with:
    testcmdtrace tracetest {Test 0}

    This is now a reproducable core fault; we can hunt it down properly!

     
    • priority: 3 --> 7
     
  • Logged In: YES
    user_id=79902

    More and more mysterious; I've no idea how tracing is causing the recomputation of the internal rep of the 'inst'
    object. Wierd! Assigning to msofer for his input...

     
    • assigned_to: dkf --> msofer
     
  • miguel sofer
    miguel sofer
    2002-01-03

    Logged In: YES
    user_id=148712

    Really weird, I agree. How about this one: if you add an
    empty line to the body of procedure Test, the segfault
    disappers (on my setup, with tcl8.4 from HEAD).
    I'm still looking at this.

     
    • assigned_to: msofer --> dkf
    • status: open-works-for-me --> closed-fixed
     
  • Logged In: YES
    user_id=79902

    OK, I've fixed it by removing some cacheing (actually delegating the cacheing to the objects themselves.)

    However I still don't know why the fault was actually happening, or whether there are other nasty gotchas in a similar
    vein lurking in the code. Yuck...

     
  • miguel sofer
    miguel sofer
    2002-01-03

    • assigned_to: dkf --> msofer
    • status: closed-fixed --> open-fixed
     
  • miguel sofer
    miguel sofer
    2002-01-03

    Logged In: YES
    user_id=148712

    Reopening: a reminder that I still would like to check if
    the bug appears when [foreach] is bytecompiled.

     
  • miguel sofer
    miguel sofer
    2002-01-03

    Logged In: YES
    user_id=148712

    Closing again: Bytecoded [foreach] did not present the bug -
    the varName list(s) is stored in the bytecode as a sequence
    of local variables, the varValue list(s) is regenerated at
    each pass in INST_FOREACH_START4 by a call to
    Tcl_ListObjLength.

    I do not quite understand yet why the test script was not
    using the compiled [foreach], as the call is from within a
    proc body.

     
  • miguel sofer
    miguel sofer
    2002-01-03

    • assigned_to: msofer --> dkf
    • status: open-fixed --> closed-fixed
     
  • Logged In: YES
    user_id=79902

    The compiled foreach was probably not being used due to the use of a command trace.

     
    • assigned_to: dkf --> msofer
     
  • Logged In: NO

    Hi,

    Do you have a patch for this fix? We are using
    tcl 8.3.4 and need to get this fix (pressing
    Alt-key a few times triggers the crash!!!)

    Thanks,

    Yang Shyong