#3938 Fix [chan create]/return dict key finalization order issues

obsolete: 8.5.1
open
6
2008-03-07
2008-02-29
No

% parray tcl_platform
tcl_platform(byteOrder) = littleEndian
tcl_platform(machine) = x86_64
tcl_platform(os) = Linux
tcl_platform(osVersion) = 2.6.22-14-generic
tcl_platform(platform) = unix
tcl_platform(pointerSize) = 8
tcl_platform(threaded) = 1
tcl_platform(user) = bschwarz
tcl_platform(wordSize) = 8

I have a file with this in it, and it produces a segfault:

set fid [chan create {r w} ::vchan]

chan puts $fid "HELLO THERE"
set data [read $fid]

If I add the following at the end, it does not segfault (note that I need a catch, because there is another problem when I [chan close]):

catch {chan close $fid}

I don't see the problem on "normal" channels (i.e. opening a file on the filesystem with [open])

Discussion

  • Brett Schwarz

    Brett Schwarz - 2008-02-29

    script that produces the problem

     
  • Brett Schwarz

    Brett Schwarz - 2008-02-29

    Logged In: YES
    user_id=159778
    Originator: YES

    File Added: tst_chan.tcl

     
  • Andreas Kupries

    Andreas Kupries - 2008-02-29

    Logged In: YES
    user_id=75003
    Originator: NO

    Information collected so far:
    - Happens for 64/32bit both
    - Happens for threaded/non-threaded both.

     
  • Brett Schwarz

    Brett Schwarz - 2008-02-29

    Logged In: YES
    user_id=159778
    Originator: YES

    the problem occurs on 32 bit Linux as well, and occurs on Windows

     
  • Brett Schwarz

    Brett Schwarz - 2008-03-03

    Logged In: YES
    user_id=159778
    Originator: YES

    I started tracking this down, and I got as far as this. In the FlushChannel function in tclIO.c, it craps out at:

    written = (chanPtr->typePtr->outputProc)(chanPtr->instanceData,
    RemovePoint(bufPtr), toWrite, &errorCode);

    Not sure how to track it from there, but I think I've tracked it pretty far...

     
  • Andreas Kupries

    Andreas Kupries - 2008-03-06

    Stacktrace of the crash

     
  • Andreas Kupries

    Andreas Kupries - 2008-03-06

    Logged In: YES
    user_id=75003
    Originator: NO

    Using a symbol enabled tclsh I got myself a stacktrace from gdb ... Attached ...

    What happens is that FlushChannel (#20) invokes the driver output function, one 'ReflectOuptut' (#19), which delegates the writing to the '::vchan' via 'InvokeTclMethod' (#18), this then ends up in a call to '::vchan::write', and then it seems to compile this procedure (ProcCompileProc @ #10). And when it compiles the 'return' command of 'vchan::write' it craps out in 'TclMergeReturnOptions' (#5). This function was last changed Oct 18, 2007. The caller, 'TclCompileReturnCmd' (#6) was however changed last quite recently, Feb 28 2008, to fix a memory leak. As the crash happens on a Tcl_Obj with refCount 0 and otherwise looking already freed as well, I wonder if that change caused the problem.

    I will now get me some older revisions of the head and see if I can bisect the issue.

    File Added: STACK.txt

     
  • Andreas Kupries

    Andreas Kupries - 2008-03-06
    • priority: 5 --> 9
     
  • Don Porter

    Don Porter - 2008-03-06

    Logged In: YES
    user_id=80530
    Originator: NO

    The stack trace shows the problem
    coming from MarshallError() which
    to me indicates that 1428575 may
    be the best way out.

     
  • Don Porter

    Don Porter - 2008-03-06

    Logged In: YES
    user_id=80530
    Originator: NO

    Apparently several of us testing
    with the demo script are seeing
    different failures. The one I see
    happens because Tcl_GetReturnOptions()
    is called after Tcl_FinalizeThread().

    The "fix" (workaround?) is to add
    keys[i] = NULL;
    at the right place in the ReleaseKeys()
    routine, so that in this scenario, the
    GetKeys() can re-initialize. This stops
    the crash, but adds a memleak, so I don't
    want to commit without more analysis.
    This is looking like a finalization order
    dependency mess to untangle.

    Another way out would be to just take
    the whole GetKeys() business out (perhaps
    conditioned on TclInExit() ?) and force
    the routines using it to create their
    key values on demand instead of pulling
    from a per-thread cache.

     
  • Andreas Kupries

    Andreas Kupries - 2008-03-06

    Logged In: YES
    user_id=75003
    Originator: NO

    I am now convinced that threaded and non-threaded builds are seeing two different expressions of the same ordering problem. Tcl_FinalizeThread is run, and this releases the dict keys for the return options.

    In the non-threaded case this crashed when Tcl tried to compile the 'return' command in 'vchan::write' in preparation for its invocation by ReflectOutput. In the threaded case the compile completed successfully and we made it to the execution stage, i.e. the procedure was actually invoked. After that, a script-level error (expected return of int, got nothing) caused the reflection code to generate an error message, and marshalling that error then crashed on the same missing return option keys.

    Don's 'fix' is a fix for non-threaded as well, I just did not see that immediately because my tclsh picked up the old library. 'make install' was needed after recompile.

    I agree that this fix introduces a mem leak if Tcl is used in an environment where the 'OS' has no true processes which are fully cleaned up. Example: IOS.

     
  • Andreas Kupries

    Andreas Kupries - 2008-03-07

    Logged In: YES
    user_id=75003
    Originator: NO

    The workaround has been committed, with a proper warning that it is a workaround, that it introduces a memory leak, and that a proper fix has to wade into the tangle that is finalization ordering.

    This entry is left open, at lesser priority, and updated summary.

     
  • Andreas Kupries

    Andreas Kupries - 2008-03-07
    • priority: 9 --> 6
    • summary: get segfault when using a channel created with [chan create] --> Fix [chan create]/return opt dict key final'ion order issues
     
  • Andreas Kupries

    Andreas Kupries - 2008-03-07
    • summary: Fix [chan create]/return opt dict key final'ion order issues --> Fix [chan create]/return dict key finalization order issues
     
  • Don Porter

    Don Porter - 2008-03-12

    Logged In: YES
    user_id=80530
    Originator: NO

    It seems that Tcl_GetReturnOptions() needs to
    be able to work on an interp argument over the
    entire lifetime of that interp. This means
    the cleanup of these key values needs to be
    triggered by a Tcl_InterpDeleteProc associated
    with the interp, and not by a Tcl_ExitProc
    thread exit handler as is currently done.

    It helps with performance to have multiple
    interps in one thread share keys, so a solution
    that still permits that would be best.

     
