#4930 several thread-7.* tests leak

Milestone: obsolete: 8.5.12
Status: open
Priority: 8
Updated: 2012-07-16
Created: 2011-09-12
Creator: Don Porter
Private: No

The following tests leak memory:

thread-7.24
thread-7.25
thread-7.26
thread-7.28
thread-7.29
thread-7.30
thread-7.31

The cause is the same in all of them. The tests make
use of the [testthread exit] command both directly and
indirectly via the [tcltest::threadReap] command. This
command unavoidably causes leaks. For this reason
the corresponding Thread command [thread::exit] is
declared deprecated.

I've successfully converted many tests from [testthread]
to the Thread package, making use of the [thread::preserve]
and [thread::release] commands to manage thread lifetimes
without leaking memory. I'm not able to convert the tests
above though.
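
For reference, a minimal sketch of that lifetime pattern, assuming a
Tcl build with the Thread package available. The worker script and
the variable names are illustrative and are not taken from thread.test:

```tcl
package require Thread

# Create a worker with no script; it enters its event loop and waits.
set tid [thread::create]

# Take a reference so the worker stays alive until we release it.
thread::preserve $tid

# Run a script in the worker synchronously and collect the result.
set result [thread::send $tid {expr {6 * 7}}]

# Drop the reference; -wait blocks until the worker has exited.
thread::release -wait $tid
```

The worker is never asked to call an exit command itself; it terminates
once the last reference to it has been released.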

These tests are testing Tcl_CancelEval via both the
[interp cancel] command and a [testthread cancel] testing
command. I've tried to convert these, but I cause
intermittent segfaults with my attempts. My hope is
that my failures are due to lack of experience with
cancel, and the matter will be easy for those who know it.

Discussion

1 2 > >> (Page 1 of 2)
  • Don Porter
    2011-09-12

    • priority: 5 --> 8
     
  • Don Porter
    2011-09-12

    BTW, the fact that I can trigger segfaults
    with scripts, even by accident, suggests
    there are some pretty serious bugs in either
    the *Cancel* machinery or the Thread package
    or both or their combination.

     
  • Don Porter
    2011-11-17

    I've been working on converting the
    thread-7.* tests from using the [testthread]
    command to the Thread package. At the
    same time, I've been trying to eliminate
    the empirical use of [after] delays, and put
    in place whatever sync is needed to make
    the tests robust.

    I've got problems.

    First, I haven't been able to come
    up with a solution when the object
    is to test that some script is not
    cancelled. For example thread-7.18.

    Second, I've come to discover that when
    I removed some [after]s, even the passing
    tests are not actually demonstrating successful
    cancel of the code intended to get canceled.
    Often, the cancel is happening while the
    thread::send is still in progress.

    I need help with this.

     
  • Joe Mistachkin
    2011-11-18

    Fixing these issues requires a couple more minor enhancements to the Thread package. I'm working on them now.

     
  • Joe Mistachkin
    2011-11-18

    All of these tests have been fixed. Please verify and close this bug. If there are any additional issues, please let me know.

     
  • Joe Mistachkin
    2011-11-18

    Also, note that you need the very latest version of the Thread package (currently the tip of the thread-2-7-branch branch) in order for all the tests to pass.

     
  • Joe Mistachkin
    2011-11-18

    Careful analysis of ThreadEventProc in threadCmd.c reveals that in some cases it is neglecting to Tcl_Release the Tcl interp inside eventPtr->clbkData. I'm still trying to fully understand all possible paths through this function in my head.

     
  • Don Porter
    2012-07-12

    • assigned_to: mistachkin --> ferrieux
    • milestone: 2101542 --> obsolete: 8.5.12
     
  • Don Porter
    2012-07-12

    Here's the ticket where the troublesome tests have
    been worked on. Here's output testing the 8.6b3
    RC, all trimmed away except the interesting bits:

    ---- thread-7.26 start
    WARNING: drained 1 event(s) on main thread
    ++++ thread-7.26 PASSED
    ...
    ---- thread-7.28 start

    ==== thread-7.28 cancel: send async cancel nested catch inside pure bytecode loop FAILED
    ==== Contents of test case:

    unset -nocomplain ::threadSawError ::threadError ::threadId ::threadIdStarted
    set serverthread [thread::create -joinable [string map [list %ID% [thread::id]] {
        proc foobar {} {
            while {1} {
                if {![info exists foo]} then {
                    # signal the primary thread that we are ready
                    # to be canceled now (we are running).
                    thread::send %ID% [list set ::threadIdStarted [thread::id]]
                    set foo 1
                }
                catch {
                    while {1} {
                        catch {
                            while {1} {
                                # we must call update here because otherwise
                                # the thread cannot even be forced to exit.
                                update
                            }
                        }
                    }
                }
            }
        }
        foobar
    }]]
    # wait for other thread to signal "ready to cancel"
    vwait ::threadIdStarted; after 1000
    set res [thread::send -async $serverthread {interp cancel}]
    thread::send $serverthread $::threadSuperKillScript
    vwait ::threadSawError($serverthread)
    thread::join $serverthread; drainEventQueue
    list $res [expr {[info exists ::threadIdStarted] ? $::threadIdStarted == $serverthread : 0}] [expr {[info exists ::threadId] ? $::threadId == $serverthread : 0}] [expr {[info exists ::threadError($serverthread)] ? [findThreadError $::threadError($serverthread)] : ""}]

    ---- Test generated error; Return code was: 1
    ---- Return code should have been one of: 0 2
    ---- errorInfo: target thread died
    while executing
    "thread::send $serverthread $::threadSuperKillScript"
    ("uplevel" body line 30)
    invoked from within
    "uplevel 1 $script"
    ---- errorCode: NONE
    ==== thread-7.28 FAILED

    ---- thread-7.29 start
    WARNING: drained 1 event(s) on main thread
    ++++ thread-7.29 PASSED
    ---- thread-7.30 start
    WARNING: drained 1 event(s) on main thread
    ++++ thread-7.30 PASSED
    ---- thread-7.31 start
    WARNING: drained 1 event(s) on main thread
    ++++ thread-7.31 PASSED
    ...
    ---- thread-7.34 start
    WARNING: drained 1 event(s) on main thread
    ++++ thread-7.34 PASSED
    ---- thread-7.35 start
    WARNING: drained 1 event(s) on main thread
    ++++ thread-7.35 PASSED
    ---- thread-7.36 start
    WARNING: drained 1 event(s) on main thread
    ++++ thread-7.36 PASSED
    ---- thread-7.37 start
    WARNING: drained 1 event(s) on main thread
    ++++ thread-7.37 PASSED

    The WARNINGs are reportedly evidence of pthread
    misbehavior. The single failure in this run is only one
    possible outcome. As few as zero and as many as 4
    test failures are common possibilities.

     
  • Can someone write a comment here explaining what libpthread is supposed to be doing wrong?

     
  • Joe Mistachkin
    2012-07-12

    It's actually hard to pin down exactly what pthreads is doing incorrectly. If I remember correctly, it was "losing" events in some cases. There is also an issue with spurious events; however, I think that was related to the Tcl event loop processing code. Another question: Are the actual leaks originally reported in this bug gone now?

     
    • assigned_to: ferrieux --> vasiljevic
     
  • Sorry, I lack the bandwidth to reconsider the whole bug ecosystem. I am not a user of Tcl threads, my experience is with libpthread in other (non-Tcl) contexts, hence my offer to help. I'm still eager to help if the spotlight focuses on libpthread at some point, but reassigning to Zoran in the meantime.

     
  • Don Porter
    2012-07-12

    • assigned_to: vasiljevic --> ferrieux
     
  • Don Porter
    2012-07-12

    I should find time to run valgrind again to be certain,
    but my recollection is that thread.test is now leak free.

    It's noisy and failure-prone, but it doesn't leak memory.

     
    • assigned_to: ferrieux --> vasiljevic
     
  • Joe Mistachkin
    2012-07-13

    These test failures do not appear to reproduce on core.tcl.tk, nor do they reproduce on Windows using the trunk of Tcl and the thread-2-7-branch branch of the Thread package.

     
  • Don Porter
    2012-07-16

    • assigned_to: vasiljevic --> ferrieux
     
  • Don Porter
    2012-07-16

    Thanks for checking. I will find time to see whether I
    can better demonstrate the issues, or figure out what
    the trouble is.

    If I cannot succeed in communicating to you the problems
    needing solving, then I'm inclined to just revert as much of
    thread.test as necessary to what "worked" before in the interest
    of getting 8.6b3 done.

     
    • assigned_to: ferrieux --> vasiljevic
     
  • Wondering why this gets repeatedly reassigned to me ... I'm handing it back to Zoran because he's (1) the obvious expert and (2) got some free cycles for Tcl these days (or so it seems).

     
  • Don Porter
    2012-07-16

    Apologies, ferrieux. That looks like a SF Tracker bug
    that I'm triggering. I keep this ticket open in a browser
    tab, and although I get comment updates via some
    browser magic, my form data values remain the same,
    so whenever I submit a comment, my outdated values
    get posted too.

     
  • Stuart Cassoff
    2012-07-16

    All tests pass on OpenBSD-current using the trunk of Tcl and the thread-2-7-branch branch of the Thread package.

     
  • Don Porter
    2012-07-16

    Be sure to check before claiming "All tests pass".

    For at least some folks having no trouble with thread.test,
    it's not because "All tests pass" but because "All tests skipped".

    In particular, a working, [package require]-able Thread 2.7 has
    to be around for the troublesome tests to even be attempted.
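
    One way to check this is sketched below. It assumes it is run
    with the same tclsh that runs the test suite; the messages are
    illustrative only:

    ```tcl
    # Try to load Thread 2.7 or later (within major version 2).
    # On failure, $version holds the error message instead.
    if {[catch {package require Thread 2.7} version]} {
        puts "Thread 2.7 not available: $version"
    } else {
        puts "Thread $version loaded"
    }
    ```

    If the first branch is taken, a clean thread.test run says nothing
    about the troublesome tests.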

     
  • Stuart Cassoff
    2012-07-16

    Latest greatest Tcl and Thread 2.7 on OpenBSD-amd64-current:

    Tests began at Mon Jul 16 12:02:23 EDT 2012
    thread.test
    WARNING: drained 1 event(s) on main thread
    WARNING: drained 1 event(s) on main thread
    WARNING: drained 1 event(s) on main thread
    WARNING: drained 1 event(s) on main thread
    WARNING: drained 1 event(s) on main thread
    WARNING: drained 1 event(s) on main thread
    WARNING: drained 1 event(s) on main thread
    WARNING: drained 1 event(s) on main thread
    WARNING: drained 1 event(s) on main thread

    Tests ended at Mon Jul 16 12:02:38 EDT 2012
    all.tcl: Total 51 Passed 49 Skipped 2 Failed 0
    Sourced 1 Test Files.
    Number of tests skipped for each constraint:
    2 knownBug

     