#1359 tkinter based applications crash

Milestone: obsolete: 8.4.1
Status: closed
Owner: Jeffrey Hobbs
Labels: None
Priority: 5
Updated: 2002-12-18
Created: 2002-12-05
Creator: Matthias Klose
Private: No

[ This was first submitted to the Debian Bug Tracking
system, see http://bugs.debian.org/171353 for the
original report]

tkinter is the tk support module for python.

The setup is a Debian GNU/Linux system (unstable) with
glibc-2.3.1, tcl/tk-8.4.1 configured with
--enable-threads and python configured with
--enable-threads as well.

After the change to build python with tk8.4.1 instead
of tk8.3, many tkinter based applications crash. One of
them is

pydoc -g

The behaviour is the same with Python versions 2.1 and 2.2.

I don't include the stack trace here because it looks awful
in the SF bug tracker; please see
http://bugs.debian.org/171353

One of the python developers gave the following analysis:

The problem appears to be Tcl's use of thread-local
data. TkGetDisplay is implemented as

TkDisplay *
TkGetDisplay(display)
    Display *display;    /* X's display pointer */
{
    TkDisplay *dispPtr;
    ThreadSpecificData *tsdPtr = (ThreadSpecificData *)
            Tcl_GetThreadData(&dataKey, sizeof(ThreadSpecificData));

    for (dispPtr = tsdPtr->displayList; dispPtr != NULL;
            dispPtr = dispPtr->nextPtr) {
        if (dispPtr->display == display) {
            break;
        }
    }
    return dispPtr;
}

If Tcl sees a call in a new thread "coming out of
nowhere", then Tcl_GetThreadData will find that there
are no thread-local data for this thread. It will
allocate an amount of memory and zero-initialize it.
Then, displayList for that thread will be NULL, and the
function will return NULL.

NULL is a documented return value for TkGetDisplay, but
apparently, Tk_FreeGC does not expect to get NULL, and
crashes.

I'm not sure how this is supposed to work. To me, the
notion of the TkDisplay being thread-local sounds
inherently broken: In this case, it means that you
can't free a GC in one thread that you have allocated
in another. So I would conclude this to be a bug in Tk.
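
The failure mode described in this analysis can be modeled outside of Tcl. The following is a minimal Python sketch, not Tk code: `threading.local` stands in for Tcl_GetThreadData, and `register_display`, `get_display`, and `lookup_in_new_thread` are hypothetical names used only for illustration. It shows that a lookup made from a thread that never registered a display finds an empty list and returns nothing, which is the NULL the caller has to be prepared for.

```python
import threading

# Stand-in for Tcl's thread-specific data: each thread sees its own attributes.
_tsd = threading.local()


def _display_list():
    # Like Tcl_GetThreadData: the first access in a thread yields fresh, empty state.
    if not hasattr(_tsd, "display_list"):
        _tsd.display_list = []
    return _tsd.display_list


def register_display(name):
    # Hypothetical helper: record a display in the calling thread's list.
    _display_list().append(name)


def get_display(name):
    # Mirrors the shape of TkGetDisplay: search only the calling thread's list.
    for display in _display_list():
        if display == name:
            return display
    return None


def lookup_in_new_thread(name):
    # Perform the same lookup from a thread that has no registered displays.
    result = []
    worker = threading.Thread(target=lambda: result.append(get_display(name)))
    worker.start()
    worker.join()
    return result[0]


register_display(":0")
found_here = get_display(":0")               # lookup in the registering thread
found_elsewhere = lookup_in_new_thread(":0")  # lookup "coming out of nowhere"
```

In this model, any caller that dereferences the result of `get_display` without checking for `None` fails in the second case, which corresponds to the crash attributed to Tk_FreeGC.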

Discussion

  • Jeffrey Hobbs
    2002-12-07

    • assigned_to: nobody --> hobbs
     
  • Jeffrey Hobbs
    2002-12-07

    Logged In: YES
    user_id=72656

    perhaps related to 614325?

     
  • Logged In: YES
    user_id=21627

    This is the same bug. The submitter said he was using the
    API "incorrectly", i.e. from the wrong thread. However, I
    believe being forced to use a display always from the same
    thread defeats the purpose of using Tcl in multiple threads.

     
  • Jeffrey Hobbs
    2002-12-07

    Logged In: YES
    user_id=72656

    I don't really buy that. Tcl is apartment model threading, and
    there are specific APIs for queueing events from one thread
    into another. This can be especially important when it comes
    to the use of Tk (the UI). There is no guarantee that the
    libraries that Tk is based on (X for example) are thread-safe
    (in fact, X can very well not be), therefore restrictions have to
    be moved higher up.

     
  • Logged In: YES
    user_id=21627

    Can you please point to the API for event queueing?

    Suppose I do

    .mybutton configure -text hallo

    in a different thread, this will trigger this bug. It
    appears to me that I cannot invoke *any* Tk commands safely
    in a different thread. How am I supposed to use the
    queueing API here?

    Independent of this question, I think Tcl should not crash,
    but report an error if it is used incorrectly.

     
  • Jeffrey Hobbs
    2002-12-08

    Logged In: YES
    user_id=72656

    It is not possible to do

    .mybutton configure -text hallo

    in another thread, because that command only exists in the
    one thread. This may be confused by the difference in
    threading models between Python and Tcl. What you could
    do in Tcl is create that same command in as many threads
    as you want, but it would be an alias that just sent the
    command to the thread with that command in it.

    I do agree that this shouldn't crash, but if there is some more
    subtle reason behind things not queueing correctly, it is
    important to know.
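
One way to picture the arrangement described in this comment, where every command runs on the thread that owns it and other threads only forward requests, is the following Python sketch. It illustrates the general pattern under stated assumptions and is not Tcl's or Tkinter's actual mechanism: `OwnerThread`, `call`, and `stop` are hypothetical names, and plain Python callables stand in for Tk commands. The caller blocks until the owning thread replies, so a result is still returned to the caller.

```python
import queue
import threading


class OwnerThread:
    """Runs every submitted call on one dedicated thread (hypothetical helper)."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Only this thread ever executes the submitted functions.
        while True:
            item = self._requests.get()
            if item is None:  # sentinel posted by stop()
                break
            func, args, reply = item
            try:
                reply.put((True, func(*args)))
            except Exception as exc:
                reply.put((False, exc))

    def call(self, func, *args):
        # Forward the call to the owner thread and wait for its result.
        reply = queue.Queue(maxsize=1)
        self._requests.put((func, args, reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def stop(self):
        self._requests.put(None)
        self._thread.join()


owner = OwnerThread()
owner_ident = owner.call(threading.get_ident)    # runs on the owner thread
caller_ident = threading.get_ident()             # the submitting thread
text = owner.call(lambda s: s.upper(), "hallo")  # result comes back to the caller
owner.stop()
```

Calling `call` from the owner thread itself would deadlock in this sketch, so a fuller version would need to detect that case and run the function directly.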

     
  • Logged In: YES
    user_id=21627

    What do you mean by "the command only exists in a thread"?
    AFAICT, the command exists in the interpreter, not in a
    thread. We pass the command to Tcl_EvalObjv, and the
    interpreter happily finds and executes the command; see the
    stacktrace in the Debian bug report.

    It is in general not possible to defer execution of the
    command, since the command may return a result, which the
    caller expects to receive immediately.

    It is unfortunate that you believe that the libraries Tk is
    based on might not be thread-safe. While this is a
    theoretical issue, it does not matter in practice: On most
    systems where both X and threads are available, X is
    thread-safe. If X isn't thread-safe on a system, Tk should
    not be compiled with thread support, since the thread
    support is then useless, anyway.

    It is fine if you restrict event processing (DoOneEvent and
    friends) to the thread that has opened the display, since
    real systems (e.g. Windows) have such a constraint. It might
    also be necessary to defer TkButtonWorldChanged to the main
    thread (although I can't see a need for that). However,
    applications should be able to manipulate widgets from
    multiple threads.

    For Python and Tk, it has worked just fine that way for
    years when Tcl did not support threads. Now if threads are
    enabled, it is unfortunate that you can't use threads anymore.

     
  • Logged In: YES
    user_id=21627

    See bugs.debian.org/170711 for another instance of the
    same problem; here, DeleteStressedCmap is the culprit, by
    not testing whether the result of TkGetDisplay is NULL.

    If this gets "fixed" by reporting a Tcl error instead of
    executing the command, without giving a strategy for how
    Python's Tkinter (and other embedders) should deal with this
    issue, I guess you will produce many unhappy users.

     
  • Logged In: NO

    bug #614325 and #649209

    Ok. I have found the fundamental problem(s) here.

    Tk_FreeGC upon entry calls TkGetDisplay. TkGetDisplay
    upon entry fetches its ThreadSpecificData. It then tries
    to search the list of main windows for the specified
    display.

    However, the problem in this case lies "deeper" down.
    The problem is that the dataKey being relied upon by
    TkGetDisplay is destroyed and recreated, destroying
    along with it all the vital state information it held.

    It is destroyed by Tcl_FinalizeThread, which calls
    TclFinalizeThreadData. In threaded builds, this should
    NOT be a problem, since thread local storage is
    automatically keyed off of the thread in addition to the
    keyPtr (slot number). However, for non-threaded
    builds, the consequences are totally disastrous.

    Another related problem that I ran into is that the
    internal data structures for the "simulated" thread local
    storage are not protected by any locking mechanism. I
    realize that this is by design for non-threaded builds.
    However, since Tcl's "thread-safety protocol" suggests
    that you can always use Tcl interpreters from the thread
    they are created on without mentioning any
    other "gotchas", I think the code should be changed to
    reflect that (via a process wide mutex, in this case).

    The "solution" is either to force people to use threaded
    builds or to re-implement the "simulated" thread local
    storage to key off of the current thread id in addition to
    the passed in "key" value.

    JJM
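
The second option in this comment, keying the simulated thread-local storage off the current thread id as well as the key and guarding it with a process-wide lock, might look roughly like the Python sketch below. `SimulatedTLS`, `get`, and `finalize_thread` are hypothetical names chosen for illustration; this is a model of the idea and not Tcl's implementation.

```python
import threading


class SimulatedTLS:
    """Per-thread storage emulated with a shared table (hypothetical model)."""

    def __init__(self):
        self._lock = threading.Lock()  # process-wide lock around the table
        self._slots = {}               # (thread id, key) -> data

    def get(self, key, factory):
        # Key off the calling thread as well as the caller-supplied key.
        slot = (threading.get_ident(), key)
        with self._lock:
            if slot not in self._slots:
                self._slots[slot] = factory()
            return self._slots[slot]

    def finalize_thread(self):
        # Drop only the calling thread's slots; other threads keep their state.
        ident = threading.get_ident()
        with self._lock:
            for slot in [s for s in self._slots if s[0] == ident]:
                del self._slots[slot]


tls = SimulatedTLS()
main_data = tls.get("display", list)
main_data.append(":0")


def worker():
    tls.get("display", list).append("worker-only")
    tls.finalize_thread()  # must not disturb the main thread's data


t = threading.Thread(target=worker)
t.start()
t.join()
survived = tls.get("display", list)
```

Because finalizing the worker removes only that thread's slots, the state held by the main thread is still present afterwards, which is the property the comment says is lost when the storage is keyed by the slot alone.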

     
  • Chris Waters
    2002-12-09

    Logged In: YES
    user_id=25775

    Um, the tkinter problem occurs (as it says) with a
    thread-enabled build, so I don't think that "forc[ing]
    people to use threaded builds" is much of a solution.

     