#1359 tkinter based applications crash

obsolete: 8.4.1
closed
None
5
2002-12-18
2002-12-05
No

[ This was first submitted to the Debian Bug Tracking
system, see http://bugs.debian.org/171353 for the
original report]

tkinter is the tk support module for python.

The setup is a Debian GNU/Linux system (unstable) with
glibc-2.3.1, tcl/tk-8.4.1 configured with
--enable-threads and python configured with
--enable-threads as well.

After the change to build python with tk8.4.1 instead
of tk8.3, many tkinter based applications crash. One of
them is

pydoc -g

The behaviour is the same with Python versions 2.1 and 2.2.

I don't include the stack trace here because it looks
awful in the SF bug tracker; please see
http://bugs.debian.org/171353

One of the python developers gave the following analysis:

The problem appears to be Tcl's use of thread-local
data. TkGetDisplay is implemented as

    TkDisplay *
    TkGetDisplay(display)
        Display *display;        /* X's display pointer */
    {
        TkDisplay *dispPtr;
        ThreadSpecificData *tsdPtr = (ThreadSpecificData *)
                Tcl_GetThreadData(&dataKey,
                sizeof(ThreadSpecificData));

        for (dispPtr = tsdPtr->displayList; dispPtr != NULL;
                dispPtr = dispPtr->nextPtr) {
            if (dispPtr->display == display) {
                break;
            }
        }
        return dispPtr;
    }

If Tcl sees a call in a new thread "coming out of
nowhere", then Tcl_GetThreadData will find that there
are no thread-local data for this thread. It will
allocate an amount of memory and zero-initialize it.
Then, displayList for that thread will be NULL, and the
function will return NULL.

NULL is a documented return value for TkGetDisplay, but
apparently, Tk_FreeGC does not expect to get NULL, and
crashes.

I'm not sure how this is supposed to work. To me, the
notion of the TkDisplay being thread-local sounds
inherently broken: In this case, it means that you
can't free a GC in one thread that you have allocated
in another. So I would conclude this to be a bug in Tk.
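
The thread-local lookup described in the analysis above can be modeled in a few lines of Python. This is only a sketch of the mechanism, not Tk's code: `_tsd`, `open_display`, `get_display` and `lookup_in_new_thread` are hypothetical names, and `threading.local` stands in for `Tcl_GetThreadData`.

```python
import threading

# Hypothetical stand-in for Tk's per-thread ThreadSpecificData.
_tsd = threading.local()


def display_list():
    """Return the calling thread's display list, creating it on first use."""
    if not hasattr(_tsd, "displays"):
        # Analogous to the freshly allocated, zero-initialized thread data.
        _tsd.displays = []
    return _tsd.displays


def open_display(name):
    """Register a display in the calling thread only."""
    display_list().append(name)


def get_display(name):
    """Mirror the shape of TkGetDisplay: search only this thread's list."""
    for display in display_list():
        if display == name:
            return display
    return None


def lookup_in_new_thread(name):
    """Run get_display(name) in a fresh thread and return what it saw."""
    result = []
    worker = threading.Thread(target=lambda: result.append(get_display(name)))
    worker.start()
    worker.join()
    return result[0]
```

A display opened in one thread is found there, while the same lookup from a thread "coming out of nowhere" yields `None`, which is the value a caller such as Tk_FreeGC would have to be prepared for.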

Discussion

  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-12-07
    • assigned_to: nobody --> hobbs
     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-12-07

    Logged In: YES
    user_id=72656

    perhaps related to 614325?

     
  • Martin v. Löwis

    Logged In: YES
    user_id=21627

    This is the same bug. The submitter said he was using the
    API "incorrectly", i.e. from the wrong thread. However, I
    believe being forced to use a display always from the same
    thread defeats the purpose of using Tcl in multiple threads.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-12-07

    Logged In: YES
    user_id=72656

    I don't really buy that. Tcl is apartment model threading, and
    there are specific APIs for queueing events from one thread
    into another. This can be especially important when it comes
    to the use of Tk (the UI). There is no guarantee that the
    libraries that Tk is based on (X for example) are thread-safe
    (in fact, X can very well not be), therefore restrictions have to
    be moved higher up.

     
  • Martin v. Löwis

    Logged In: YES
    user_id=21627

    Can you please point to the API for event queueing?

    Suppose I do

    .mybutton configure -text hallo

    in a different thread, this will trigger this bug. It
    appears to me that I cannot invoke *any* Tk commands safely
    in a different thread. How am I supposed to use the
    queueing API here?

    Independent of this question, I think Tcl should not crash,
    but report an error if it is used incorrectly.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-12-08

    Logged In: YES
    user_id=72656

    It is not possible to do

    .mybutton configure -text hallo

    in another thread, because that command only exists in the
    one thread. This may be confused by the difference in
    threading models between Python and Tcl. What you could
    do in Tcl is create that same command in as many threads
    as you want, but it would be an alias that just sent the
    command to the thread with that command in it.

    I do agree that this shouldn't crash, but if there is some more
    subtle reason behind things not queueing correctly, it is
    important to know.

     
  • Martin v. Löwis

    Logged In: YES
    user_id=21627

    What do you mean by "the command only exists in a thread"?
    AFAICT, the command exists in the interpreter, not in a
    thread. We pass the command to Tcl_EvalObjv, and the
    interpreter happily finds and executes the command; see the
    stacktrace in the Debian bug report.

    It is in general not possible to defer execution of the
    command, since the command may return a result, which the
    caller expects to receive immediately.

    It is unfortunate that you believe that the libraries Tk is
    based on might not be thread-safe. While this is a
    theoretical issue, it does not matter in practice: On most
    systems where both X and threads are available, X is
    thread-safe. If X isn't thread-safe on a system, Tk should
    not be compiled with thread support, since the thread
    support is then useless, anyway.

    It is fine if you restrict event processing (DoOneEvent and
    friends) to the thread that has opened the display, since
    real systems (e.g. Windows) have such a constraint. It might
    also be necessary to defer TkButtonWorldChanged to the main
    thread (although I can't see a need for that). However,
    applications should be able to manipulate widgets from
    multiple threads.

    For Python and Tk, it has worked just fine that way for
    years when Tcl did not support threads. Now if threads are
    enabled, it is unfortunate that you can't use threads anymore.

     
  • Martin v. Löwis

    Logged In: YES
    user_id=21627

    See bugs.debian.org/170711 for another instance of the
    same problem; here, DeleteStressedCmap is the culprit, by
    not testing whether the result of TkGetDisplay is NULL.

    If this gets "fixed" by reporting a Tcl error instead of
    executing the command, and giving no strategy how Python's
    Tkinter (and other embedders) should deal with this issue, I
    guess you will produce many unhappy users.

     
  • Nobody/Anonymous

    Logged In: NO

    bug #614325 and #649209

    Ok. I have found the fundamental problem(s) here.

    Tk_FreeGC upon entry calls TkGetDisplay. TkGetDisplay
    upon entry fetches its ThreadSpecificData. It then tries
    to search the list of main windows for the specified
    display.

    However, the problem in this case is "deeper" down.
    The problem is that the dataKey being relied upon by
    TkGetDisplay is destroyed and recreated, destroying
    along with it all the vital state information it held.

    It is destroyed by Tcl_FinalizeThread, which calls
    TclFinalizeThreadData. In threaded builds, this should
    NOT be a problem, since thread local storage is
    automatically keyed off of the thread in addition to the
    keyPtr (slot number). However, for non-threaded
    builds, the consequences are totally disastrous.

    Another related problem that I ran into is that the
    internal data structures for the "simulated" thread local
    storage are not protected by any locking mechanism. I
    realize that this is by design for non-threaded builds.
    However, since Tcl's "thread-safety protocol" suggests
    that you can always use Tcl interpreters from the thread
    they are created on without mentioning any
    other "gotchas", I think the code should be changed to
    reflect that (via a process wide mutex, in this case).

    The "solution" is either to force people to use threaded
    builds or to re-implement the "simulated" thread local
    storage to key off of the current thread id in addition to
    the passed in "key" value.

    JJM
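
The second option JJM mentions (keying the simulated storage off the thread id as well as the key, under a process-wide mutex) might look roughly like the following. It is a sketch under stated assumptions: `get_thread_data`, `finalize_thread` and `run_in_thread` are hypothetical helpers, not Tcl APIs, and thread ids can be reused once a thread has exited, so a real implementation would need to finalize on thread exit.

```python
import threading

_lock = threading.Lock()   # process-wide mutex guarding the table
_table = {}                # (thread id, key) -> that thread's data


def get_thread_data(key, factory=dict):
    """Return the calling thread's data for key, creating it on first use."""
    slot = (threading.get_ident(), key)
    with _lock:
        if slot not in _table:
            _table[slot] = factory()
        return _table[slot]


def finalize_thread():
    """Drop only the calling thread's slots; other threads keep their state."""
    me = threading.get_ident()
    with _lock:
        for slot in [s for s in _table if s[0] == me]:
            del _table[slot]


def run_in_thread(func):
    """Helper for the demonstration: run func in a new thread, return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(func()))
    worker.start()
    worker.join()
    return result[0]
```

With this layout, finalizing one thread cannot destroy the state another thread stored under the same key.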

     
  • Chris Waters

    Chris Waters - 2002-12-09

    Logged In: YES
    user_id=25775

    Um, the tkinter problem occurs (as it says) with a
    thread-enabled build, so I don't think that "forc[ing]
    people to use threaded builds" is much of a solution.

     
  • Nobody/Anonymous

    Logged In: NO

    Is Tcl_FinalizeThread being called...? Is there more than
    one interpreter being used per thread?

    JJM

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-12-10

    Logged In: YES
    user_id=72656

    OK, just reading through this again, and wanted to make a
    few comments that anyone can address.

    It should be possible to use the same display from multiple
    threads that have Tk running.

    If you have created a toplevel in a Tk thread, it should only be
    accessed by the thread in which it was created. This is
    because Tcl uses apartment model threading (like Perl). A
    toplevel or widget that is created in one Tk thread only exists
    in that thread.

    Tkinter should not break the above truism without marshalling
    into the right thread.

    Tk shouldn't simply crash, even if called incorrectly. It may
    panic, but with a meaningful value. Preferable is that "it just
    works", or a Tcl-level error is returned.

    There is one interpreter per thread in the basic Tcl model.
    You can have multiple interps in a thread without problem, but
    to have multiple threads operating in one interp requires
    marshalling.

    Python and Tcl's threading models are not immediately
    compatible. Perl and Tcl's are near compatible.

    JJM's issue is somewhat tangential, and not valid. He is
    talking more about reuse of Tcl after finalization without
    having reloaded the dlls. If you are not using a threaded Tcl
    build, then calling Tcl_FinalizeThread may have similar effects
    to Tcl_Finalize, but that is expected (because your build isn't
    thread-enabled).

    The building of Tkinter to use a threaded tcl/tk is news to me,
    although I have advocated it before. I have worked through
    problems with Guido and Tim before regarding the "standard"
    build of Tkinter which is non-threaded 8.3 where python slices
    time for Tcl's event handling. When I looked at that code
    (p2.1), it was clear that a different model would be needed if a
    threaded Tcl/Tk were used. Is that the case here?

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-12-10

    Logged In: YES
    user_id=72656

    Let me also add that if you wanted it, you should be able to
    have the object exist in any thread, but it should marshall
    correctly to the thread in which it was created for actual work.

     
  • Martin v. Löwis

    Logged In: YES
    user_id=21627

    It would be very desirable if Tcl supported free threading.
    As it stands, multi-threaded Tcl just can't be used with Python.

     
  • Donal K. Fellows

    Logged In: YES
    user_id=79902

    Why on earth should Tcl support free threading? It's a
    source of bugs more often than not, and where it isn't
    that's usually because the code is laden with vast numbers
    of locks and you take a performance hit for each. We're
    apartment threaded and not about to change; there's more
    than enough other bugs and misfeatures to fix first... :^/

    IMHO, what with the difference between Python and Tcl's
    thread models, any failure by Tkinter to do the cross-thread
    marshalling is a bug in Tkinter and not in either Tcl or
    Tk. (A bug in Tcl or Tk would be if it is impossible to
    create a Tkinter which does this because of something
    horrible in the depths.) It's not like cross-thread
    marshalling is all that hard to do, after all; build the Tcl
    code to execute, dispatch it to the Tcl/Tk thread for
    execution at a safe point, and wait for the results to come
    back.
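
The marshalling steps described above (build the work, dispatch it to the owning thread, wait for the result) can be sketched as follows. This is not how Tkinter implements it; `OwnerThread` is a hypothetical class, and a plain queue loop stands in for the Tcl/Tk event loop.

```python
import queue
import threading


class OwnerThread:
    """Runs callables on one dedicated thread (stand-in for the Tcl/Tk thread)."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Stand-in for the event loop: handle one request per iteration.
        while True:
            func, args, reply = self._requests.get()
            if func is None:
                break  # stop() was called
            try:
                reply.put((True, func(*args)))
            except Exception as exc:
                # Hand the error back to the calling thread.
                reply.put((False, exc))

    def call(self, func, *args):
        """Run func(*args) on the owner thread and wait for its result."""
        if threading.current_thread() is self._thread:
            return func(*args)  # already in the right thread
        reply = queue.Queue(maxsize=1)
        self._requests.put((func, args, reply))
        ok, value = reply.get()  # block until the owner thread answers
        if not ok:
            raise value
        return value

    def stop(self):
        """Ask the owner thread to exit and wait for it."""
        self._requests.put((None, (), None))
        self._thread.join()
```

Because `call` blocks until the reply arrives, the caller still receives the command's result synchronously, which addresses the earlier objection that execution cannot simply be deferred.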

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-12-18

    Logged In: YES
    user_id=72656

    This was corrected by Martin v Loewis on the Tkinter side by
    correctly marshalling the necessary bits.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-12-18
    • status: open --> closed
     
  • Chris Waters

    Chris Waters - 2002-12-18

    Logged In: YES
    user_id=25775

    There's still a minor bug -- it shouldn't crash if the API
    was called incorrectly (as it was by tkinter). TkGetDisplay
    can return NULL, but Tk_FreeGC doesn't expect it to.
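
As an illustration of the remaining point, a caller can check the lookup result and report an error rather than use a missing entry. The sketch below is hypothetical throughout: `free_gc`, `DisplayNotFound` and the `displays` mapping are stand-ins for illustration and do not reflect Tk's actual data structures.

```python
class DisplayNotFound(RuntimeError):
    """Raised when a display is not registered in the caller's display list."""


def free_gc(displays, display, gc):
    """Release gc, reporting an error instead of using a missing entry.

    displays is a hypothetical mapping from display name to the set of
    graphics contexts allocated on it; it stands in for TkDisplay records.
    """
    entry = displays.get(display)  # plays the role of TkGetDisplay
    if entry is None:
        raise DisplayNotFound(f"display {display!r} is unknown in this thread")
    entry.discard(gc)
```

The failure then surfaces as an error the embedder can handle, rather than as a crash.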

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2002-12-18

    Logged In: YES
    user_id=72656

    The Tk_FreeGC issue should be opened as a separate item
    (to not clutter this report).