[ This was first submitted to the Debian Bug Tracking
system, see http://bugs.debian.org/171353 for the
original report]
tkinter is the tk support module for python.
The setup is a Debian GNU/Linux system (unstable) with
glibc-2.3.1, tcl/tk-8.4.1 configured with
--enable-threads and python configured with
--enable-threads as well.
After the change to build python with tk8.4.1 instead
of tk8.3, many tkinter based applications crash. One of
them is
pydoc -g
The behaviour is the same with Python versions 2.1 and 2.2.
I don't include the stack trace here because it looks awful
in the SF bug tracker; please see
http://bugs.debian.org/171353
One of the python developers gave the following analysis:
The problem appears to be Tcl's use of thread-local
data. TkGetDisplay is implemented as
TkDisplay *
TkGetDisplay(display)
    Display *display;    /* X's display pointer */
{
    TkDisplay *dispPtr;
    ThreadSpecificData *tsdPtr = (ThreadSpecificData *)
            Tcl_GetThreadData(&dataKey,
            sizeof(ThreadSpecificData));
    for (dispPtr = tsdPtr->displayList; dispPtr != NULL;
            dispPtr = dispPtr->nextPtr) {
        if (dispPtr->display == display) {
            break;
        }
    }
    return dispPtr;
}
If Tcl sees a call in a new thread "coming out of
nowhere", then Tcl_GetThreadData will find that there
is no thread-local data for this thread. It will
allocate a block of memory and zero-initialize it.
Then, displayList for that thread will be NULL, and the
function will return NULL.
NULL is a documented return value for TkGetDisplay, but
apparently, Tk_FreeGC does not expect to get NULL, and
crashes.
I'm not sure how this is supposed to work. To me, the
notion of the TkDisplay being thread-local sounds
inherently broken: In this case, it means that you
can't free a GC in one thread that you have allocated
in another. So I would conclude this to be a bug in Tk.
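
To make the failure mode concrete, here is a minimal Python sketch of the same pattern. It is only an analogy, not Tk code: `_tsd`, `register_display`, `get_display` and `lookup_in_new_thread` are hypothetical names, and `threading.local` stands in for Tcl_GetThreadData. A display registered in one thread is invisible to a lookup made from a fresh thread, which gets the equivalent of NULL.

```python
import threading

# Hypothetical stand-in for Tk's per-thread display list.
_tsd = threading.local()

def register_display(name):
    """Record a display in the calling thread's private list."""
    if not hasattr(_tsd, "displays"):
        _tsd.displays = []
    _tsd.displays.append(name)

def get_display(name):
    """Return the display if this thread knows it, else None."""
    displays = getattr(_tsd, "displays", [])  # a new thread starts empty
    return name if name in displays else None

def lookup_in_new_thread(name):
    """Run get_display in a fresh thread and return what it saw."""
    seen = []
    worker = threading.Thread(target=lambda: seen.append(get_display(name)))
    worker.start()
    worker.join()
    return seen[0]

register_display(":0")
same_thread_result = get_display(":0")            # found
other_thread_result = lookup_in_new_thread(":0")  # not found
```

Any caller that uses the second result without checking it is in the same position as Tk_FreeGC here.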
Logged In: YES
user_id=72656
perhaps related to 614325?
Logged In: YES
user_id=21627
This is the same bug. The submitter said he was using the
API "incorrectly", i.e. from the wrong thread. However, I
believe being forced to use a display always from the same
thread defeats the purpose of using Tcl in multiple threads.
Logged In: YES
user_id=72656
I don't really buy that. Tcl is apartment model threading, and
there are specific APIs for queueing events from one thread
into another. This can be especially important when it comes
to the use of Tk (the UI). There is no guarantee that the
libraries that Tk is based on (X for example) are thread-safe
(in fact, X can very well not be), therefore restrictions have to
be moved higher up.
Logged In: YES
user_id=21627
Can you please point to the API for event queueing?
Suppose I do
.mybutton configure -text hallo
in a different thread, this will trigger this bug. It
appears to me that I cannot invoke *any* Tk commands safely
in a different thread. How am I supposed to use the
queueing API here?
Independent of this question, I think Tcl should not crash,
but report an error if it is used incorrectly.
Logged In: YES
user_id=72656
It is not possible to do
.mybutton configure -text hallo
in another thread, because that command only exists in the
one thread. The confusion may come from the difference in
threading models between Python and Tcl. What you could
do in Tcl is create that same command in as many threads
as you want, but it would be an alias that just sent the
command to the thread with that command in it.
I do agree that this shouldn't crash, but if there is some more
subtle reason behind things not queueing correctly, it is
important to know.
Logged In: YES
user_id=21627
What do you mean by "the command only exists in a thread"?
AFAICT, the command exists in the interpreter, not in a
thread. We pass the command to Tcl_EvalObjv, and the
interpreter happily finds and executes the command; see the
stacktrace in the Debian bug report.
It is in general not possible to defer execution of the
command, since the command may return a result, which the
caller expects to receive immediately.
It is unfortunate that you believe that the libraries Tk is
based on might not be thread-safe. While this is a
theoretical issue, it does not matter in practice: On most
systems where both X and threads are available, X is
thread-safe. If X isn't thread-safe on a system, Tk should
not be compiled with thread support, since the thread
support is then useless, anyway.
It is fine if you restrict event processing (DoOneEvent and
friends) to the thread that has opened the display, since
real systems (e.g. Windows) have such a constraint. It might
also be necessary to defer TkButtonWorldChanged to the main
thread (although I can't see a need for that). However,
applications should be able to manipulate widgets from
multiple threads.
For Python and Tk, it has worked just fine that way for
years when Tcl did not support threads. Now if threads are
enabled, it is unfortunate that you can't use threads anymore.
Logged In: YES
user_id=21627
See bugs.debian.org/170711 for another instance of the
same problem; here, DeleteStressedCmap is the culprit, by
not testing whether the result of TkGetDisplay is NULL.
If this gets "fixed" by reporting a Tcl error instead of
executing the command, without giving any strategy for how
Python's Tkinter (and other embedders) should deal with this
issue, I guess you will produce many unhappy users.
Logged In: NO
bug #614325 and #649209
Ok. I have found the fundamental problem(s) here.
Tk_FreeGC upon entry calls TkGetDisplay. TkGetDisplay
upon entry fetches its ThreadSpecificData. It then tries
to search the list of main windows for the specified
display.
However, the problem in this case is "deeper" down.
The problem is that the dataKey being relied upon by
TkGetDisplay is destroyed and recreated, destroying
along with it all the vital state information it held.
It is destroyed by Tcl_FinalizeThread, which calls
TclFinalizeThreadData. In threaded builds, this should
NOT be a problem, since thread local storage is
automatically keyed off of the thread in addition to the
keyPtr (slot number). However, for non-threaded
builds, the consequences are totally disastrous.
Another related problem that I ran into is that the
internal data structures for the "simulated" thread local
storage are not protected by any locking mechanism. I
realize that this is by design for non-threaded builds.
However, since Tcl's "thread-safety protocol" suggests
that you can always use Tcl interpreters from the thread
they are created on without mentioning any
other "gotchas", I think the code should be changed to
reflect that (via a process wide mutex, in this case).
The "solution" is either to force people to use threaded
builds or to re-implement the "simulated" thread local
storage to key off of the current thread id in addition to
the passed in "key" value.
JJM
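
A rough sketch of the second option, in Python rather than Tcl's C: storage keyed on the current thread id in addition to the caller's key, with one process-wide lock around the table. `SimulatedThreadData`, `get` and `finalize_thread` are hypothetical names chosen for illustration; this is not how Tcl implements it.

```python
import threading

class SimulatedThreadData:
    """Per-thread storage keyed by (thread id, key), guarded by one lock."""

    def __init__(self):
        self._lock = threading.Lock()  # the process-wide mutex
        self._slots = {}

    def get(self, key, factory):
        """Return this thread's block for key, creating it on first use."""
        slot = (threading.get_ident(), key)
        with self._lock:
            if slot not in self._slots:
                self._slots[slot] = factory()
            return self._slots[slot]

    def finalize_thread(self):
        """Drop only the calling thread's blocks; other threads keep theirs."""
        me = threading.get_ident()
        with self._lock:
            for slot in [s for s in self._slots if s[0] == me]:
                del self._slots[slot]

store = SimulatedThreadData()
main_block = store.get("dataKey", dict)
main_block["display"] = ":0"

def worker(seen):
    seen.append(store.get("dataKey", dict))  # fresh, empty block
    store.finalize_thread()                  # removes the worker's block only

seen = []
t = threading.Thread(target=worker, args=(seen,))
t.start()
t.join()
survived = store.get("dataKey", dict) is main_block
```

One caveat with this sketch: Python may reuse a thread id after the thread exits, so a thread has to call `finalize_thread` before it ends.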
Logged In: YES
user_id=25775
Um, the tkinter problem occurs (as it says) with a
thread-enabled build, so I don't think that "forc[ing]
people to use threaded builds" is much of a solution.
Logged In: NO
Is Tcl_FinalizeThread being called...? Is there more than
one interpreter being used per thread?
JJM
Logged In: YES
user_id=72656
OK, just reading through this again, and wanted to make a
few comments that anyone can address.
It should be possible to use the same display from multiple
threads that have Tk running.
If you have created a toplevel in a Tk thread, it should only be
accessed by the thread in which it was created. This is
because Tcl uses apartment model threading (like Perl). A
toplevel or widget that is created in one Tk thread only exists
in that thread.
Tkinter should not break the above truism without marshalling
into the right thread.
Tk shouldn't simply crash, even if called incorrectly. It may
panic, but with a meaningful message. It is preferable that "it
just works", or that a Tcl-level error is returned.
There is one interpreter per thread in the basic Tcl model.
You can have multiple interps in a thread without problem, but
to have multiple threads operating in one interp requires
marshalling.
Python's and Tcl's threading models are not immediately
compatible. Perl's and Tcl's are nearly compatible.
JJM's issue is somewhat tangential, and not valid. He is
talking more about reuse of Tcl after finalization without
having reloaded the dlls. If you are not using a threaded Tcl
build, then calling Tcl_FinalizeThread may have similar effects
to Tcl_Finalize, but that is expected (because your build isn't
thread-enabled).
The building of Tkinter to use a threaded tcl/tk is news to me,
although I have advocated it before. I have worked through
problems with Guido and Tim before regarding the "standard"
build of Tkinter which is non-threaded 8.3 where python slices
time for Tcl's event handling. When I looked at that code
(p2.1), it was clear that a different model would be needed if a
threaded Tcl/Tk were used. Is that the case here?
Logged In: YES
user_id=72656
Let me also add that if you wanted it, you should be able to
have the object exist in any thread, but it should marshal
correctly to the thread in which it was created for actual work.
Logged In: YES
user_id=21627
It would be very desirable if Tcl supported free threading.
As it stands, multi-threaded Tcl just can't be used with Python.
Logged In: YES
user_id=79902
Why on earth should Tcl support free threading? It's a
source of bugs more often than not, and where it isn't
that's usually because the code is laden with vast numbers
of locks and you take a performance hit for each. We're
apartment threaded and not about to change; there's more
than enough other bugs and misfeatures to fix first... :^/
IMHO, what with the difference between Python and Tcl's
thread models, any failure by Tkinter to do the cross-thread
marshalling is a bug in Tkinter and not in either Tcl or
Tk. (A bug in Tcl or Tk would be if it is impossible to
create a Tkinter which does this because of something
horrible in the depths.) It's not like cross-thread
marshalling is all that hard to do, after all; build the Tcl
code to execute, dispatch it to the Tcl/Tk thread for
execution at a safe point, and wait for the results to come
back.
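
For what it's worth, the dispatch-and-wait pattern described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the Tkinter fix itself: `OwnerThread` is a hypothetical helper, and a plain queue stands in for the Tcl/Tk event loop.

```python
import queue
import threading

class OwnerThread:
    """Runs every submitted call on one dedicated thread (hypothetical helper)."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Stand-in for the event loop: each job runs at a "safe point".
        while True:
            job = self._jobs.get()
            if job is None:
                break
            func, args, reply = job
            try:
                reply.put((True, func(*args)))
            except Exception as exc:  # hand errors back to the caller
                reply.put((False, exc))

    def call(self, func, *args):
        """Dispatch func to the owner thread and block until it returns."""
        if threading.current_thread() is self._thread:
            return func(*args)  # already on the owner thread
        reply = queue.Queue(maxsize=1)
        self._jobs.put((func, args, reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def stop(self):
        """Ask the owner thread to exit and wait for it."""
        self._jobs.put(None)
        self._thread.join()

owner = OwnerThread()
owner_id = owner.call(threading.get_ident)  # executed on the owner thread
caller_id = threading.get_ident()
label = owner.call(str.upper, "hallo")      # result comes back to the caller
owner.stop()
```

Because `call` blocks until the reply arrives, the caller still receives the command's result immediately, even though the work happened on another thread.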
Logged In: YES
user_id=72656
This was corrected by Martin v Loewis on the Tkinter side by
correctly marshalling the necessary bits.
Logged In: YES
user_id=25775
There's still a minor bug -- it shouldn't crash if the API
was called incorrectly (as it was by tkinter). TkGetDisplay
can return NULL, but Tk_FreeGC doesn't expect it to.
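
The kind of guard meant here, as a small Python sketch. The names `DisplayNotFound` and `free_gc` and the dictionary layout are made up for illustration and are not Tk's actual data structures.

```python
class DisplayNotFound(Exception):
    """Raised instead of dereferencing a missing display record."""

def free_gc(displays, display_name, gc):
    """Release gc, failing with an error if the display is unknown here."""
    record = displays.get(display_name)  # plays the role of TkGetDisplay
    if record is None:
        raise DisplayNotFound(f"display {display_name!r} is not known here")
    record["gcs"].discard(gc)

displays = {":0": {"gcs": {"gc1"}}}
free_gc(displays, ":0", "gc1")   # normal case: the GC is released

try:
    free_gc({}, ":0", "gc1")     # wrong thread: no record for the display
    outcome = "no error"
except DisplayNotFound as exc:
    outcome = str(exc)
```

The caller gets a reportable error in the second case rather than a crash.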
Logged In: YES
user_id=72656
The Tk_FreeGC issue should be opened as a separate item
(to not clutter this report).