Thread-safe operations
This patch solves a couple of problems I found in the
library:
- The library is not thread-safe. Callback/class/policy
maps are global variables;
- Objects accessible from Tcl are not destroyed
automatically when an object command or interpreter is
deleted.
The patch adds:
- Per-thread callback maps;
- Tracking of all Tcl accessible objects and automatic
cleanup even if the application is terminated by
the 'exit' command;
- Automatic destruction of an object when an object
command (pXXX) is deleted.
It also fixes several compiler warnings produced by the
Visual C++ 2005 compiler.
My other patch (ID 1403360) is incorporated into
this one.
Logged In: YES
user_id=1136421
Thank you very much for this patch. Parts of it were
included in the latest 1.1.2 release of the library.
There are, however, two things which were not used, even
though they were the main motivation of this patch:
1. The per-thread maps were not added, because this does not
solve multithreading problems. Imagine a program where one
thread prepares the interpreters for later work (by
registering new commands) and separate threads then run
scripts that actually use those commands - with per-thread
maps those "worker" threads would not see the commands that
were registered earlier, because the threads would refer to
different maps than those which were used for command
registration. Note that the maps are already addressed by
the interpreter object. The correct solution for
multithreading (if there is any) would require
synchronization rather than separation of shared data. The
synchronization (preferably using Boost primitives) might be
added as an option in later versions of the library.
2. Cleaning up objects when their associated command is
deleted was not added, because this would interfere with the
concept of "sink" policies. There are functions which take
objects as parameters and take over their further
management, at the same time deleting the command that is
associated with the given object. If, during the command
removal, the objects themselves were automatically destroyed
by some mechanism that is not aware of this special setting,
then the "sink" functions would later end up referring to
objects that had already been destroyed. Another
problem would arise in the embedded scenario (i.e. a Tcl
interpreter within a C++ master program), where some C++
objects can in fact live longer than the Tcl interpreter and
any command in it - then, it would be similarly a bad idea
to automatically clean up objects which actually do not need
to be cleaned up at all, at least as far as Tcl script is
concerned.
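As a rough illustration of point 1 (synchronization rather than
separation), the sketch below keeps a single registry shared by all
threads and guards it with a mutex. It uses std::mutex as a stand-in
for the Boost primitives mentioned above. SharedCallbackRegistry,
InterpId and the other names are hypothetical and are not part of
the library.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for an interpreter identity; the maps in the
// library are addressed by the interpreter object.
using InterpId = int;
using Callback = std::function<int(int)>;

// One registry shared by all threads, guarded by a mutex, instead of
// one registry per thread.
class SharedCallbackRegistry {
public:
    void add(InterpId interp, const std::string& name, Callback cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        callbacks_[interp][name] = std::move(cb);
    }

    // Copies the callback out under the lock and calls it unlocked, so
    // a callback may itself register commands without deadlocking.
    bool invoke(InterpId interp, const std::string& name, int arg,
                int& result) const {
        Callback cb;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto i = callbacks_.find(interp);
            if (i == callbacks_.end()) return false;
            auto c = i->second.find(name);
            if (c == i->second.end()) return false;
            cb = c->second;
        }
        result = cb(arg);
        return true;
    }

private:
    mutable std::mutex mutex_;
    std::map<InterpId, std::map<std::string, Callback>> callbacks_;
};

// A "preparing" thread registers a command; a "worker" thread started
// afterwards can see and use it.
int run_prepare_then_work() {
    SharedCallbackRegistry registry;
    std::thread preparer([&] {
        registry.add(1, "twice", [](int x) { return 2 * x; });
    });
    preparer.join();

    int result = 0;
    bool found = false;
    std::thread worker([&] {
        found = registry.invoke(1, "twice", 21, result);
    });
    worker.join();
    return found ? result : -1;
}
```

With per-thread maps the worker's lookup would fail, because the
command was stored in the preparing thread's map.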
Thank you very much for this patch. It proved that there is
an interest in how the library was structured and implemented.
Looking forward to further collaboration,
Maciej Sobczak
Logged In: YES
user_id=1364061
> Imagine a program where one thread prepares the
> interpreters for later work (by registering new commands)
> and that separate threads run scripts that actually use
> those commands
Tcl restricts this scenario because of the "at most one thread
per interpreter" concept. One has to register a command in
each interpreter, so having a single registration for all
threads does not work. On the other hand, it is possible to
call the same registration routine for each thread.
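A minimal sketch of "the same registration routine for each
thread", assuming a simplified model in which an interpreter is just
a private command table. FakeInterp, register_commands and
worker_body are hypothetical names used only for illustration; no
real Tcl or library calls are shown.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for a Tcl interpreter: a private command table.
struct FakeInterp {
    std::map<std::string, std::function<int(int)>> commands;
};

// The single registration routine, called once for every interpreter.
void register_commands(FakeInterp& interp) {
    interp.commands["twice"] = [](int x) { return 2 * x; };
    interp.commands["square"] = [](int x) { return x * x; };
}

// What each worker thread would run: create its own interpreter,
// register the commands in it, then use them.
int worker_body(int input) {
    FakeInterp interp;          // owned by the calling thread only
    register_commands(interp);  // nothing is shared, so nothing is locked
    return interp.commands.at("square")(input);
}
```

Since every interpreter gets its own table, no data is shared between
threads in this arrangement.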
The problem that I see is that now it is not possible to
pass a reference to an object to another thread even if the
class is registered in both threads.
> If, during the command removal, the objects themselves
> were automatically destroyed by some mechanism that is
> not aware of this special setting, then the "sink"
> functions themselves would be surprised by later
> relating to objects that got destroyed.
Hmm. Calling Tcl_DeleteCommand is not as bad as converting
'pXXX' to a raw pointer and using it. I would say that this
issue and the problem with passing an object reference to
another thread both require some kind of object reference
counting mechanism. What do you think?
> objects can in fact live longer than the Tcl interpreter
> and any command in it - then, it would be similarly
> a bad idea to automatically clean up objects which
> actually do not need to be cleaned up at all, at
> least as far as Tcl script is concerned.
I really believe that automatic cleanup support is a must.
But it can be implemented differently. :-) Let's say the
reference counting mentioned above can help to solve the
problem you described while still providing proper
cleanup.
We can continue this discussion by e-mail. It is much more
comfortable. :-)
Logged In: YES
user_id=1136421
(I continue on this forum, since there are people who might
be interested in this discussion or even join it.)
Considering the threading, the "at most one thread per
interpreter" concept does not prevent many threads from
operating on the same interpreter, one after another. This is
what I meant when describing the scenario with one thread
preparing commands (for *each* interpreter) and later, other
threads running scripts that use those commands. In this
scenario, there are many threads working on each
interpreter (one preparing and one or more working), so it
would be nice if one thread could see commands registered by
another. The TLS singletons would make it unnecessarily
complicated.
And considering ref-counting of objects that have new
commands attached, this is still not satisfactory, because
it assumes that all such objects were allocated on the free
store, and would even impose some particular ref-counting
scheme (even if it's boost::shared_ptr). The problem is that
an object returned by pointer might not be on the free store
at all; it may just as well be an automatic or even a static
object that exists in the C++
code and is exposed to Tcl (as an example, consider the
Singleton pattern). Solving this would require some
additional policy that would help distinguish between those
objects that really need memory management (ref-counted or
other) from those that don't.
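One possible shape for such a policy, sketched under the assumption
that a handle records at creation time whether the binding layer owns
the object. Ownership, make_handle and Widget are hypothetical names.
std::shared_ptr with a no-op deleter is only one way to express the
"borrowed" case and is not something the library prescribes.

```cpp
#include <cassert>
#include <memory>

// Hypothetical policy tag: does the binding layer own the object?
enum class Ownership { Owned, Borrowed };

// Counts live instances so the effect of each policy is observable.
struct Widget {
    static int live;
    Widget() { ++live; }
    ~Widget() { --live; }
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;
};
int Widget::live = 0;

// The deleter stored in the shared_ptr carries the policy: owned
// objects are deleted with the last handle, borrowed ones never are.
template <class T>
std::shared_ptr<T> make_handle(T* object, Ownership policy) {
    if (policy == Ownership::Owned) {
        return std::shared_ptr<T>(object);
    }
    return std::shared_ptr<T>(object, [](T*) {});
}

// Free-store object handed over to the handle: destroyed on release.
int live_after_owned_release() {
    { auto h = make_handle(new Widget, Ownership::Owned); }
    return Widget::live;
}

// Automatic object merely exposed through a handle: it must survive
// the release of the handle.
int live_after_borrowed_release() {
    Widget automatic;
    { auto h = make_handle(&automatic, Ownership::Borrowed); }
    return Widget::live;
}
```

The same "borrowed" policy would cover static objects such as a
Singleton, which must never be deleted by the cleanup code.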
Regards,
Maciej Sobczak
Logged In: YES
user_id=1364061
> Considering the threading, the "at most one thread per
> interpreter" concept does not prevent many threads from
> operating on the same interpreter, one after another.
I think it does. I guess it might be possible to create
an interpreter in the thread A; register commands in this
interpreter in the thread B and then execute the registered
commands again in the thread A. But I can see several
reasons why one shouldn’t do it:
- “This is not the Tcl way”. I got this impression while
reading these two pages:
http://www.tcl.tk/doc/howto/thread_model.html
and http://wiki.tcl.tk/1339. Besides the architectural
issues, it means that the Tcl authors can change the Tcl
implementation in the future so that the trick of using
the same interpreter from different threads will no
longer work.
- I'm not sure that this trick is possible even now.
I didn't check this thoroughly, but I searched through the
Tcl sources for examples of storing data in TLS. There are
31 .c files in the Tcl distribution mentioning
'Tcl_ThreadDataKey'. Again, this doesn't prove anything,
but it suggests that the code is not intended to be called
from different threads even if the calls are synchronized.
> scenario, there are many threads working on each
> interpreter (one preparing and one or more working), so it
> would be nice if one thread can see commands registered by
> another.
Is “outsourcing” command registration to another thread
better than calling the same registration routine for each
interpreter in its own thread? Can you give a more specific
example?
> And considering ref-counting of objects that have new
> commands attached, this is still not satisfactory, because
Yes, you are right. The picture I had in my mind was
something like each interpreter having its own reference to
an object. But it will not work because we cannot always
make assumptions about the object allocation/destruction
policy. So I thought about it for a while and came to the
conclusion that the main problem is passing an object to a
different thread. For this to work:
- first, a command should be registered;
- second, we have to be sure that a reference is still
valid when the second thread is dereferencing it;
- third, and this is a bit of a different story, exposed
methods should take care of synchronization.
I continued thinking and I thought “why do we need to pass
object references between Tcl threads at all?” Tcl supports
sending textual messages from one thread to another and
nothing else (it may be not 100% true, correct me if I'm
wrong). So if Tcl is not keen on doing it, it should be done
in C++. In other words, Tcl threads will exchange object
handles while the C++ part will take care of passing the
actual object.
To be more formal, it will be done in the following way:
- Objects will be local and will be recognized only by
the interpreter that has the object command registered;
- Each thread will have a list of handles - one for each
exposed object;
- An object handle will determine whether object
allocation/destruction is controlled by cpptcl;
- Sinks, constructors and automatic cleanup routines will
use object handles;
- A new destructor policy may be needed. It will destroy
objects created by factory methods;
- Passing objects between interpreters will be the C++
developer's responsibility. It can be something like
a global map of objects identified by a key, and only
the key will be passed by means of Tcl.
Pros:
- No synchronization required - each interpreter will
have its own copy of the data. It is a bit expensive but
the overhead will be small enough;
- Automatic object cleanup;
- Sink and factory policies will work properly.
Cons:
- There is no standard way of passing objects between
threads. Probably, a set of helpers should be written
to cover this area.
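A helper of that kind might look roughly like the sketch below: a
process-wide table of objects identified by a textual key, where only
the key would travel between threads as a Tcl string. ObjectTable,
Account and the "objN" key format are hypothetical. The mutex around
the table is an assumption of this sketch, since the table itself
(unlike the per-interpreter data) is shared between threads.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Example payload. Its members are not synchronized here; exposed
// methods would still have to take care of that themselves.
struct Account {
    int balance = 0;
};

// Hypothetical process-wide table of objects identified by a key.
class ObjectTable {
public:
    // Stores the object and returns a key that is plain text and can
    // therefore be sent to another thread as an ordinary Tcl string.
    std::string publish(std::shared_ptr<Account> object) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::string key = "obj" + std::to_string(next_id_++);
        objects_[key] = std::move(object);
        return key;
    }

    // Resolves a key received from another thread; null if unknown.
    std::shared_ptr<Account> lookup(const std::string& key) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = objects_.find(key);
        if (it == objects_.end()) {
            return nullptr;
        }
        return it->second;
    }

    // Removes the entry; the object dies with its last reference.
    void retire(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        objects_.erase(key);
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, std::shared_ptr<Account>> objects_;
    unsigned long next_id_ = 1;
};

// Simulates the round trip: one side publishes, the other side
// receives only the key and resolves it.
int demo_key_round_trip() {
    ObjectTable table;
    auto account = std::make_shared<Account>();
    account->balance = 100;

    std::string key = table.publish(account);  // sending side
    auto seen = table.lookup(key);             // receiving side
    if (!seen) {
        return -1;
    }
    seen->balance += 5;
    table.retire(key);
    return account->balance;
}
```

Holding the objects through shared_ptr keeps a reference valid while
the second thread is using it, at the cost of fixing one particular
ownership scheme for objects that go through the table.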
Best regards,
Alexey Pakhunov.
Logged In: YES
user_id=1364061
> Tcl supports sending textual messages from one thread
> to another and nothing else (it may be not 100% true,
> correct me if I'm wrong).
It seems that the thread package 2.6 supports more than that.
I need to take a closer look ...