From: Bruno H. <br...@cl...> - 2017-12-07 23:17:23
Hi all,
In 2008-2012, I couldn't give much input or feedback regarding the
multithreading implementation. But now, here are 3 ideas for improvement
that I collected over the last few months.
Vladimir: I greatly appreciate your hard work in this area. The GC and
signal handling changes, especially, are among the most difficult things
any hacker can tackle.
----------------------------------------------------------------------
1) The philosophy of clisp and the philosophy of multithread support.
I find that it is a must that:
Simultaneous access to the same object from different threads must,
by default, lead to an error instead of a crash.
Why?
* This is the basic philosophy of clisp: Simple programming mistakes
lead to errors, not crashes. (And with decent error messages, in most
cases.)
While some implementations that compile to native code have default
optimization settings that will allow a 'dotimes' loop, for example,
to crash if the number is a bignum, clisp's philosophy implies that
it does not optimize away type checks if that could lead to a crash.
The point of keeping the type information on *all* objects, at runtime,
is to be able to diagnose a type error in *all* situations. This is
a distinguishing feature of clisp.
* The philosophy of C, on the opposite side, is: Simple programming
mistakes lead to crashes. This is the consequence of maximum optimization.
When you program in C, you know that every time you make a small mistake,
you risk a crash.
* [1]: "If you want to shoot yourself, it is your responsibility to wear armor."
This means that the philosophy of the multithreading support currently
is the C philosophy.
* I wish to have multithreading support with the clisp philosophy, not the
C philosophy.
How to implement it?
*Every object access* will have to be protected by taking a lock.
For example,
LISPFUNNR(car,1)
{
VALUES1(car(popSTACK()));
}
will be rewritten to
LISPFUNNR(car,1)
{
var object argument = popSTACK();
RDLOCK(argument);
var object result = car(argument);
RDUNLOCK(argument);
VALUES1(result);
}
RDLOCK(object) may signal an error:
"The object ~S is currently locked by thread ~S."
To this effect, *every object* will have, in its header, information
about which thread is currently holding a write-lock on the object, or
how many threads are holding a read-lock on the object. I believe this
can be done through a 32-bit field. (This field contains less information
than a "mutex" or "lock" in usual OS programming. Here RDLOCK will never
wait: it will either signal an error or proceed. It will never put the
current process in a "wait queue".)
There is literature that explains how to implement efficient locking,
e.g. "biased locking" [2].
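
As a rough sketch of such a non-waiting lock word, assuming C11
<stdatomic.h> is available: the header type obj_header_t, the bit layout,
and the function names try_rdlock / try_wrlock are all hypothetical and are
not taken from the clisp sources. A failed attempt returns false, which is
the point where the real RDLOCK would signal the Lisp error.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the 32-bit lock word (illustration only):
     0                : unlocked
     1 .. 0x7FFFFFFF  : number of threads holding a read-lock
     0x80000000 | id  : write-locked by thread number `id` */
#define WRITER_BIT UINT32_C(0x80000000)

typedef struct { _Atomic uint32_t lockword; } obj_header_t;

/* Try to take a read-lock. Never waits for a writer: returns false
   immediately if the object is write-locked. */
bool try_rdlock(obj_header_t *h) {
  uint32_t old = atomic_load_explicit(&h->lockword, memory_order_relaxed);
  for (;;) {
    if (old & WRITER_BIT) return false;       /* held by a writer */
    if (old == WRITER_BIT - 1) return false;  /* reader count is full */
    if (atomic_compare_exchange_weak_explicit(&h->lockword, &old, old + 1,
                                              memory_order_acquire,
                                              memory_order_relaxed))
      return true;
    /* The CAS failed and reloaded `old`; re-check and retry. */
  }
}

void rdunlock(obj_header_t *h) {
  uint32_t prev =
      atomic_fetch_sub_explicit(&h->lockword, 1, memory_order_release);
  assert(prev != 0 && !(prev & WRITER_BIT)); /* must have held a read-lock */
  (void)prev;
}

/* Try to take the write-lock. Succeeds only if nobody holds any lock. */
bool try_wrlock(obj_header_t *h, uint32_t thread_id) {
  uint32_t expected = 0;
  uint32_t desired = WRITER_BIT | (thread_id & (WRITER_BIT - 1));
  return atomic_compare_exchange_strong_explicit(&h->lockword, &expected,
                                                 desired,
                                                 memory_order_acquire,
                                                 memory_order_relaxed);
}

void wrunlock(obj_header_t *h) {
  uint32_t prev =
      atomic_exchange_explicit(&h->lockword, 0, memory_order_release);
  assert(prev & WRITER_BIT); /* must have held the write-lock */
  (void)prev;
}
```

The only loop is the compare-and-swap retry among concurrent readers; no
thread is ever parked in a wait queue. This sketch does not attempt the
biased-locking optimization.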
----------------------------------------------------------------------
2) The set of target platforms.
At the lowest level, where clisp uses C functions from libc, a multithread-
enabled clisp must use only multithread-safe functions from libc. For
example, readdir_r instead of readdir. Except in cases where we can
guarantee that it's OK.
I don't know what the status on this task is, but I expect that this will
severely limit the set of target platforms for multithreading to glibc and
a few other systems.
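
To illustrate the kind of substitution meant here, a minimal sketch using
strtok_r in place of strtok (a different pair than readdir_r / readdir,
chosen because glibc has deprecated readdir_r since version 2.24). The
helper count_fields is hypothetical. The reentrant variant keeps its state
in a caller-supplied variable instead of a hidden static buffer, so two
threads can tokenize different strings at the same time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the delimiter-separated fields in `line`.
   Note: `line` is modified in place, as with strtok. */
size_t count_fields(char *line, const char *delims) {
  assert(line != NULL && delims != NULL);
  char *save = NULL; /* per-call state, replaces strtok's hidden static */
  size_t n = 0;
  for (char *tok = strtok_r(line, delims, &save); tok != NULL;
       tok = strtok_r(NULL, delims, &save))
    n++;
  return n;
}
```

strtok_r is specified by POSIX, not by ISO C, which is one reason why such
replacements narrow the set of supported platforms.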
----------------------------------------------------------------------
3) Making use of standardized facilities
Since Vladimir wrote the multithreading support, standards have caught
up:
* The '__thread' storage class is now also supported on Windows [3].
* <stdatomic.h> (or std::atomic in C++) provides portable support for
atomics.
I expect these facilities to be better optimized than what we could roll
on our own. Therefore it will be interesting to see to what extent we can
make use of them.
This too will limit the set of target platforms. In the end, I expect
that only modern glibc, Windows, and macOS will be left as target platforms
for multithreading.
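
A small sketch of the two facilities together, assuming a C11 compiler.
The counters and the function note_allocation are hypothetical and only
serve as an illustration. _Thread_local is the C11 spelling; GCC and Clang
also accept '__thread', and MSVC's own spelling is __declspec(thread).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical counters, for illustration only. */
_Thread_local unsigned long my_allocations = 0; /* one copy per thread */
atomic_ulong total_allocations = 0;             /* shared by all threads */

/* Records one allocation by the calling thread and returns the new
   global total as seen by this call. */
unsigned long note_allocation(void) {
  unsigned long prev = atomic_fetch_add_explicit(&total_allocations, 1,
                                                 memory_order_relaxed);
  my_allocations++; /* thread-private, so no synchronization is needed */
  /* Every allocation counted by this thread is also in the global total. */
  assert(my_allocations <= atomic_load(&total_allocations));
  return prev + 1;
}
```

Relaxed ordering is enough here because the counter does not publish any
other data; a real use that hands objects between threads would need
acquire/release ordering.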
----------------------------------------------------------------------
All this is food for the future. No hurry.
Please give yourself 24 hours of thinking before replying. [4]
Bruno
[1] https://clisp.sourceforge.io/impnotes/mt.html#mt-mutable
[2] https://www.cs.princeton.edu/picasso/mats/HotspotOverview.pdf
[3] https://msdn.microsoft.com/en-us/library/6yh4a9k1.aspx
[4] http://phk.freebsd.dk/sagas/bikeshed.html