David Lichteblau <david@...> writes:
> Quoting Kouskoulas, Yanni A. (Yanni.Kouskoulas@...):
>> Anton, are you happy with the Windows solution you have
>> implemented? Is it robust/efficient in your estimation, or are there
>> things about it that you would change? Is there a consensus among the
>> community that this is the right solution/approach?
> I also have various questions regarding this area, and I'm hoping for your
> thoughts on the proper design. :-) May I summarize all those many questions
> rather succinctly as:
> Why is the callback trampoline using fibers?
Well, I tried to explain that, but maybe there was too much noise in my
previous attempt. Briefly:
1. It's much easier to implement foreign thread callbacks when you
switch stacks between Lisp and non-Lisp contexts.
2. User code switching the stack is close to an absolute no-no on Windows.
3. So my only option (after deciding on #1) was to make the OS switch
stacks for me, and hope it does so in a manner compatible with itself.
Fibers are just for that.
If the same design is to be implemented on POSIX, makecontext and
swapcontext are very much like a drop-in replacement for fibers, so it's
only the specific OS interface that is unportable, not the facility.
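
To illustrate the POSIX side, here is a minimal sketch of running a
callback on a separate stack with getcontext, makecontext and
swapcontext. It is not the SBCL code: call_on_stack,
record_stack_address and the static "pending" slot are hypothetical
names made up for this example, and the single static slot means the
sketch is not thread-safe.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <ucontext.h>

typedef void (*callback_fn)(void *);

/* Hypothetical hand-off slot: makecontext only passes int arguments,
   so the function pointer and its argument travel through a static. */
static struct { callback_fn fn; void *arg; } pending;

static void trampoline_entry(void)
{
    /* We are now running on the private stack. */
    pending.fn(pending.arg);
    /* Returning from here resumes the context stored in uc_link. */
}

/* Run fn(arg) on the caller-supplied stack, then switch back.
   Returns 0 on success, -1 on failure. */
int call_on_stack(callback_fn fn, void *arg, void *stack, size_t stack_size)
{
    ucontext_t caller, callee;

    assert(fn != NULL && stack != NULL);
    if (getcontext(&callee) != 0)
        return -1;
    callee.uc_stack.ss_sp = stack;
    callee.uc_stack.ss_size = stack_size;
    callee.uc_link = &caller;      /* where to go when the entry returns */
    pending.fn = fn;
    pending.arg = arg;
    makecontext(&callee, trampoline_entry, 0);

    /* Save the current context in caller and jump to callee. */
    return swapcontext(&caller, &callee);
}

/* Demo callback: stores the address of one of its locals as an integer,
   so the caller can check which stack the callback ran on. */
void record_stack_address(void *arg)
{
    char marker = 0;
    *(uintptr_t *)arg = (uintptr_t)&marker;
}
```

Calling call_on_stack(record_stack_address, &addr, buf, sizeof buf)
with a 64K buffer should leave addr pointing inside buf, which is the
property the trampoline relies on: the callback gets a stack of known
size no matter how little the calling thread had left.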
There could be some other solution that won't switch stacks, and I'd be
interested to see it. However, stack-switching has an advantage, even if
we leave out implementation simplicity: we can be sure that we get a
reasonable amount of stack space for Lisp code. A real-life example is
the SetConsoleCtrlHandler mentioned before: somewhere (perhaps in MSDN,
but I'm not sure) we are warned to expect something like 4K stack space
when the handler is called.
So when I say that I'd be interested to see a non-switching solution,
it doesn't mean that I'll automatically regard it as a better one. Stack
size is important enough that I'm now considering using fibers for
/stack overflow handlers/ as well.
As for performance implications: scheduling and switching fibers happens
entirely in userspace, and is actually fast (swapcontext is known to be
slower on 32-bit x86, because the fiber-switching code doesn't have to
touch FPU context there, but I'm unsure whether the difference will be
noticeable on x64). With that "page protection as GC synchronization
device" trick,
there are no atomic operations on the fast path. Sorting out foreign
thread callbacks from normal callbacks is cheap as well.
I have never used foreign thread callbacks in performance-sensitive
contexts (and it's not easy to come up with a realistic example that
would require something faster than "much faster than thread context
switch", which we have anyway), but I don't expect any noticeable
slowdown compared to normal alien callbacks.
As for things I would change: exactly /nothing/ when it comes to
high-level ideas and OS facilities. There is some code that I could
write in a cleaner way, and some internal (coding) decisions that I
could reconsider (e.g. not folding fiber support into the pthread layer
could make a POSIX port easier. OTOH, the whole thing would become more
complex). But I'm sure I would use fibers and represent them in Lisp
exactly in the manner I did.
Robustness. If it means robustness as in "some obscure usage of
threads in some library doesn't suddenly interfere with our use of
fibers", then I can say I've never seen any problem like this (it was
actually a good surprise: I expected some problems here). If it means
robustness as in "the code is so clean that it's obviously correct", then
it could be better. Historically, there were several obscure bugs in my
additions to the pthread layer, as well as in its uses, but only one of
them was specific to fibers (and, consequently, foreign callbacks), IIRC;
all others were affecting both threads and fibers equally.
As for the assumption that GC moving things is a big problem: not at
all. SBCL has a brute-force solution to the callback problem in general:
it allocates trampoline code in static space and never moves it. If we
wanted to be clever here, it's possible to make callbacks disposable or
moveable on save-lisp-and-die. With the current implementation, however,
callback trampolines are never moved by GC, that's all.
As soon as the Lisp code is entered, GC synchronization becomes a
problem, but it's not a /new/ problem: GC should stop the world before
it runs, and the world includes our used-to-be-foreign thread at this
point.
The only new thing here is that a thread becomes or ceases to be a part
of the "lisp world" several times, maybe very frequently, and it should
be fast. Handling this is about as tricky before safepoints are
available as it becomes trivial after safepoints are ported. With the
traditional way of stopping the world, it's hard to avoid many expensive
things each time a thread enters and leaves that world.
To run multithreaded SBCL on an OS that has no asynchronous signals at all,
we had to solve the very same problem for regular foreign callouts: any
thread stuck within ReadFile effectively leaves the "world" that should
be stopped during GC. And it turned out that once it's solved for
callouts, the solution /just works/ for the other side, i.e. threads
that are yet/already outside a foreign callback, and "lispy" fibers that
are descheduled (conceptually, a descheduled fiber is running a looong
foreign callout to SwitchToFiber). So there is no new problem in foreign
callbacks here, as long as safepoints are ported, stop-the-world is done
without asynchronous signals, and enter/leave synchronization is made
inexpensive on the fast path.
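
For readers unfamiliar with the page-protection trick, the following is
a rough, single-threaded sketch of the idea using POSIX mprotect and a
SIGSEGV handler as stand-ins for the Windows facilities. All names
(safepoint_init, safepoint_poll, safepoint_arm, safepoint_trap_count)
are hypothetical, and this is not how the SBCL runtime is written: a
real handler would park the thread until GC finishes, while this one
only counts the trap and unprotects the page again.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile char *safepoint_page;       /* page touched on the fast path */
static size_t safepoint_page_size;
static volatile sig_atomic_t safepoint_traps;

/* Slow path: entered only while the page is protected.
   Assumption: calling mprotect from a signal handler works on the
   target OS (it does on Linux, but POSIX does not list it as
   async-signal-safe). */
static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    if (info->si_addr != (void *)safepoint_page)
        _exit(70);                           /* a genuine crash, not ours */
    safepoint_traps++;
    /* Stand-in for "wait until GC is done": just reopen the page. */
    mprotect((void *)safepoint_page, safepoint_page_size, PROT_READ);
}

int safepoint_init(void)
{
    struct sigaction sa;
    void *p;

    safepoint_page_size = (size_t)sysconf(_SC_PAGESIZE);
    p = mmap(NULL, safepoint_page_size, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    safepoint_page = p;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Fast path: one plain load, no atomic operations. */
void safepoint_poll(void)
{
    char c;
    assert(safepoint_page != NULL);
    c = *safepoint_page;
    (void)c;
}

/* What the GC side would do to request a stop: protect the page. */
int safepoint_arm(void)
{
    return mprotect((void *)safepoint_page, safepoint_page_size, PROT_NONE);
}

int safepoint_trap_count(void)
{
    return (int)safepoint_traps;
}
```

While the page is readable, safepoint_poll costs a single memory read.
Only after safepoint_arm does the next poll fault and take the slow
path, which is why entering and leaving the "lisp world" can stay cheap
when no GC is pending.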
Regards, Anton Kovalenko <http://www.siftsoft.com/support-sbcl-windows.html>
+7(916)345-34-02 | Elektrostal' MO, Russia