On Friday 23 March 2007 01:30, Nikodemus Siivola wrote:
> I have a proof of concept tree where GC happens outside the
> SIG_MEMORY_FAULT handler (working on x86-64 Linux), which is at least
> good enough to pass all tests...
> Here's what happens:
> 1. Signal handler calls
> signo, info, context);
> 2. This is like arrange_return_to_lisp_function, but instead of going
> to lisp we end up returning from the signal handler to call the
> function signal_tramp with (1) the real handler function, and (2 & 3)
> malloc'ed copies of siginfo and context.
> 3. signal_tramp blocks the signal we are handling in current thread,
Is there a window of time right before this when the signal is not blocked?
> saves copies of siginfo and context on stack and frees the malloced
> memory. It then calls the real handler with the signo, stack
> allocated siginfo and context, and the original signal mask. IF the
> real handler returns, the original signal mask is restored by
> signal_tramp: IF the real handler unwinds, then it is its
> responsibility to reset the sigmask.
> So, we have basically functional signal handlers outside the kernel,
> and are no longer restricted by silly POSIX rules about which
> functions are signal safe -- which we broke all the time. The only
> thing we cannot do is directly frob the context and return to it.
'Async signal safe' may actually be a slight misnomer. I
touched on this in "Some thread & signal safety issues". Functions that
are not async signal safe don't care if they are being reentered
directly from the signal handler or from a trampoline that the signal
handler arranged to be called at the very same point where the signal arrived.
I think this approach in general (i.e. considering all signals, not
just GC) has the same issues as the original, which doesn't really fail
with any frequency that approaches reproducibility either.
Now, for GC the situation is somewhat different. We know that GC is
triggered synchronously by consing. Hence, we can be sure that no
async signal unsafe C code is running at that time in the thread where
the GC is triggered and where we are going to run the GC code. In my
world that means we are safe.
In short, our synchronous signals are nothing to worry about because the
restrictions do not apply to them. So as far as I know, GC is fine as it
is, and true async signals (think SIGALRM, SIGINT) are not helped out by the trampoline.
I've found this page about MPS that seems to share this view:
.threads.async: POSIX (and hence Linux) imposes some restrictions on
signal handler functions (see
design.mps.pthreadext.anal.signal.safety). Basically the rules say the
behaviour of almost all POSIX functions inside a signal handler
is undefined, except for a handful of functions which are known to be
"async-signal safe". However, if it's known that the signal didn't
happen inside a POSIX function, then it is safe to call arbitrary POSIX
functions inside a handler.
.threads.async.protection: If the signal handler is invoked because of
an MPS access, then we know the access must have been caused by client
code (because the client is not allowed to permit access to protectable
memory to arbitrary foreign code [need a reference for this]). In these
circumstances, it's OK to call arbitrary POSIX functions inside the handler.
.threads.async.other: If the signal handler is invoked for some other
reason (i.e. one we are not prepared to handle) then there is less we
can say about what might have caused the SEGV. In general it is not
safe to call arbitrary POSIX functions inside the handler in this case.
> Todo: The initially malloced copies are just a quick hack: I am
> planning to actually copy the siginfo and context directly to stack,
> so (1) we don't need to worry if malloc is signal safe, and (2) we
> don't need to worry about leaking memory due to asynch unwinds.
> Todo: This is also still missing a bit of interrupt protection: we
> still have to worry about asynch unwinds that catch us after we have
> established the new signal mask.
> How does this sound? Am I forgetting something obvious? Does this
> approach make someone more uneasy than running whatnot inside signal
> -- Nikodemus