Mathew, let me try to explain again what the problem is, along with a possible operation model.

First of all, a reminder of how I name things (which may not be compatible with usual standards :-). Compiled programs may receive "interrupts" or "signals", which may originate from the operating system or from other situations. I see only these scenarios for interrupts:

1* Inter-process communication.
2* Inter-thread communication
3* Serious computation errors: SIGSEGV, SIGBUS
4* Not so serious errors: SIGFPE
5* Interruption of code as with Ctrl-C. I see these reasons:
  6- Interrupt erroneous code with infinite loops
  7- Interrupt deadlocked code
  8- I/O operations that take too long
  9- Interrupt as a way to debug / inspect a thread
(Feel free to add more to the list)

The problem begins when deciding how the program reacts to these interrupts. C libraries typically allow two ways of working:

i) the interrupt is delivered at any time, user code is stopped and an appropriate handler is executed. Since the interrupt may happen almost at any part of the code, the signal handler can only perform simple tasks that do not conflict with whatever was being done. In particular many resources (locks, files, etc) may be left in an inconsistent state during the signal.

ii) the program has a thread that waits for those interrupts. In this case it is like reading a list of events from a file. Things are safe and ok for handling, but not all interrupts can be waited for (see 3, 4 or the group 5).
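
As an illustration of (ii), here is a minimal sketch using the POSIX calls pthread_sigmask() and sigwait(). The function names block_waitable_signals and wait_for_next_signal, and the choice of SIGINT and SIGUSR1, are only assumptions for the example; nothing here is existing ECL code.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>

/* Block the signals we intend to wait for. Call this early in the main
 * thread: threads created afterwards inherit the mask, so none of them
 * receives these signals asynchronously. The set of signals is only an
 * example. Returns 0 on success. */
int block_waitable_signals(sigset_t *set)
{
    sigemptyset(set);
    sigaddset(set, SIGINT);
    sigaddset(set, SIGUSR1);
    return pthread_sigmask(SIG_BLOCK, set, NULL);
}

/* One step of the listener thread's loop: sleep until one of the blocked
 * signals is pending and return its number, or -1 on error. This is
 * ordinary code, not a handler, so the caller may take locks, allocate
 * memory, and so on. */
int wait_for_next_signal(const sigset_t *set)
{
    int signo = 0;
    if (sigwait(set, &signo) != 0)
        return -1;
    return signo;
}
```

A dedicated thread would call wait_for_next_signal() in a loop and dispatch on the result. Synchronously generated signals such as SIGSEGV or SIGFPE cannot sensibly be handled this way, which is the limitation mentioned above.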

Let us, as an exercise, assume that ECL runs with most interrupts disabled. In other words, the signal handlers in an ECL thread can only perform trivial tasks and we have an optional thread implementing what point (ii) above says.

The first two situations (1, 2, typically implemented via INTERRUPT-PROCESS) can be eliminated or "enforced out". There are better ways to do inter-process communication than signals, and most kinds of such signals can be automatically translated into other communication means (SIGPIPE -> errno, user signal -> pipe message or socket...).
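
To make the "SIGPIPE -> errno" translation concrete, here is a small sketch: once SIGPIPE is ignored, a write to a pipe with no reader fails with EPIPE instead of raising a signal. The helper names ignore_sigpipe and errno_of_write_to_closed_pipe are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Ignore SIGPIPE process-wide, so that a broken pipe is reported through
 * the return value and errno of write(). Returns 0 on success. */
int ignore_sigpipe(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPIPE, &sa, NULL);
}

/* Demonstration only: write one byte to a pipe whose read end is already
 * closed and return the resulting errno (0 if the write unexpectedly
 * succeeded, -1 if the pipe could not be created). */
int errno_of_write_to_closed_pipe(void)
{
    int fds[2];
    int result = 0;
    if (pipe(fds) != 0)
        return -1;
    close(fds[0]);                      /* nobody will ever read */
    if (write(fds[1], "x", 1) < 0)
        result = errno;                 /* EPIPE when SIGPIPE is ignored */
    close(fds[1]);
    return result;
}
```

The caller then treats EPIPE like any other I/O error, and no handler is involved at all.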

Situation 3 is serious and should be handled accordingly, for the affected thread may not continue to execute normally. Possible responses are:
a) suspending the thread and opening a new thread with a debugger
b) jumping to an outer point of code (unsafe)
c) killing the thread
Out of these, b) and c) are deemed unsafe, but we are already on muddy ground when a SIGSEGV is delivered.

Case 4 can be handled similarly to 3, but we can add one more option, "d) ignore floating point signals and continue", which is safe and ok.

Case 8 is ok. Getting an interrupt delivered during I/O operations is safe. We may enforce I/O operations to abort on receiving an interrupt even without using signal handlers. Calls to READ or PRINT will recognize that the I/O operation failed, look at the list of pending interrupts and invoke the appropriate error handlers.
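
A sketch of how a blocking read can be made to abort on an interrupt. As an assumption of mine, it uses a trivial handler installed without SA_RESTART, so read() returns -1 with errno set to EINTR; it does not attempt the handler-free variant mentioned above. The names interrupt_pending, install_interrupting_handler and interruptible_read are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Stand-in for the "list of pending interrupts": a single flag. */
static volatile sig_atomic_t interrupt_pending = 0;

static void note_interrupt(int signo)
{
    (void)signo;
    interrupt_pending = 1;              /* the only thing the handler does */
}

/* Install the handler WITHOUT SA_RESTART, so blocking system calls are
 * interrupted rather than transparently restarted. Returns 0 on success. */
int install_interrupting_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = note_interrupt;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Like read(), but sets *interrupted to 1 (and clears the pending flag)
 * when the call failed because a signal arrived. The caller, e.g. the
 * implementation of READ, would then invoke the proper error handler. */
ssize_t interruptible_read(int fd, void *buf, size_t len, int *interrupted)
{
    ssize_t n = read(fd, buf, len);
    *interrupted = 0;
    if (n < 0 && errno == EINTR && interrupt_pending) {
        interrupt_pending = 0;
        *interrupted = 1;
    }
    return n;
}
```

The interesting work happens after read() has returned, in normal code, so nothing is left in an inconsistent state by the handler.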

Case 9 is also simple. One may reserve a signal to indicate thread suspension. In that case the signal handler is simple and just waits for a "resume" signal, allowing another thread to inspect its environment and gather information. This can be done in a POSIX-compatible way if the debugger does not want to "inject" or execute additional code in the suspended thread.
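
A rough sketch of such a suspend/resume protocol, assuming SIGUSR1 is reserved for "suspend" and SIGUSR2 for "resume" (both choices, and all function names, are mine). For simplicity the flags are global, so the sketch only copes with one suspended thread at a time.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t resume_requested = 0;
static volatile sig_atomic_t suspend_count = 0;

static void resume_handler(int signo)
{
    (void)signo;
    resume_requested = 1;
}

/* Runs in the thread being suspended. It only calls sigsuspend(), which
 * is async-signal-safe, and parks the thread until a resume arrives. */
static void suspend_handler(int signo)
{
    sigset_t wait_mask;
    int saved_errno = errno;
    (void)signo;
    suspend_count++;
    sigfillset(&wait_mask);             /* while parked, block everything... */
    sigdelset(&wait_mask, SIGUSR2);     /* ...except the resume signal       */
    while (!resume_requested)
        sigsuspend(&wait_mask);
    resume_requested = 0;
    errno = saved_errno;
}

/* Install both handlers. SIGUSR2 is blocked while suspend_handler runs,
 * so a resume cannot slip in between the flag test and sigsuspend().
 * Returns 0 on success. */
int install_suspend_protocol(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = resume_handler;
    if (sigaction(SIGUSR2, &sa, NULL) != 0)
        return -1;
    sigaddset(&sa.sa_mask, SIGUSR2);
    sa.sa_handler = suspend_handler;
    return sigaction(SIGUSR1, &sa, NULL);
}

int times_suspended(void)
{
    return (int)suspend_count;
}
```

The inspecting thread would send the suspend signal with pthread_kill(), look at whatever it needs, and then send the resume signal the same way.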

Cases 6 and 7 are more complicated. The problem with infinite loops and deadlocks (mutexes that wait forever) is that we would like to be able to break the offending code (as with Ctrl-C) without quitting the lisp image. This means we need a way to stop a thread, typically forcing it to jump to an outer point. There are various ways to implement such a SIGINT handler:
a* The SIGINT handler always jumps to an outer point in the lisp code.
b* Similar to "a" but only when the function is marked interruptible.
c* Similar to "a" but the thread is paused and a debugger is started in a separate thread, from which we can decide whether to jump to an outer point.
d* The SIGINT handler queues the interrupt until it is explicitly checked for.
Only the last alternative is POSIX-compliant, but it is very costly, because it forces us to add interrupt checks every now and then, as in GOTOs, and does not solve the problem of deadlocks.
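
For reference, alternative "d" could look roughly like the sketch below. The handler only records the interrupt, and the running code polls for it at chosen points, here once per loop iteration. All names are hypothetical, and as noted above this does nothing for a thread stuck waiting on a mutex, because such a thread never reaches a check.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t interrupt_pending = 0;

static void queue_interrupt(int signo)
{
    (void)signo;
    interrupt_pending = 1;              /* record it, do nothing else */
}

/* Returns 0 on success. */
int install_queueing_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = queue_interrupt;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* The explicit check the compiler or runtime would have to insert every
 * now and then. Returns 1 and clears the flag if an interrupt is queued. */
int check_pending_interrupt(void)
{
    if (interrupt_pending) {
        interrupt_pending = 0;
        return 1;
    }
    return 0;
}

/* Stand-in for erroneous user code with an infinite loop. The check at
 * the top of each iteration is the only way out. Returns the number of
 * iterations completed before the interrupt was noticed. */
unsigned long spin_until_interrupted(void)
{
    unsigned long iterations = 0;
    for (;;) {
        if (check_pending_interrupt())
            break;                      /* here one would unwind to an outer point */
        iterations++;
    }
    return iterations;
}
```

The cost mentioned above is visible here: every iteration pays for a load and a branch even when no interrupt ever arrives.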

So it seems it would be possible to execute ECL threads that run with interrupts mostly disabled. Signal handlers would do very little, and only in the undesirable situations would they allow jumping to outer parts of the code or canceling the thread (unwinding any possible operations), but that would be done by placing the burden of possible side effects on the user.

This would have a couple of positive side effects. One is that it would make coding a lot simpler. Most of ECL right now is not async-signal-safe and it will probably never be. Lisp code also can't be async-safe. Instead of revisiting all the code and filling it with ecl_disable_interrupt() calls, which are costly, we would get a cleaner Lisp where everything is assumed to run properly, except in weird situations.

It might also help us think of simpler ways to integrate ECL with foreign signal handlers, especially when embedding -- since ECL threads do not expect signals, or only serious ones with a specific protocol (unwind or exit), it would make embedders' lives easier.

Juanjo

--
Instituto de Física Fundamental, CSIC
c/ Serrano, 113b, Madrid 28006 (Spain)
http://juanjose.garciaripoll.googlepages.com