---------- Forwarded message ----------
From: Nils Bruin <nbruin@cecm.sfu.ca>
Date: Sat, Feb 26, 2011 at 3:34 AM
Subject: Re: [Ecls-list] Making an external signal handler play nice with embedded ECL
To: Juan Jose Garcia-Ripoll <juanjose.garciaripoll@googlemail.com>

Dear Juanjo,

Thank you! That is really useful feedback. The *current* way of running ECL code inside sage is to not enable signals at all, so that's safe.
I did not quite understand the mprotect trick at first, but I think I do now. Let me verify:
 - If the ECL signal handler finds that the interrupted code is not "safe", then the handler stores the signal information and marks "env" read-only.
 - The next time "env" is accessed, a SIGSEGV will be triggered, which the ECL signal handler recognises as being triggered by the mprotected region.

Q: Is it guaranteed that this will happen in a "safe" area, or is it possible that the signal handler discovers that it is still not safe and hence leaves the queue as is?

 - If safe, the handler now processes the queued signals and we're done.
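
The sequence above can be modelled in a few lines of Python. This is only a toy model of the control flow, not ECL code: the `safe` flag and the explicit `safe_point()` call are stand-ins for ECL's check of the interrupted code and for the SIGSEGV raised by touching the mprotected "env". No real mprotect is involved, and the names `DeferredSignals`, `handler`, `process` and `safe_point` are invented for illustration.

```python
import signal


class DeferredSignals:
    """Toy model: queue signals that arrive in unsafe code, replay them later."""

    def __init__(self):
        self.safe = True      # hypothetical flag: is the interrupted code "safe"?
        self.pending = []     # signals stored while unsafe
        self.handled = []     # signals actually processed, in order

    def handler(self, signum, frame):
        if self.safe:
            self.process(signum)
        else:
            # Stand-in for "store the signal information and mark env read-only".
            self.pending.append(signum)

    def process(self, signum):
        self.handled.append(signum)

    def safe_point(self):
        # Stand-in for the SIGSEGV triggered by the next access to "env":
        # the code is safe again, so drain the queue in arrival order.
        self.safe = True
        while self.pending:
            self.process(self.pending.pop(0))


d = DeferredSignals()
d.safe = False
d.handler(signal.SIGINT, None)   # simulated delivery while unsafe: queued
d.safe_point()                   # simulated trap at a safe point: replayed
```

The model assumes the trap always lands in a safe area, so it does not settle the question above about what happens if the handler finds the code still unsafe.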

I think I have a reasonably robust solution now, based on swapping the SIGINT and SIGSEGV handlers between ECL's own and the sage handlers.

[it's on http://trac.sagemath.org/sage_trac/ticket/10818 on the off chance you want to refer to the code/point other people to solutions]
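
The general shape of such a handler swap can be sketched in Python. This is not the code from the ticket, which works at the C level on SIGINT and SIGSEGV; here `swapped_handlers` and `embedded_handler` are hypothetical names, and SIGUSR1 is used as a harmless stand-in signal.

```python
import signal
from contextlib import contextmanager


@contextmanager
def swapped_handlers(handlers):
    """Install `handlers` ({signum: handler}) and restore the old ones on exit."""
    saved = {}
    try:
        for signum, handler in handlers.items():
            # signal.signal returns the previously installed handler.
            saved[signum] = signal.signal(signum, handler)
        yield
    finally:
        for signum, old in saved.items():
            # None means the old handler was not installed from Python;
            # falling back to SIG_DFL is an assumption of this sketch.
            signal.signal(signum, old if old is not None else signal.SIG_DFL)


def embedded_handler(signum, frame):
    """Hypothetical stand-in for the embedded interpreter's handler."""


with swapped_handlers({signal.SIGUSR1: embedded_handler}):
    inside = signal.getsignal(signal.SIGUSR1)   # embedded_handler is active here
after = signal.getsignal(signal.SIGUSR1)        # previous handler is back
```

The `finally` clause restores the outer handlers even if the wrapped call raises. Every entry and exit costs one handler installation per signal.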

It still leaves a small window where ECL's handlers are in place but ECL's environment isn't yet. If it were possible to set the ECL signal handlers to "queue, don't handle" before entering ECL and to enable them once the ECL environment is in place, it might be possible to remove this one glitch.
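
One OS-level approximation of "queue, don't handle" is to block the signal while the environment is being set up and unblock it afterwards. The Python sketch below only illustrates the POSIX signal-mask mechanism; it is not an existing ECL hook, `on_usr1` is a hypothetical handler, and SIGUSR1 stands in for the signals of interest.

```python
import signal

received = []


def on_usr1(signum, frame):
    received.append(signum)


signal.signal(signal.SIGUSR1, on_usr1)

# "Queue, don't handle": block the signal so the kernel keeps it pending.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
signal.raise_signal(signal.SIGUSR1)
queued = signal.SIGUSR1 in signal.sigpending()   # pending, handler not yet run
ran_while_blocked = list(received)

# "Enable": unblock once the environment is ready; the handler now runs.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
```

A caveat of this approach is that standard signals are not counted while blocked: several SIGINTs arriving during the window collapse into a single pending one.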

Also, this solution would probably not work very well for lots of very small calls into ECL, due to the overhead of all the sigaction calls. I don't think this is much of an issue at the moment, but if it were, I imagine the proper solution would be to make sage's signal handler ECL-aware and take appropriate action there (hooks in the ECL API would help with that).

In fact, ECL does seem to have a small race condition itself (as most programs seem to have): If I run ECL (stand-alone) and truly bombard it with SIGINTs, I get a segfault.

By "bombarding" with SIGINTs I mean

  while true; do kill -INT $pid; done

python survives that test; sage doesn't. I think a lot of programs don't.

By the way, I originally intended to send this whole thread to the ECL mailing list. Feel free to forward or summarize there if you think it is of general interest.

Kind regards,

Nils Bruin

Instituto de Física Fundamental, CSIC
c/ Serrano, 113b, Madrid 28006 (Spain)