Re: [Sbcl-devel] WITH-TIMEOUT documentation


2009/10/14 Leslie P. Polzer <sk...@vi...>:

> I must admit that I don't really get this part of WITH-TIMEOUT's documentation:
...
> Usually there are parts of code that need to be protected against interruptions
> and others that are not. Why would one need to make everything uninterruptible?

Because interrupt safety doesn't compose. Even if A and B are
interrupt safe, it does not mean that function C calling _only_ A and
B is interrupt safe.

Example. First, let's write CALL-WITH-FOO correctly (make it interrupt
safe while remaining interruptible):

(defun call-with-foo (function)
  (let (foo)
    (without-interrupts
      (unwind-protect
           (progn
             ;; GET-FOO may block on a mutex: allow WITH-INTERRUPTS so
             ;; the wait can be interrupted, while maintaining
             ;; interrupt safety for the acquisition.
             (setf foo (allow-with-interrupts (get-foo)))
             ;; Enable interrupts now that we have FOO, since
             ;; user-function may expect to be able to use them.
             ;; Unwinding remains interrupt-safe.
             (with-local-interrupts (funcall function foo)))
        (when foo
          (release-foo foo))))))

...so now CALL-WITH-FOO is good. ...but that is not the end of the
story: any caller of CALL-WITH-FOO still needs to worry about
interrupts:

(defun queue-result ()
  (let ((result (call-with-foo (lambda (foo) (pop-result foo)))))
    (enqueue result)
    result))

Assume that POP-RESULT, ENQUEUE, and all SBCL internals involved are
interrupt safe. QUEUE-RESULT isn't -- at least not if we assume a
destructive POP-RESULT.

Should an interrupt arrive and unwind us at any point between the pop
and the ENQUEUE call, we have lost the result and cannot ever recover
it.

So, you interrupt-proof QUEUE-RESULT... and it's still not enough,
because its users need to be interrupt-proofed as well.
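
A blunt sketch of what interrupt-proofing QUEUE-RESULT could look like:
run the whole body with interrupts disabled. The definitions of
*RESULTS*, *QUEUE*, CALL-WITH-FOO, POP-RESULT and ENQUEUE below are
hypothetical stand-ins so that the example loads on its own; only
QUEUE-RESULT is the point. Because there is no ALLOW-WITH-INTERRUPTS
around the call, nothing inside can re-enable interrupts, so the wait
in a real GET-FOO would become uninterruptible as well.

```lisp
;; Hypothetical stand-ins (assumptions, not the real functions).
(defvar *results* (list 1 2 3))
(defvar *queue* '())

(defun call-with-foo (function)
  ;; Stand-in for the interrupt-safe CALL-WITH-FOO shown earlier.
  (funcall function '*results*))

(defun pop-result (foo)
  ;; Destructive: once popped, only our caller holds the result.
  (pop (symbol-value foo)))

(defun enqueue (result)
  (push result *queue*)
  result)

(defun queue-result ()
  ;; Interrupts stay disabled from the pop until ENQUEUE has
  ;; returned, so an asynchronous unwind cannot separate the two.
  (sb-sys:without-interrupts
    (let ((result (call-with-foo (lambda (foo) (pop-result foo)))))
      (enqueue result)
      result)))
```

This closes the window between the pop and the ENQUEUE, but whatever
the caller does with the returned value is exposed all over again.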

Many people try to use WITH-TIMEOUT to unwind through unknown code
(e.g. they stick it in their toplevel query handler to abort a query
that takes too long). For this to be safe every single function that
gets called needs to be interrupt safe -- and if your system has any
state, chances are that it isn't. Even if you go through the herculean
effort of interrupt-proofing everything, it is terribly easy to break
it again. This is like thread-safety, but worse.

The only safe & sane place to use WITH-TIMEOUT is when you _know_ that
all code that runs with interrupts enabled inside the WITH-TIMEOUT is
interrupt-proofed, and you handle the timeout condition locally so the
asynchronous unwind doesn't escape the lexical contour. I haven't run
across one like this in real code yet, though I'm certain such uses
exist.
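
A minimal sketch of the shape such a use might take, assuming the body
is nothing but SLEEP (which touches no state of ours) and that the
SB-EXT:TIMEOUT condition is handled right at the call site. The
function name SLEEP-OR-GIVE-UP is made up for illustration.

```lisp
(defun sleep-or-give-up (seconds limit)
  ;; The only code running under the timeout is SLEEP, and the
  ;; handler sits directly around WITH-TIMEOUT, so the asynchronous
  ;; unwind never leaves this function.
  (handler-case
      (sb-ext:with-timeout limit
        (sleep seconds)
        :done)
    (sb-ext:timeout ()
      :timed-out)))

;; (sleep-or-give-up 0.01 1)  => :DONE
;; (sleep-or-give-up 5 0.1)   => :TIMED-OUT
```

As soon as the body calls into anything stateful, the guarantees above
no longer hold.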

For the other cases (need to unwind through unknown code) you _must_
use synchronous timeouts. That is, functions which may take arbitrary
amounts of time must be written to accept a timeout / deadline
argument -- which you then check as appropriate. This is a lot more
work than just throwing WITH-TIMEOUT around a call to be sure! ...it
is, however, actually doable, unlike interrupt-proofing everything: you
only need to deal with blocking sites and potentially bad loops.
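
A rough sketch of the synchronous style in portable Common Lisp. The
names DEADLINE-AFTER, DEADLINE-PASSED-P and PROCESS-ITEMS are
hypothetical, and doubling each item stands in for real work. The loop
checks the deadline between items, so it only ever stops at a point
where its own state is consistent.

```lisp
(defun deadline-after (seconds)
  ;; Absolute deadline in internal time units.
  (+ (get-internal-real-time)
     (round (* seconds internal-time-units-per-second))))

(defun deadline-passed-p (deadline)
  (>= (get-internal-real-time) deadline))

(defun process-items (items deadline)
  ;; Returns three values: the results so far, the items not yet
  ;; processed, and :FINISHED or :TIMED-OUT.
  (let ((done '()))
    (loop
      (when (null items)
        (return (values (nreverse done) nil :finished)))
      ;; Synchronous check: we decide where giving up is safe.
      (when (deadline-passed-p deadline)
        (return (values (nreverse done) items :timed-out)))
      (push (* 2 (pop items)) done))))

;; (process-items (list 1 2 3) (deadline-after 10))
;;   => (2 4 6), NIL, :FINISHED
```

The caller gets back the unprocessed items and can decide what to do
with them; nothing is lost to an unwind it did not ask for.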

(end rant)

I'd really appreciate it if someone knows how to make the WITH-TIMEOUT
documentation _scary_ enough that people don't use it to unwind
through random stuff. The current statement is too broad and doesn't
really tell you how to use it safely.

Cheers,

 -- Nikodemus
