From: Nikodemus S. <nik...@ra...> - 2009-10-22 15:47:30
2009/10/14 Leslie P. Polzer <sk...@vi...>:

> I must admit that I don't really get this part of WITH-TIMEOUT's documentation:
...
> Usually there are parts of code that need to be protected against interruptions
> and others that are not. Why would one need to make everything uninterruptible?

Because interrupt safety doesn't compose. Even if A and B are interrupt safe, it does not mean that function C calling _only_ A and B is interrupt safe.

Example. First, let's write CALL-WITH-FOO correctly (make it interrupt safe while remaining interruptible):

  (defun call-with-foo (function)
    (let (foo)
      (without-interrupts
        (unwind-protect
             (progn
               ;; GET-FOO may block on a mutex: allow WITH-INTERRUPTS so
               ;; the wait can be interrupted, while maintaining
               ;; interrupt safety for the acquisition.
               (setf foo (allow-with-interrupts (get-foo)))
               ;; Enable interrupts now that we have FOO, since
               ;; user-function may expect to be able to use them.
               ;; Unwinding remains interrupt-safe.
               (with-local-interrupts (funcall function foo)))
          (when foo
            (release-foo foo))))))

...so now CALL-WITH-FOO is good.

...but that is not the end of the story: any caller of CALL-WITH-FOO still needs to worry about interrupts:

  (defun queue-result ()
    (let ((result (call-with-foo (lambda (foo) (pop-result foo)))))
      (enqueue result)
      result))

Assume that POP-RESULT, ENQUEUE, and all SBCL internals involved are interrupt safe. QUEUE-RESULT isn't -- at least not if we assume a destructive POP-RESULT. Should an interrupt arrive and unwind us at any point between the pop and the ENQUEUE call, we have lost the result and cannot ever recover it.

So, you interrupt proof QUEUE-RESULT... and it's still not enough, because its users need to be interrupt proofed as well.

Many people try to use WITH-TIMEOUT to unwind through unknown code (e.g. they stick it in their toplevel query handler to abort a query that takes too long).
For this to be safe every single function that gets called needs to be interrupt safe -- and if your system has any state, chances are that it isn't. Even if you go through the herculean effort of interrupt proofing everything, it is terribly easy to break it again. This is like thread-safety, but worse.

The only safe & sane place to use WITH-TIMEOUT is when you _know_ that all code that runs with interrupts enabled inside the WITH-TIMEOUT is interrupt-proofed, and you handle the timeout condition locally so the asynchronous unwind doesn't escape the lexical contour. I haven't run across one like this in real code yet, though I'm certain such uses exist.

For the other cases (need to unwind through unknown code) you _must_ use synchronous timeouts. That is, functions which may take arbitrary amounts of time must be written to accept a timeout / deadline argument -- which you then check as appropriate.

This is a lot more work than just throwing WITH-TIMEOUT around a call, to be sure! ...it is, however, actually doable, unlike interrupt proofing everything: you only need to deal with blocking sites and potentially bad loops.

(end rant)

I'd really appreciate it if someone knows how to make the WITH-TIMEOUT documentation _scary_ enough that people don't use it to unwind through random stuff. The current statement is both too broad and doesn't really tell how to use it safely.

Cheers,

 -- Nikodemus
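
For illustration only, here is a minimal sketch of the synchronous-timeout approach described above, in portable Common Lisp. Every name in it (DEADLINE-EXCEEDED, SECONDS-FROM-NOW, CHECK-DEADLINE, DRAIN-WORK) is hypothetical and not an SBCL API, and the squaring loop merely stands in for real work. The point it tries to show is that the deadline is checked only at a place the function chooses, so an unwind can only begin where state is consistent.

```lisp
;; Hypothetical names throughout; none of these are SBCL APIs.
(define-condition deadline-exceeded (error) ()
  (:documentation "Signalled synchronously when a deadline has passed."))

(defun seconds-from-now (seconds)
  "Return an absolute deadline, in internal time units, SECONDS from now."
  (+ (get-internal-real-time)
     (round (* seconds internal-time-units-per-second))))

(defun check-deadline (deadline)
  "Signal DEADLINE-EXCEEDED if DEADLINE has passed. NIL means no deadline."
  (when (and deadline (>= (get-internal-real-time) deadline))
    (error 'deadline-exceeded)))

(defun drain-work (items deadline)
  "Square each element of ITEMS, checking DEADLINE once per item.
The check sits at the top of the loop body, before any per-item work
starts, so a timeout never unwinds from the middle of a step."
  (let ((done '()))
    (dolist (item items (nreverse done))
      (check-deadline deadline)
      (push (* item item) done))))
```

A caller would then handle the condition locally, for example:

```lisp
(handler-case (drain-work '(1 2 3) (seconds-from-now 5))
  (deadline-exceeded () :timed-out))
```

Blocking calls inside such a function would need the same treatment: pass the remaining time down to them rather than relying on an asynchronous unwind.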