From: Nikodemus S. <nik...@ra...> - 2006-09-23 12:54:03
|
The following patch fixes two potential GC deadlocks caused by dying threads grabbing locks while GC is inhibted, and provides an imperfect test-case for this. * Dying threads used to grab session and all-threads locks with GC inhibited, which was bad: 1. T1 has the lock, GC not inhibited T2 in HANDLE-THREAD-EXIT waiting for the lock, GC inhibited 2. GC is triggered 3. T1 stopped while holding the lock T2 dealocks waiting for T1 to release the lock. * Mark threads dead while holding the *ALL-THREADS-LOCK*, so that (unless (thread-alive-p th) (assert (not (member th (list-all-threads))))) cannot fail. * Move all thread cleanup logic to HANDLE-THREAD exit. Unless there are objections I'll commit this despite the freeze on Sunday. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." Index: src/code/target-thread.lisp =================================================================== RCS file: /cvsroot/sbcl/sbcl/src/code/target-thread.lisp,v retrieving revision 1.60 diff -u -r1.60 target-thread.lisp --- src/code/target-thread.lisp 18 Sep 2006 20:09:13 -0000 1.60 +++ src/code/target-thread.lisp 23 Sep 2006 12:33:20 -0000 @@ -472,16 +472,27 @@ ;;; Remove thread from its session, if it has one. #!+sb-thread (defun handle-thread-exit (thread) + (/show0 "HANDLING THREAD EXIT") + ;; We need grap the locks before inhibiting GC to avoid + ;; GC dealocks -- and since there seems to be no point in + ;; holding the locks after lisp-side has been cleaned up, + ;; we release them too. (with-mutex (*all-threads-lock*) - (/show0 "HANDLING THREAD EXIT") + (setf (thread-%alive-p thread) nil) + (setf (thread-os-thread thread) nil) + (setq *all-threads* (delete thread *all-threads*)) + (when *session* + (%delete-thread-from-session thread *session*))) + ;; We're going down, can't handle interrupts sanely anymore, and + ;; GC needs to be inhibited in order to avoid it seeing dying + ;; threads. + (let ((sb!impl::*gc-inhibit* t)) + (block-blockable-signals) #!+sb-lutex (when (thread-interruptions-lock thread) (/show0 "FREEING MUTEX LUTEX") (with-lutex-address (lutex (mutex-lutex (thread-interruptions-lock thread))) - (%lutex-destroy lutex))) - (setq *all-threads* (delete thread *all-threads*))) - (when *session* - (%delete-thread-from-session thread *session*))) + (%lutex-destroy lutex))))) (defun terminate-session () #!+sb-doc @@ -644,15 +655,7 @@ ;; threads, it's time to enable signals (sb!unix::reset-signal-mask) (funcall real-function)) - ;; we're going down, can't handle - ;; interrupts sanely anymore - (let ((sb!impl::*gc-inhibit* t)) - (block-blockable-signals) - (setf (thread-%alive-p thread) nil) - (setf (thread-os-thread thread) nil) - ;; and remove what can be the last - ;; reference to this thread - (handle-thread-exit thread))))))) + (handle-thread-exit thread)))))) (values)))) ;; Keep INITIAL-FUNCTION pinned until the child thread is ;; initialized properly. Index: tests/threads.impure.lisp =================================================================== RCS file: /cvsroot/sbcl/sbcl/tests/threads.impure.lisp,v retrieving revision 1.43 diff -u -r1.43 threads.impure.lisp --- tests/threads.impure.lisp 13 Sep 2006 21:37:29 -0000 1.43 +++ tests/threads.impure.lisp 23 Sep 2006 12:33:20 -0000 @@ -670,3 +670,42 @@ (wait-for-threads threads))) (format t "backtrace test done~%") + +(format t "~&starting gc deadlock test: WARNING: THIS TEST WILL HANG ON FAILURE!~%") + +(with-test (:name (:gc-deadlock)) + ;; Prior to 0.9.16.46 thread exit potentially deadlocked the + ;; GC due to *all-threads-lock* and session lock. On earlier + ;; versions and at least on one specific box this test is good enough + ;; to catch that typically well before the 1500th iteration. + (loop + with i = 0 + with n = 3000 + while (< i n) + do + (incf i) + (when (zerop (mod i 100)) + (write-char #\.) + (force-output)) + (handler-case + (if (oddp i) + (sb-thread:make-thread + (lambda () + (sleep (random 0.001))) + :name (list :sleep i)) + (sb-thread:make-thread + (lambda () + ;; KLUDGE: what we are doing here is explicit, + ;; but the same can happen because of a regular + ;; MAKE-THREAD or LIST-ALL-THREADS, and various + ;; session functions. + (sb-thread:with-mutex (sb-thread::*all-threads-lock*) + (sb-thread::with-session-lock (sb-thread::*session*) + (sb-ext:gc)))) + :name (list :gc i))) + (error (e) + (format t "~%error creating thread ~D: ~A -- backing off for retry~%" i e) + (sleep 0.1) + (incf i))))) + +(format t "~&gc deadlock test done~%") |
From: <me...@re...> - 2006-09-23 15:13:11
|
On Saturday 23 September 2006 14:53, Nikodemus Siivola wrote: > The following patch fixes two potential GC deadlocks caused by dying > threads grabbing locks while GC is inhibted, and provides an > imperfect test-case for this. > > * Dying threads used to grab session and all-threads locks > with GC inhibited, which was bad: > 1. T1 has the lock, GC not inhibited > T2 in HANDLE-THREAD-EXIT waiting for the lock, GC inhibited > 2. GC is triggered > 3. T1 stopped while holding the lock > T2 dealocks waiting for T1 to release the lock. > * Mark threads dead while holding the *ALL-THREADS-LOCK*, so that > (unless (thread-alive-p th) > (assert (not (member th (list-all-threads))))) > cannot fail. Sure, but what about (assert (thread-alive-p *current-thread*))? See=20 below. > Index: src/code/target-thread.lisp > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > RCS file: /cvsroot/sbcl/sbcl/src/code/target-thread.lisp,v > retrieving revision 1.60 > diff -u -r1.60 target-thread.lisp > --- src/code/target-thread.lisp 18 Sep 2006 20:09:13 -0000 1.60 > +++ src/code/target-thread.lisp 23 Sep 2006 12:33:20 -0000 > @@ -472,16 +472,27 @@ > ;;; Remove thread from its session, if it has one. > #!+sb-thread > (defun handle-thread-exit (thread) > + (/show0 "HANDLING THREAD EXIT") > + ;; We need grap the locks before inhibiting GC to avoid > + ;; GC dealocks -- and since there seems to be no point in > + ;; holding the locks after lisp-side has been cleaned up, > + ;; we release them too. > (with-mutex (*all-threads-lock*) > - (/show0 "HANDLING THREAD EXIT") > + (setf (thread-%alive-p thread) nil) > + (setf (thread-os-thread thread) nil) At this point interrupt can arrive and the handler will be run in a=20 thread that's not alive which is bad. Furthermore gcs triggerred here=20 by for instance this DELETE: > + (setq *all-threads* (delete thread *all-threads*)) will run finalizers in this thread that appears dead but it is still in=20 the all-threads list. > + (when *session* > + (%delete-thread-from-session thread *session*))) > + ;; We're going down, can't handle interrupts sanely anymore, and > + ;; GC needs to be inhibited in order to avoid it seeing dying > + ;; threads. > + (let ((sb!impl::*gc-inhibit* t)) > + (block-blockable-signals) > #!+sb-lutex > (when (thread-interruptions-lock thread) > (/show0 "FREEING MUTEX LUTEX") > (with-lutex-address (lutex (mutex-lutex > (thread-interruptions-lock thread))) - (%lutex-destroy > lutex))) > - (setq *all-threads* (delete thread *all-threads*))) > - (when *session* > - (%delete-thread-from-session thread *session*))) > + (%lutex-destroy lutex))))) > There a few things to do when a thread is exitting: 1) (setf (thread-%alive-p thread) nil) 2) remove it from *all-threads* 3) set its C side state to DEAD 4) clean up some gc tables Observations: I) one must get the C side all_threads_lock for 4). II) 3) must be done after I), else sig_stop_for_gc_handler is confused III) (assert (thread-alive-p *current-thread*) shall hold in user code IV) same for (unless (thread-alive-p t) (assert (not (member th (list-all-threads))))) V) to satisfy III and IV deferrable interrupts must be disabled before 1=20 and 2 VI) gc must not be triggerred after 1 becuase finalizers are then run in=20 a dead thread. But since running finalizers in any user thread is quite=20 problematic anyway because of locks and such, finalizers should =20 probably be in separate thread(s) making this point important. =46rom these I derive the following implementation: * block deferrable interrupts (sig_stop_for_gc stays enabled) * (setf (thread-%alive-p thread) nil) * remove it from *all-threads* Since gc is not inhibited nor is sig_stop_for_gc blocked we won't=20 deadlock due to gc. And if *all-threads-lock* is always acquired with=20 interrupts disabled that's one less thing to worry about. In this schem gc is enabled while on the lisp side: sig_stop_for_gc is=20 blocked just before state=3DDEAD in C. Then finalizers are moved to a separate thread. Cheers,=20 G |
From: Nikodemus S. <nik...@ra...> - 2006-09-23 17:21:11
|
Gábor Melis <me...@re...> writes: >> * Dying threads used to grab session and all-threads locks >> with GC inhibited, which was bad: >> 1. T1 has the lock, GC not inhibited >> T2 in HANDLE-THREAD-EXIT waiting for the lock, GC inhibited >> 2. GC is triggered >> 3. T1 stopped while holding the lock >> T2 dealocks waiting for T1 to release the lock. >> * Mark threads dead while holding the *ALL-THREADS-LOCK*, so that >> (unless (thread-alive-p th) >> (assert (not (member th (list-all-threads))))) >> cannot fail. > > Sure, but what about (assert (thread-alive-p *current-thread*))? See > below. I knew this was too nice to be true. :/ Attached a new patch, using the strategy you outlined, except for a minor kludge: Instead of a separate finalizer thread, which we would then need to deal with in at least GC-AND-SAVE and QUIT, simply skip finalizers and after-gc hooks in GCs triggered by dying threads. I'm not 100% sure why interrupts need to be disabled for *all-threads-lock*, though -- can you expand on that? I would have thought that blocking deferrables after grabbing the lock would be sufficient. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." Index: version.lisp-expr =================================================================== RCS file: /cvsroot/sbcl/sbcl/version.lisp-expr,v retrieving revision 1.3009 diff -u -r1.3009 version.lisp-expr --- version.lisp-expr 20 Sep 2006 18:18:19 -0000 1.3009 +++ version.lisp-expr 23 Sep 2006 17:17:17 -0000 @@ -17,4 +17,4 @@ ;;; checkins which aren't released. (And occasionally for internal ;;; versions, especially for internal versions off the main CVS ;;; branch, it gets hairier, e.g. "0.pre7.14.flaky4.13".) -"0.9.16.45" +"0.9.16.45.gc-deadlocks" Index: src/code/gc.lisp =================================================================== RCS file: /cvsroot/sbcl/sbcl/src/code/gc.lisp,v retrieving revision 1.70 diff -u -r1.70 gc.lisp --- src/code/gc.lisp 5 Jun 2006 12:25:55 -0000 1.70 +++ src/code/gc.lisp 23 Sep 2006 17:17:17 -0000 @@ -139,8 +139,9 @@ ;;;; GC hooks (defvar *after-gc-hooks* nil - "Called after each garbage collection. In a multithreaded -environment these hooks may run in any thread.") + "Called after each garbage collection, except for garbage collections +triggered during thread exits. In a multithreaded environment these hooks may +run in any thread.") ;;;; internal GC @@ -216,12 +217,16 @@ ;; ;; Can that be avoided by having the finalizers and hooks run only ;; from the outermost SUB-GC? - (run-pending-finalizers) - (dolist (hook *after-gc-hooks*) - (handler-case - (funcall hook) - (error (c) - (warn "Error calling after GC hook ~S:~% ~S" hook c))))))) + ;; + ;; KLUDGE: Don't run the hooks in GC's triggered by dying threads, + ;; so that user-code never runs in a dying thread. + (when (sb!thread:thread-alive-p sb!thread:*current-thread*) + (run-pending-finalizers) + (dolist (hook *after-gc-hooks*) + (handler-case + (funcall hook) + (error (c) + (warn "Error calling after GC hook ~S:~% ~A" hook c)))))))) ;;; This is the user-advertised garbage collection function. (defun gc (&key (gen 0) (full nil) &allow-other-keys) Index: src/code/target-thread.lisp =================================================================== RCS file: /cvsroot/sbcl/sbcl/src/code/target-thread.lisp,v retrieving revision 1.60 diff -u -r1.60 target-thread.lisp --- src/code/target-thread.lisp 18 Sep 2006 20:09:13 -0000 1.60 +++ src/code/target-thread.lisp 23 Sep 2006 17:17:17 -0000 @@ -57,10 +57,18 @@ (defvar *all-threads* ()) (defvar *all-threads-lock* (make-mutex :name "all threads lock")) +(defmacro with-all-threads-lock (&body body) + #!-sb-thread + `(locally ,@body) + #!+sb-thread + `(without-interrupts + (with-mutex (*all-threads-lock*) + ,@body))) + (defun list-all-threads () #!+sb-doc "Return a list of the live threads." - (with-mutex (*all-threads-lock*) + (with-all-threads-lock (copy-list *all-threads*))) (declaim (inline current-thread-sap)) @@ -97,7 +105,7 @@ (define-alien-routine "signal_interrupt_thread" integer (os-thread unsigned-long)) - (define-alien-routine "block_blockable_signals" + (define-alien-routine "block_deferrable_signals" void) #!+sb-lutex @@ -472,16 +480,22 @@ ;;; Remove thread from its session, if it has one. #!+sb-thread (defun handle-thread-exit (thread) - (with-mutex (*all-threads-lock*) - (/show0 "HANDLING THREAD EXIT") - #!+sb-lutex - (when (thread-interruptions-lock thread) - (/show0 "FREEING MUTEX LUTEX") - (with-lutex-address (lutex (mutex-lutex (thread-interruptions-lock thread))) - (%lutex-destroy lutex))) - (setq *all-threads* (delete thread *all-threads*))) - (when *session* - (%delete-thread-from-session thread *session*))) + (/show0 "HANDLING THREAD EXIT") + ;; We're going down, can't handle interrupts sanely anymore. + ;; GC remains enabled. + (block-deferrable-signals) + ;; Lisp-side cleanup + (with-all-threads-lock + (setf (thread-%alive-p thread) nil) + (setf (thread-os-thread thread) nil) + (setq *all-threads* (delete thread *all-threads*)) + (when *session* + (%delete-thread-from-session thread *session*))) + #!+sb-lutex + (when (thread-interruptions-lock thread) + (/show0 "FREEING MUTEX LUTEX") + (with-lutex-address (lutex (mutex-lutex (thread-interruptions-lock thread))) + (%lutex-destroy lutex)))) (defun terminate-session () #!+sb-doc @@ -621,7 +635,7 @@ (sb!impl::*internal-symbol-output-fun* nil) (sb!impl::*descriptor-handlers* nil)) ; serve-event (setf (thread-os-thread thread) (current-thread-sap-id)) - (with-mutex (*all-threads-lock*) + (with-all-threads-lock (push thread *all-threads*)) (with-session-lock (*session*) (push thread (session-threads *session*))) @@ -644,15 +658,7 @@ ;; threads, it's time to enable signals (sb!unix::reset-signal-mask) (funcall real-function)) - ;; we're going down, can't handle - ;; interrupts sanely anymore - (let ((sb!impl::*gc-inhibit* t)) - (block-blockable-signals) - (setf (thread-%alive-p thread) nil) - (setf (thread-os-thread thread) nil) - ;; and remove what can be the last - ;; reference to this thread - (handle-thread-exit thread))))))) + (handle-thread-exit thread)))))) (values)))) ;; Keep INITIAL-FUNCTION pinned until the child thread is ;; initialized properly. Index: src/runtime/interrupt.c =================================================================== RCS file: /cvsroot/sbcl/sbcl/src/runtime/interrupt.c,v retrieving revision 1.113 diff -u -r1.113 interrupt.c --- src/runtime/interrupt.c 7 Jun 2006 16:25:10 -0000 1.113 +++ src/runtime/interrupt.c 23 Sep 2006 17:17:17 -0000 @@ -186,6 +186,14 @@ #endif } +void +block_deferrable_signals(void) +{ +#ifndef LISP_FEATURE_WIN32 + thread_sigmask(SIG_BLOCK, &deferrable_sigset, 0); +#endif +} + /* * utility routines used by various signal handlers Index: src/runtime/thread.c =================================================================== RCS file: /cvsroot/sbcl/sbcl/src/runtime/thread.c,v retrieving revision 1.66 diff -u -r1.66 thread.c --- src/runtime/thread.c 18 Sep 2006 20:09:14 -0000 1.66 +++ src/runtime/thread.c 23 Sep 2006 17:17:17 -0000 @@ -224,6 +224,7 @@ { lispobj function; int result, lock_ret; + FSHOW((stderr,"/creating thread %lu\n", thread_self())); function = th->no_tls_value_marker; th->no_tls_value_marker = NO_TLS_VALUE_MARKER_WIDETAG; @@ -245,6 +246,9 @@ gc_assert(lock_ret == 0); result = funcall0(function); + + /* Block GC */ + block_blockable_signals(); th->state=STATE_DEAD; /* SIG_STOP_FOR_GC is blocked and GC might be waiting for this Index: tests/threads.impure.lisp =================================================================== RCS file: /cvsroot/sbcl/sbcl/tests/threads.impure.lisp,v retrieving revision 1.43 diff -u -r1.43 threads.impure.lisp --- tests/threads.impure.lisp 13 Sep 2006 21:37:29 -0000 1.43 +++ tests/threads.impure.lisp 23 Sep 2006 17:17:17 -0000 @@ -670,3 +670,42 @@ (wait-for-threads threads))) (format t "backtrace test done~%") + +(format t "~&starting gc deadlock test: WARNING: THIS TEST WILL HANG ON FAILURE!~%") + +(with-test (:name (:gc-deadlock)) + ;; Prior to 0.9.16.46 thread exit potentially deadlocked the + ;; GC due to *all-threads-lock* and session lock. On earlier + ;; versions and at least on one specific box this test is good enough + ;; to catch that typically well before the 1500th iteration. + (loop + with i = 0 + with n = 3000 + while (< i n) + do + (incf i) + (when (zerop (mod i 100)) + (write-char #\.) + (force-output)) + (handler-case + (if (oddp i) + (sb-thread:make-thread + (lambda () + (sleep (random 0.001))) + :name (list :sleep i)) + (sb-thread:make-thread + (lambda () + ;; KLUDGE: what we are doing here is explicit, + ;; but the same can happen because of a regular + ;; MAKE-THREAD or LIST-ALL-THREADS, and various + ;; session functions. + (sb-thread:with-mutex (sb-thread::*all-threads-lock*) + (sb-thread::with-session-lock (sb-thread::*session*) + (sb-ext:gc)))) + :name (list :gc i))) + (error (e) + (format t "~%error creating thread ~D: ~A -- backing off for retry~%" i e) + (sleep 0.1) + (incf i))))) + +(format t "~&gc deadlock test done~%") |
From: <me...@re...> - 2006-09-23 21:14:31
|
On Saturday 23 September 2006 19:20, Nikodemus Siivola wrote: > G=C3=A1bor Melis <me...@re...> writes: > > Sure, but what about (assert (thread-alive-p *current-thread*))? > > See below. > > I knew this was too nice to be true. :/ > > Attached a new patch, using the strategy you outlined, except for a > minor kludge: > > Instead of a separate finalizer thread, which we would then need to > deal with in at least GC-AND-SAVE and QUIT, simply skip finalizers > and after-gc hooks in GCs triggered by dying threads. Good kludge, but the finalizer thread is need anyway. > I'm not 100% sure why interrupts need to be disabled for > *all-threads-lock*, though -- can you expand on that? I would have > thought that blocking deferrables after grabbing the lock would be > sufficient. =46or what it is worth the patch seems good to me. But I got this wrong at= =20 least twice. As to the interrupts and *all-threads-lock* I have no idea what I had in=20 mind, but can now only think of this: a signal handler may want to=20 spawn a thread and if *all-threads-lock* is held things go awry. I'd=20 prefer some of the fundamental machinery (sessions, threads) to be=20 interrupt safe if only for the stability of the development environment=20 with C-c happy users. > Cheers, > > -- Nikodemus Schemer: "Buddha is small, clean, and > serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." |
From: Nikodemus S. <nik...@ra...> - 2006-09-24 08:42:18
|
Gábor Melis <me...@re...> writes: > Good kludge, but the finalizer thread is need anyway. I agree (finally), though since after-gc-hooks need to go there too, more like a "housekeeping thread". ;-) >> I'm not 100% sure why interrupts need to be disabled for >> *all-threads-lock*, though -- can you expand on that? I would have >> thought that blocking deferrables after grabbing the lock would be >> sufficient. > > For what it is worth the patch seems good to me. But I got this wrong at > least twice. Well, given a test-case that fails with CVS HEAD and doesn't with the patch, I'm inclinined to call is "clearly better then what we have". So again, unless objections arise, I'll commit this the last thing today or first thing on monday. > As to the interrupts and *all-threads-lock* I have no idea what I had in > mind, but can now only think of this: a signal handler may want to > spawn a thread and if *all-threads-lock* is held things go awry. I'd > prefer some of the fundamental machinery (sessions, threads) to be > interrupt safe if only for the stability of the development environment > with C-c happy users. Makes sense to me. Incidentally, being a C-c happy user myself, I'm inclined to use WITH-RECURSIVE-LOCK almost exclusively in all the interruptible parts, and have been wondering if mutexes should be normally recursive. This explanation from a member of the POSIX working group seems to speak against that: http://zaval.org/resources/library/butenhof1.html ...but because we have the debugger, things are even more hairy. Dunno what would be right and good, but worth thinking about. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." |
From: Christophe R. <cs...@ca...> - 2006-09-24 09:16:50
|
Nikodemus Siivola <nik...@ra...> writes: > Well, given a test-case that fails with CVS HEAD and doesn't with the > patch, I'm inclinined to call is "clearly better then what we have". > > So again, unless objections arise, I'll commit this the last thing > today or first thing on monday. This seems to defeat the purpose of a freeze. The counterargument is that if it's been wrong for this long, then being wrong for a little more won't hurt; on the other hand, people testing the frozen code now know that what they're testing is close to what is to be released. Is there a reason that this has to be in the upcoming release, rather than the one in a month's time? (in CVS today, rather than in CVS on Wednesday or so?) Cheers, Christophe |
From: <me...@re...> - 2006-09-24 10:49:11
|
On Sunday 24 September 2006 10:41, Nikodemus Siivola wrote: > G=C3=A1bor Melis <me...@re...> writes: > > As to the interrupts and *all-threads-lock* I have no idea what I > > had in mind, but can now only think of this: a signal handler may > > want to spawn a thread and if *all-threads-lock* is held things go > > awry. I'd prefer some of the fundamental machinery (sessions, > > threads) to be interrupt safe if only for the stability of the > > development environment with C-c happy users. > > Makes sense to me. Incidentally, being a C-c happy user myself, I'm > inclined to use WITH-RECURSIVE-LOCK almost exclusively in all the > interruptible parts, and have been wondering if mutexes should be > normally recursive. > > This explanation from a member of the POSIX working group seems to > speak against that: > > http://zaval.org/resources/library/butenhof1.html > > ...but because we have the debugger, things are even more hairy. > Dunno what would be right and good, but worth thinking about. Wow, that's long. Certainly longer than my attention span. I may have written a paragraph on interrupt safety and locks in the=20 internal manuals. The short example I keep in my head is: (defun remove-x (something) (with-mutex (*x-lock*) (setq *x* (delete something *x*))) If an interrupt hits while holding *x-lock* and calls remove-x we=20 deadlock after printing a warning about recursive lock attempt (see=20 get-mutex). One way around it is to wrap the with-mutex in without-interrupts and=20 suffer style point deduction. One could also think that with-recursive lock is better because it=20 doesn't deadlock in the above case but it doesn't produce correct=20 results: one of the things to be removed is likely still in *x*. > Cheers, Gabor > > -- Nikodemus Schemer: "Buddha is small, clean, and > serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." |
From: Nikodemus S. <nik...@ra...> - 2006-09-24 10:37:32
|
Christophe Rhodes <cs...@ca...> writes: > This seems to defeat the purpose of a freeze. The counterargument is > that if it's been wrong for this long, then being wrong for a little > more won't hurt; on the other hand, people testing the frozen code now > know that what they're testing is close to what is to be released. > > Is there a reason that this has to be in the upcoming release, rather > than the one in a month's time? (in CVS today, rather than in CVS on > Wednesday or so?) I would consider a GC deadlock critical enough to warrant it going in, especially since theoretically you need just MAKE-THREAD to trigger it. ...but objection noted. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." |
From: Nikodemus S. <nik...@ra...> - 2006-09-24 11:26:45
|
Gábor Melis <me...@re...> writes: >> http://zaval.org/resources/library/butenhof1.html > Wow, that's long. Certainly longer than my attention span. > > I may have written a paragraph on interrupt safety and locks in the > internal manuals. The short example I keep in my head is: > > (defun remove-x (something) > (with-mutex (*x-lock*) > (setq *x* (delete something *x*))) > > If an interrupt hits while holding *x-lock* and calls remove-x we > deadlock after printing a warning about recursive lock attempt (see > get-mutex). > > One way around it is to wrap the with-mutex in without-interrupts and > suffer style point deduction. > > One could also think that with-recursive lock is better because it > doesn't deadlock in the above case but it doesn't produce correct > results: one of the things to be removed is likely still in *x*. Quite. I think the main points in the article are: 1. If you think you need recursive locks your locking scheme is not well thought out. 2. Ok, sometimes you do want recursive locks, but you're still pissing away your chances of concurrency. 3. You should not be able to release a lock which you don't hold. 4. Debugging-mutexes which go an extra mile to detect funny stuff are a good idea. The way reason about these in SBCL context looks something like this: 1&2. If you are holding a mutex you should give it up before being interrupted, and since asynch unwinds are bad a non-recursive mutex should always be accompanied by without-interrupts. ...or you need a recursive lock and need to be fully reentrant. 3. RELEASE-MUTEX is nasty as it is currently written. OTOH, given an interactive environment, NUKE-MUTEX --or whatever-- can be handly. 4. Doing decent-effort deadlock detection in GET-MUTEX would not be particularly hard, just slow. Similarly, a slower mutex variant could enforce a gc-inhibit policy. Cheers, -- Nikodemus Schemer: "Buddha is small, clean, and serious." Lispnik: "Buddha is big, has hairy armpits, and laughs." |