From: Paul K. <pv...@pv...> - 2013-08-28 19:56:54
|
Hi all, I've recently had the pleasure of helping Stas trace concurrency bugs on PPC. I'm also following the semaphore-notification failures from afar. Sadly, I lack the time to look at this deeply. I did manage to take a second look at the code, and I find the logic for semaphores incredibly convoluted. Waitqueues are only slightly better. It seems to me a lot of the complexity comes from async unwind safety (timeouts don't help either). A lock-free multi-word compare-and-swap (MCAS) would take care of that. "Just" specify a set of location, their expected values and their new ones. The implementation is lock-free (via normal CAS), and thus an unwind doesn't break anything: the operation will eventually succeed or fail, without locking any other thread out. However, all the schemes I've seen need dynamic memory allocation, and thus (realistically, for us) GC. A simpler option is to implement an MCAS on top of spinlocks and without-interrupts. This also gives us more freedom to compute the update values, but means that we have more without-interrupt logic to check… one upside is that that logic would be centralised. Still in terms of ease of implementation, I believe that it'll be much more easy to develop a trustworthy spinlocky multi-word read-modify-write macro than a lock-free MCAS. Thoughts on rewriting our constructs in such a way, instead of wrapping user-facing functions in without-interrupts and checking for timeouts in all the right places? Or on this trade-off between complexity and lack of allocation, versus robustness/lock-freedom? Paul Khuong |
From: Jan M. <jmo...@te...> - 2013-08-31 15:46:00
|
On Wed, 2013-08-28 at 15:56 -0400, Paul Khuong wrote: > I've recently had the pleasure of helping Stas trace concurrency bugs > on PPC. I'm also following the semaphore-notification failures from > afar. Sadly, I lack the time to look at this deeply. I did manage to > take a second look at the code, and I find the logic for semaphores > incredibly convoluted. Waitqueues are only slightly better. Having worked on the semaphore-notification problem a bit, I can definitely confirm that. > Thoughts on rewriting our constructs in such a way, instead of wrapping > user-facing functions in without-interrupts and checking for timeouts in > all the right places? Or on this trade-off between complexity and lack > of allocation, versus robustness/lock-freedom? I don't feel competent to comment on the actual subject, but since nobody else had anything to say, I would like to suggest adding this proposal to "Project Ideas" list on sbcl.org. That way, the idea will not be forgotten may even be picked up by someone looking for an SBCL-related project. Kind regards, Jan |