I've recently had the pleasure of helping Stas trace concurrency bugs
on PPC. I'm also following the semaphore-notification failures from
afar. Sadly, I lack the time to look at this deeply. I did manage to
take a second look at the code, and I find the logic for semaphores
incredibly convoluted. Waitqueues are only slightly better.
It seems to me a lot of the complexity comes from async unwind safety
(timeouts don't help either). A lock-free multi-word compare-and-swap
(MCAS) would take care of that. "Just" specify a set of location, their
expected values and their new ones. The implementation is lock-free (via
normal CAS), and thus an unwind doesn't break anything: the operation
will eventually succeed or fail, without locking any other thread out.
However, all the schemes I've seen need dynamic memory allocation, and
thus (realistically, for us) GC. A simpler option is to implement an
MCAS on top of spinlocks and without-interrupts. This also gives us more
freedom to compute the update values, but means that we have more
without-interrupt logic to check… one upside is that that logic would be
Still in terms of ease of implementation, I believe that it'll be much
more easy to develop a trustworthy spinlocky multi-word
read-modify-write macro than a lock-free MCAS.
Thoughts on rewriting our constructs in such a way, instead of wrapping
user-facing functions in without-interrupts and checking for timeouts in
all the right places? Or on this trade-off between complexity and lack
of allocation, versus robustness/lock-freedom?