On Sun, Mar 18, 2012 at 11:19 PM, Matthew Mondor <firstname.lastname@example.org> wrote:
> When I had done the initial benchmarking, what seemed to be the problem
> was that a thread waiting for a lock wouldn't wake up as fast as
> previously when the lock became available. The wakeup mechanism
> probably helps but it's plausible that threads waiting on a lock still
> don't wake up as fast as with the pthreads implementation, which would
> likely explain the difference.

I have changed the implementation so that it now uses a FIFO queue. The times seem to improve a lot for short tests, reaching 731 connections/s, or 1.4 ms per connection on average. For longer tests performance degrades a bit and the average rises to 1.9 ms per connection, which I attribute to consing.
The queue is protected by a spinlock. I believe that the FIFO ordering, plus the fact that the waiters spin with waiting times of at most 0.1 s, should provide enough of a balance to keep it from being too unfair. But to be honest, I have not done any research on how to make this theoretically sound.
As potential improvements I see:
* Changing the queue so that it does not cons (perhaps with a "next" field in the process object itself)
* The queue follows a multiple-producers, one-consumer pattern, meaning that it can be implemented without a lock (just CAS).
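
To make the two ideas above concrete, here is a rough sketch in C11 of what such a queue could look like. It is not the current implementation: the names `process`, `wait_queue`, `wait_queue_push` and `wait_queue_pop` are invented for illustration, and it assumes that only one thread at a time (presumably the one handing over the lock) ever dequeues. Producers push with a CAS loop onto a list linked through a "next" field in the process object, so nothing is allocated; the consumer grabs the whole list in one atomic exchange and reverses it to recover FIFO order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime's process object. */
typedef struct process {
    int id;
    struct process *next;   /* intrusive link: enqueueing allocates nothing */
} process;

typedef struct {
    _Atomic(process *) incoming;  /* shared: producers push here, newest first */
    process *ready;               /* private to the consumer, oldest first */
} wait_queue;

static void wait_queue_init(wait_queue *q) {
    atomic_init(&q->incoming, NULL);
    q->ready = NULL;
}

/* May be called concurrently from any number of threads. */
static void wait_queue_push(wait_queue *q, process *p) {
    assert(p != NULL);
    process *head = atomic_load_explicit(&q->incoming, memory_order_relaxed);
    do {
        /* On CAS failure, head has been reloaded with the current value. */
        p->next = head;
    } while (!atomic_compare_exchange_weak_explicit(
                 &q->incoming, &head, p,
                 memory_order_release, memory_order_relaxed));
}

/* Must only be called by the single consumer. Returns NULL when empty. */
static process *wait_queue_pop(wait_queue *q) {
    if (q->ready == NULL) {
        /* Take everything pushed so far in one step. */
        process *batch = atomic_exchange_explicit(&q->incoming, NULL,
                                                  memory_order_acquire);
        /* The batch is newest-first; reverse it to get arrival order. */
        while (batch != NULL) {
            process *next = batch->next;
            batch->next = q->ready;
            q->ready = batch;
            batch = next;
        }
    }
    process *p = q->ready;
    if (p != NULL) {
        q->ready = p->next;
        p->next = NULL;
    }
    return p;
}
```

Because the consumer drains its private list completely before taking a new batch, waiters leave in the order in which their pushes succeeded. The price is that a process can sit in only one such queue at a time, since the link lives in the process object itself.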