14:16:01 <jsnell> so I didn't do a release over a weekend since
thread.pure.lisp has started failing on x86-64 linux in different ways
on two different machines, and didn't manage to debug what was going
14:17:51 <jsnell> though it seemed to be some kind of leakage between
tests (commenting out a totally unrelated and passing test made it
14:29:23 <|3b|> which test is failing?
14:30:26 <jsnell> :wait-on-semaphore :timeout :many-threads
14:31:12 <jsnell> and commenting out symbol-value-in-thread.3 (or
reducing the number of threads it creates from 15000 to 150) make it
14:32:04 <|3b|> any resource limits that might be affecting it?
14:34:32 <jsnell> don't think so, it's not crashing. it's just failing
to wake some of the threads. (and symbol-value-in-thread.3 doesn't
make 15000 concurrent threads, it joins each thread before creating
the next one)
I haven't been able to replicate the failure yet, but both tests use
RANDOM, which I'm guessing is the source of the leak -- but the
semaphore test really doesn't look robust at all.
I don't quite understand the "failing to wake" bit, though. Are you sure
that some threads aren't just giving up on the semaphore before the
  (loop repeat 5
        do (signal-semaphore sem 2))
bit runs? I would expect that extending the timeout to, say, 1 second
should make the test pass reliably.
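
To make the suspected race concrete, here is a minimal sketch. It is not
the actual test from thread.pure.lisp; COUNT-WOKEN and its keyword
arguments are hypothetical names invented for illustration, and it
assumes an SBCL build with thread support. Each waiter gives up after
TIMEOUT seconds, so if the signalling loop starts later than that, the
waiters have already returned NIL and look as if they were never woken.

```lisp
;; Sketch only: assumes SBCL with sb-thread. COUNT-WOKEN is a
;; hypothetical helper, not part of the SBCL test suite.
(defun count-woken (&key (threads 10) (timeout 0.1) (delay 0.0))
  "Start THREADS waiters, sleep DELAY seconds, then signal the semaphore
5 times by 2. Return how many waiters got the semaphore before timing out."
  (let* ((sem (sb-thread:make-semaphore))
         (waiters (loop repeat threads
                        collect (sb-thread:make-thread
                                 (lambda ()
                                   ;; A NIL result means the wait timed out.
                                   (sb-thread:wait-on-semaphore
                                    sem :timeout timeout))))))
    ;; Simulate the signalling thread being scheduled late.
    (sleep delay)
    (loop repeat 5
          do (sb-thread:signal-semaphore sem 2))
    ;; JOIN-THREAD returns each waiter's result; count the non-NIL ones.
    (count-if #'identity (mapcar #'sb-thread:join-thread waiters))))
```

With a generous timeout, e.g. (count-woken :timeout 10 :delay 0.05), all
10 waiters should get the semaphore. With a timeout shorter than the
delay, e.g. (count-woken :timeout 0.05 :delay 1.0), I would expect none
of them to, even though nothing is wrong with the wakeup itself.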