Threadpool does not handle changing number of threads
Under certain circumstances, xsqrtst_pt hangs. When I investigated, I found that the first call to a threaded routine requires only 2 threads, so only two get created. Later, a call to ATL_stgemm requires 8 threads. The goparallel writes all chkin[2], and then expects it to flip, which it never will.
This hangs:
./xsqrtst_pt -n 1 200 -n 1 108 -S 1 u -U 1 u -a 10 -f 0
This passes:
./xsqrtst_pt -n 1 200 -n 1 107 -S 1 u -U 1 u -a 10 -f 0
This code has FULLPOLL.
If I add a dummy DoWork and a goparallel call to bin/qrtst.c:
ATL_goparallel(8, Dummy, NULL, NULL);
This gives me an initial 8 threads instead of 2. The test then passes.
I was running some comparisons between ATLAS and OpenBLAS using the OpenBLAS/goto benchmark framework, which loops over different problem sizes. It hung when I hit n=84 for sgemm. Sure enough, if I start the process at n=84, I have no problems. If I start lower than that, the first time threads are used is n=82, and it starts only 3 threads.
I think this is a common problem.
Ticket moved from /p/math-atlas/support-requests/966/
This is confirmed as a bug. It should affect any threading operation that does not use all threads on its first call (e.g., it could happen with any threaded BLAS call).
Will need to fix ATL_InitThreadPoolStartup to always spawn NTHREAD, even if only P are used. Fix needs to be looked at a little more closely though, because you need to make sure the extra threads don't call DoWork.
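The idea above can be illustrated with a minimal pthread sketch. This is not ATLAS code: every name in it (pool_start, pool_run, pool_stop, do_work, NTHREAD as a compile-time 8) is hypothetical, and it uses a mutex and condition variables rather than the chkin flags described earlier. It only shows the shape of the fix: the pool always spawns NTHREAD threads, and a thread whose rank is at or above the requested P checks in without calling the work function.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREAD 8                 /* hypothetical fixed pool size */

static pthread_t tid[NTHREAD];
static int rank_of[NTHREAD];
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t go = PTHREAD_COND_INITIALIZER;
static pthread_cond_t done = PTHREAD_COND_INITIALIZER;
static unsigned long job_id = 0;  /* bumped once per job */
static int active = 0;            /* P requested by the current job */
static int checked_in = 0;        /* threads that have finished this job */
static int worked = 0;            /* threads that actually ran do_work */
static int quit = 0;

/* Stand-in for the real per-thread work function. */
static void do_work(int rank) { (void)rank; }

static void *worker(void *vp)
{
    int rank = *(int *)vp;
    unsigned long seen = 0;       /* last job this thread handled */

    pthread_mutex_lock(&mtx);
    for (;;) {
        while (job_id == seen && !quit)
            pthread_cond_wait(&go, &mtx);
        if (quit)
            break;
        seen = job_id;
        if (rank < active) {      /* only the first P ranks do work */
            pthread_mutex_unlock(&mtx);
            do_work(rank);
            pthread_mutex_lock(&mtx);
            worked++;
        }
        /* Extra threads still check in, so the launcher never waits
           on a thread that does not exist. */
        if (++checked_in == NTHREAD)
            pthread_cond_signal(&done);
    }
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Always spawn NTHREAD threads, whatever the first job needs. */
int pool_start(void)
{
    for (int i = 0; i < NTHREAD; i++) {
        rank_of[i] = i;
        if (pthread_create(&tid[i], NULL, worker, &rank_of[i]) != 0)
            return -1;
    }
    return 0;
}

/* Run one job on P threads; returns how many threads ran do_work. */
int pool_run(int P)
{
    int n;

    assert(P >= 1 && P <= NTHREAD);
    pthread_mutex_lock(&mtx);
    active = P;
    checked_in = 0;
    worked = 0;
    job_id++;
    pthread_cond_broadcast(&go);
    while (checked_in < NTHREAD)
        pthread_cond_wait(&done, &mtx);
    n = worked;
    pthread_mutex_unlock(&mtx);
    return n;
}

void pool_stop(void)
{
    pthread_mutex_lock(&mtx);
    quit = 1;
    pthread_cond_broadcast(&go);
    pthread_mutex_unlock(&mtx);
    for (int i = 0; i < NTHREAD; i++)
        pthread_join(tid[i], NULL);
}
```

In this sketch, calling pool_start(), then pool_run(2) followed by pool_run(8), returns 2 and then 8 without hanging, because all eight threads already exist when the larger job arrives. Whether the extra threads should check in at all, or simply be ignored by the launcher, is a design choice the real fix would have to make.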
I believe this has been fixed for 3.11.31.