If pthread_cond_signal is called without holding the lock that the waiter held when it started waiting, pthread_cond_signal can deadlock.
This happens often with the thread pool implementation within LLVM, if LLVM is built with libstdc++ backed by winpthreads.
This can easily be observed with the attached C++11 test example (with pretty minimal use of C++ features though, if a plain C example is needed I can make one).
The code as is does this:

```c
void ThreadPool::async(void (*Task)(void)) {
  // Lock the queue and push the new task
  pthread_mutex_lock(&QueueLock);
  Tasks.push(Task);
  pthread_mutex_unlock(&QueueLock);
  pthread_cond_signal(&QueueCondition);
}
```
If the pthread_cond_signal call is moved before pthread_mutex_unlock, the code runs fine.
When hung, the thread calling pthread_cond_signal is stuck in this function call:

```c
EnterCriticalSection (&_c->waiters_count_lock_);
...
else if (_c->waiters_count_ > _c->waiters_count_gone_) {
  r = do_sema_b_wait (_c->sema_b, 1, INFINITE, &_c->waiters_b_lock_, &_c->value_b);
```
While another waiting thread is stuck at:

```c
r = do_sema_b_wait (_c->sema_b, 0, INFINITE, &_c->waiters_b_lock_, &_c->value_b);
if (r != 0)
  return r;
EnterCriticalSection (&_c->waiters_count_lock_);
_c->waiters_count_++;
```
The statement "_c->waiters_count_++;" inside pthread_cond_timedwait_impl() in 'cond.c' is not protected by a critical section, unlike the similar one in pthread_cond_wait(). I am not sure about the original issue, but this is likely the reason.

Last edit: LIU Hao 2019-04-25
No, it didn't fix the deadlock. This patch seems to work around it, but I just couldn't understand it.
I recently encountered the same problem, however not when using pthreads explicitly but the MinGW64 std::condition_variable. A somewhat minimal example is attached. If there is anything I can do so that a fix gets implemented, please let me know.
Other than that, I would like to emphasize the criticality once more: this breaks std-library components!

If the standard threading library is essential for you, I really suggest you have a look at https://github.com/lhmouse/mcfgthread; prebuilt binaries are available at https://gcc-mcf.lhmouse.com/.
A small update on my part: over the Christmas holiday I set out to understand the source of my problem. What I found instead:
- Using the Qt-provided MinGW 11.2.0 64-bit I was not able to reproduce the problems reported here [i.e. using ThreadPool-pthread.cpp and conditionVariableTest.cpp].
- Using the Qt-provided MinGW 8.1.0 64-bit both programs still fail.
According to https://wiki.qt.io/MinGW the respective versions are:
- 11.2.0: mingw-builds x86_64-11.2.0-release-posix-seh-rt_v9-rev1
- 8.1.0: mingw-builds x86_64-8.1.0-release-posix-seh-rt_v6-rev0
Can anyone explain to me how I can trace these versions back to the respective source commits?
And as a side note: how does this numbering correlate to anything in this git repo? The tags are different [v8.0.1 exists - but v10.0.0 seems to be the latest...].
I would like to find out what fixed the above problems.
But regardless: great that it was fixed! As far as I am concerned, this issue can be closed.
You can have a look at the history of mingw-w64-libraries/winpthreads. There haven't been many commits recently. I believe b2302cd60b94a6d68a0e18b3f886b82f4b7f22b7 and/or 330025c54b85512d54b6960fad07498365c8fee3 fixed this issue.
Regarding the build versions: those GCC-based toolchain packages you've got are essentially a packaged combination of individual versions of GCC, binutils and mingw-w64, and the release packages seem to use the GCC version as the public version number. It may be possible to track down which versions of binutils and mingw-w64 they were built with from the version control system where the package was built, or they may simply have used whatever was on mingw-w64 git master at build time.
Marking the bug as fixed.