Menu

#774 winpthreads can deadlock if pthread_cond_signal is called outside of the mutex lock

v1.0 (example)
closed-fixed
nobody
None
5
2022-12-29
2018-11-30
No

If calling pthread_cond_signal from outside of the lock held by the waiter when waiting, pthread_cond_signal can deadlock.

This happens often with the thread pool implementation within LLVM, if LLVM is built with libstdc++ backed by winpthreads.

This can easily be observed with the attached C++11 test example (with pretty minimal use of C++ features though, if a plain C example is needed I can make one).

The code as is does this:

void ThreadPool::async(void (*Task)(void)) {
  // Lock the queue and push the new task
  pthread_mutex_lock(&QueueLock);

  Tasks.push(Task);
  pthread_mutex_unlock(&QueueLock);
  pthread_cond_signal(&QueueCondition);
}

If the pthread_cond_signal function call is moved before pthread_mutex_unlock, the code runs fine.

When hung, the function calling pthread_cond_signal is stuck in this function call:

  EnterCriticalSection (&_c->waiters_count_lock_);
...
else if (_c->waiters_count_ > _c->waiters_count_gone_)
    {
      r = do_sema_b_wait (_c->sema_b, 1, INFINITE,&_c->waiters_b_lock_,&_c->value_b);

While another waiting thread is stuck at:

  r = do_sema_b_wait (_c->sema_b, 0, INFINITE,&_c->waiters_b_lock_,&_c->value_b);
  if (r != 0)
    return r;
  EnterCriticalSection (&_c->waiters_count_lock_);
  _c->waiters_count_++;
1 Attachments

Discussion

  • LIU Hao

    LIU Hao - 2019-04-25

    The statement _c->waiters_count_++; inside pthread_cond_timedwait_impl() in 'cond.c' is not protected by a critical section, unlike a similar one in pthread_cond_wait(). I am not sure about the original issue, but this is likely the reason.

     

    Last edit: LIU Hao 2019-04-25
  • LIU Hao

    LIU Hao - 2019-04-25

    No it didn't fix the deadlock. This patch seems to have worked it out, but I just couldn't understand it.

     
  • Johannes Wilde

    Johannes Wilde - 2022-08-08

    I recently encountered the same problem - however not when using pthread explicitely but the MinGW64 std::condition_variable. A somewhat minimal example is attached.

    If there is anything I can do so a fix gets implemented - please let me know.

    Other than that I would like to emphasize the criticality once more: This breaks std-library components!

     
  • LIU Hao

    LIU Hao - 2022-08-08

    If the standard threading library is essential for you, I really really suggest you have a look at https://github.com/lhmouse/mcfgthread; prebuilt binaries are available at https://gcc-mcf.lhmouse.com/.

     
  • Johannes Wilde

    Johannes Wilde - 2022-12-28

    A small update here on my part: This christmas holiday I tried to set out and understand the source of my problem. However what I found instead:
    - Using the Qt-provided MinGW 11.2.0 64-bit I was not able to reproduce the problems reported here [i.e. using ThreadPool-pthread.cpp and conditionVariableTest.cpp].
    - Using the Qt-provided MinGW 8.1.0 64-bit both programs still fail.

    According to https://wiki.qt.io/MinGW the respective versions are:
    - 11.2.0: mingw-builds x86_64-11.2.0-release-posix-seh-rt_v9-rev1
    - 8.1.0: mingw-builds x86_64-8.1.0-release-posix-seh-rt_v6-rev0

    Can anyone explain to me, how I can backtrace to the respective source commit?

    And as a side-node: How does this numbering correlate to anything in this git-repo? tags are different [v8.0.1 exists - but v10.0.0 seems to be the latest....]?

    I would like to find out what fixed above problems...

    But regardless: Great it was fixed! As for me, this issue seems closeable.

     
  • LIU Hao

    LIU Hao - 2022-12-29

    You can have a look at the history of mingw-w64-libraries/winpthreads. There haven't been many commits recently.

     
  • Martin Storsjö

    Martin Storsjö - 2022-12-29

    I believe b2302cd60b94a6d68a0e18b3f886b82f4b7f22b7 and/or 330025c54b85512d54b6960fad07498365c8fee3 fixed this issue.

    Regarding the build versions - those GCC based toolchain packages that you've got, are essentially a packaged combination of individual versions of GCC, binutils and mingw-w64. Those release packages seem to use the version of GCC as the public version number. It may be possible to track down what version of binutils and mingw-w64 they were built with from the version control system where it was built, or then it may have used whichever version of mingw-w64 was the latest from git master when the package was built.

     
  • Martin Storsjö

    Martin Storsjö - 2022-12-29
    • status: open --> closed-fixed
     
  • Martin Storsjö

    Martin Storsjö - 2022-12-29

    Marking the bug as fixed.

     

Log in to post a comment.