If pthread_cond_signal is called without holding the lock that the waiter held when it started waiting, pthread_cond_signal can deadlock.
This happens often with the thread pool implementation within LLVM, if LLVM is built with libstdc++ backed by winpthreads.
This can easily be observed with the attached C++11 test example (with pretty minimal use of C++ features though, if a plain C example is needed I can make one).
The code as is does this:

```c
void ThreadPool::async(void (*Task)(void)) {
  // Lock the queue and push the new task
  pthread_mutex_lock(&QueueLock);
  Tasks.push(Task);
  pthread_mutex_unlock(&QueueLock);
  pthread_cond_signal(&QueueCondition);
}
```
If the pthread_cond_signal call is moved before pthread_mutex_unlock, the code runs fine.
When hung, the thread calling pthread_cond_signal is stuck in this function call:

```c
EnterCriticalSection (&_c->waiters_count_lock_);
...
else if (_c->waiters_count_ > _c->waiters_count_gone_) {
  r = do_sema_b_wait (_c->sema_b, 1, INFINITE, &_c->waiters_b_lock_, &_c->value_b);
```
While another waiting thread is stuck at:

```c
r = do_sema_b_wait (_c->sema_b, 0, INFINITE, &_c->waiters_b_lock_, &_c->value_b);
if (r != 0)
  return r;
EnterCriticalSection (&_c->waiters_count_lock_);
_c->waiters_count_++;
```
The statement "_c->waiters_count_++;" inside pthread_cond_timedwait_impl() in 'cond.c' is not protected by a critical section, unlike the similar one in pthread_cond_wait(). I am not sure about the original issue, but this is likely the reason.

Last edit: LIU Hao 2019-04-25
No, it didn't fix the deadlock. This patch seems to work around it, but I just couldn't understand it.
I recently encountered the same problem, however not when using pthreads explicitly but the MinGW64 std::condition_variable. A somewhat minimal example is attached. If there is anything I can do so that a fix gets implemented, please let me know.
Other than that, I would like to emphasize the criticality once more: this breaks std-library components!

If the standard threading library is essential for you, I really suggest you have a look at https://github.com/lhmouse/mcfgthread; prebuilt binaries are available at https://gcc-mcf.lhmouse.com/.
A small update on my part: over the Christmas holiday I set out to understand the source of my problem. What I found instead:
- Using the Qt-provided MinGW 11.2.0 64-bit I was not able to reproduce the problems reported here [i.e. using ThreadPool-pthread.cpp and conditionVariableTest.cpp].
- Using the Qt-provided MinGW 8.1.0 64-bit both programs still fail.
According to https://wiki.qt.io/MinGW the respective versions are:
- 11.2.0: mingw-builds x86_64-11.2.0-release-posix-seh-rt_v9-rev1
- 8.1.0: mingw-builds x86_64-8.1.0-release-posix-seh-rt_v6-rev0
Can anyone explain to me how I can trace these versions back to the respective source commits?
And as a side note: how does this numbering correlate to anything in this git repo? The tags are different [v8.0.1 exists - but v10.0.0 seems to be the latest...].
I would like to find out what fixed the above problems.
But regardless: great that it was fixed! As far as I am concerned, this issue can be closed.
You can have a look at the history of mingw-w64-libraries/winpthreads. There haven't been many commits recently. I believe b2302cd60b94a6d68a0e18b3f886b82f4b7f22b7 and/or 330025c54b85512d54b6960fad07498365c8fee3 fixed this issue.
Regarding the build versions: those GCC-based toolchain packages you've got are essentially a packaged combination of individual versions of GCC, binutils and mingw-w64, and the release packages seem to use the GCC version as the public version number. It may be possible to track down which versions of binutils and mingw-w64 they were built with from the version control system where the package was built, or they may simply have used whatever was on mingw-w64 git master at build time.
Marking the bug as fixed.