parallel_for_each always deadlocks (and parallel_for sporadically deadlocks) in LLD built with Msys2 GCC 6.2 64-bit.
Knowing that the threading implementation in this GCC build is pretty buggy (see e.g. #445, which is still unresolved), I tried GCC based on the alternative threading implementation by lhmouse (20170313 build) and found that this doesn't help. Deadlocks remain.
I'm not sure if this is a mingw-w64 or gcc/libstdc++ or even LLD problem.
Unfortunately, we have nothing to compare with on Windows because the Visual C++ build of LLD uses completely different code. I believe LLD is much better tested on Linux, though, and I presume they have no problems with this implementation. This is why I've decided to file the bug here.
To further clarify the situation, I've modified Parallel.h to make Visual C++ use the same codepath as gcc. LLD built this way works flawlessly with no deadlocks.
Now we have: exactly the same code works when built with non-Windows gcc, and it also works when built with Visual C++ (I've used the Visual Studio 2017 compiler), but it doesn't work (deadlocks) when built with mingw-w64 gcc (either from the Msys2 distro or the lhmouse distro).
Ah, sorry! I was horribly wrong regarding the lhmouse build! It turned out it works just fine!
I mistested it because the wrong libwinpthread-1.dll was picked up (the one from the MSys2 build).
So both Visual C++ and lhmouse mingw-w64 threading work just fine, and only vanilla mingw-w64 threading doesn't work.
I've also isolated the failing code from the llvm codebase. The attached partest.cpp is completely self-contained.
Btw, thank you, lhmouse, for your great work!
?
Sorry, forgot it. Here it is.
Using gcc-6.3.0-x86_64 on MSYS2, your code always prints 50075028 and does not deadlock.
Ideas?
Use -O2. It should deadlock (not each and every time, but pretty often).
Compiled using: g++ -O2 partest.cpp -opartest -pthread
Executed as: for i in `seq 0 1000`; do ./partest.exe; done
No deadlocks...
Last edit: niXman 2017-03-22
It doesn't seem reproducible using the following command:
x64 Windows 7, Xeon E3-1230v3, 8 logical cores.
It doesn't deadlock when built with your gcc, but deadlocks when built with vanilla MSys2 gcc, see below, @niXman confirmed this.
Gotcha. LLD doesn't build without patches in MSYS2. Let's move the discussion elsewhere.
Hm, it deadlocks for me even if built with no optimization. Did you run it sufficiently many times? What CPU are you on? I've run it on old 4-core Yorkfield, will try 4/8 Sandy.
The same result.
Executed as: for i in `seq 0 1000`; do ./partest.exe; done
VirtualBox, Win7-x86_64, i7-6700K
I tried Sandy; the vanilla variant deadlocks both with and without optimization enabled. I have Windows 10 1607 64-bit on both machines.
How many virtual cores is the VM configured with? Anyway, I believe the culprit is VM usage.
1 =)
Sorry, I'll retest now...
I can catch a deadlock!
I'll try to investigate this...
We had a discussion about this problem on IRC and Kai said the second problem might not get fixed.
Even Microsoft's own condition variable suffers from this problem.
The following code effectively results in a crash on my Windows 7:
I have run into this issue just recently, so it is still a problem!
I've done some investigation and it appears to be related to the use of std::condition_variable in "lib/Support/Parallel.cpp" of LLVM. In this file, there is a use of notify_one() which is called without the lock for the mutex used in the associated wait(). This is valid code according to the specification of std::condition_variable.
However, the default MinGW libstdc++ runtime is built using POSIX threading, and thus the std::condition_variable implementation makes use of the MinGW implementation of Winpthread. notify_one() uses pthread_cond_signal(), and it seems that this is where the deadlock is occurring. Whilst investigating, I stumbled across quite an old deadlock bug related to this implementation of the pthread condition variable; the bug was closed as incorrect usage, i.e. it seems the implementation is only guaranteed to work if the "wait" lock is held when pthread_cond_signal() is called. I'm not sure that this is even a requirement of pthread_cond_signal(), but it certainly makes it incompatible with the implementation of std::condition_variable.
I believe that this can be worked around by just removing the "unlock" before the notify_one() in the LLVM source code. Or perhaps MinGW libstdc++ can be built using native threading, if that is supported. The best fix would be to fix the MinGW pthread implementation.
Cheers,
Andrew
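To make the two notification patterns discussed above concrete, here is a minimal sketch. It is not the LLVM code: WorkQueue, runTasks and the notifyUnderLock flag are hypothetical names invented for illustration, and the worker count of 4 is arbitrary. With the flag off, push() unlocks the mutex before calling notify_one(), which the C++ standard permits; with the flag on, it signals while still holding the mutex, which is the workaround suggested above.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical work queue for illustration only (not LLVM's executor).
class WorkQueue {
public:
  explicit WorkQueue(bool notifyUnderLock)
      : notifyUnderLock_(notifyUnderLock) {}

  void push(std::function<void()> task) {
    std::unique_lock<std::mutex> lock(mutex_);
    tasks_.push(std::move(task));
    if (notifyUnderLock_) {
      cond_.notify_one(); // workaround: signal while the mutex is held
    } else {
      lock.unlock();
      cond_.notify_one(); // also valid C++: signal after unlocking
    }
  }

  // Ask workers to exit once the queue has been drained.
  void stop() {
    std::lock_guard<std::mutex> lock(mutex_);
    stopping_ = true;
    cond_.notify_all();
  }

  // Worker loop: run tasks until stop() was called and the queue is empty.
  void work() {
    for (;;) {
      std::unique_lock<std::mutex> lock(mutex_);
      cond_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
      if (tasks_.empty())
        return; // woken by stop() with nothing left to do
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop();
      lock.unlock(); // run the task without holding the mutex
      task();
    }
  }

private:
  bool notifyUnderLock_;
  bool stopping_ = false;
  std::mutex mutex_;
  std::condition_variable cond_;
  std::queue<std::function<void()>> tasks_;
};

// Pushes `count` increment tasks through the queue and returns the total.
int runTasks(bool notifyUnderLock, int count) {
  WorkQueue queue(notifyUnderLock);
  std::atomic<int> total{0};
  std::vector<std::thread> workers;
  for (int i = 0; i < 4; ++i)
    workers.emplace_back([&queue] { queue.work(); });
  for (int i = 0; i < count; ++i)
    queue.push([&total] { ++total; });
  queue.stop();
  for (std::thread &w : workers)
    w.join();
  return total.load();
}
```

On a conforming implementation both runTasks(false, n) and runTasks(true, n) should return n. If the analysis above is right, only the notifyUnderLock = false variant would be at risk of hanging with the affected winpthreads build; that is an inference from this thread, not something this sketch proves.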
Yes it has been reported here but apparently nobody has been working on it. Sorry for that.
Well as no one has looked into this, I've taken a look and I think I have a fix and some minor improvements too. I think I'll need to test it a bit longer, but so far so good.
It certainly does not deadlock in the various test cases and mostly fixes LLD. However, there are still LLD deadlocks when running the LLD lit tests and I believe that these are actually related to LLD exit causing deadlock (something else that I've had the misfortune of looking at).
If you have any ideas please send a mail to mingw-w64-public@lists.sourceforge.net. We together make the software better. Thanks in advance.