|
From: David F. <fa...@kd...> - 2012-11-06 12:41:26
|
On Monday 05 November 2012 23:19:42 Philippe Waroquiers wrote: > On Mon, 2012-11-05 at 18:59 +0100, David Faure wrote: > > The testcase http://www.davidfaure.fr/2012/qmutex_trylock.cpp > > (from https://bugs.kde.org/show_bug.cgi?id=243232) > > shows that an optimization inside Qt leads to a helgrind warning about > > wrong lock ordering, making the use of that feature impossible for > > detecting actual problems elsewhere (i.e. I have to use > > --track-lockorders=no all the time). > > > > Technically if we ignore the distinction between lock and tryLock, > > helgrind is right, we did lock the mutexes in reverse order. But because > > it's a tryLock, it couldn't possibly deadlock. > > > > Should helgrind simply ignore all pthread_mutex_trylock calls, for the > > lockorder detection, even if they succeed? I think so, actually (by > > definition they couldn't deadlock, which is what track-lockorders is all > > about). > A deadlock can appear if there is a mixture of trylock and lock on > the same lock. So, trylock cannot just be ignored. > E.g. > Thread 1: trylock mutex1 > lock mutex2 > > Thread 2: lock mutex2 > lock mutex1 > might deadlock. True. This means that only "trylock in second place" should be ignored. More on this below. > Even without mixture, isn't the below slightly bizarre/dangerous ? > Thread 1: trylock mutex1 > trylock mutex2 > > Thread 2: trylock mutex2 > trylock mutex1 No deadlock can ever happen here. > If the 2nd trylock fails, what is the "plan B" ? If the program then accesses shared data, a race condition will happen and will be detected by helgrind anyway. So ignoring the ordering of these trylocks is ok, I would think. Of course helgrind must record "we got the lock", for the race condition detection feature, but it shouldn't warn about the wrong order of the locking, since it can't possibly deadlock. Not that the above would be good programming practice, of course, but helgrind can't say anything about it if all the locks were acquired. It will warn in another run, where some trylock fails, and a race ensues. > It seems that a task must unlock all locks and restart > from "scratch" in the above case. Yes this is exactly that QOrderedMutexLocker does. Thread 1: lock mutex1 lock mutex2 Thread 2: lock mutex2 trylock mutex1 if that failed, unlock mutex2 lock mutex1 lock mutex2 If the trylock fails (because thread1 was first), then it unlocks and restarts from scratch. I can't see a deadlock risk with that, so ideally helgrind shouldn't warn. > I guess we might need an option such as: > --trylock-logic={normal-lock|local-retry|full-retry} > normal-lock = current behaviour > local-retry means the task would re-trylock > full-retry means that the plan B is to unlock all locks > and retry everything. I don't see how this can be a global option. Some piece of code (like QOrderedMutexLocker) might have the "full retry" logic above, but other pieces of code might do something different - e.g. something wrong. It doesn't make sense to me to tell helgrind "this is what all the code paths are going to do about tryLock", that's impossible to predict in a complex program. Let me present another case: Thread 1: lock mutex1 lock mutex2 Thread 2: lock mutex2 trylock mutex1 if that fails, unlock mutex2 and give up This could happen for a non-critical operation that can be canceled if it can't be done immediately. Again, no deadlock possible, so helgrind shouldn't warn about a successful trylock being "out of order". And yet this isn't a "full retry", so I don't think --trylock-logic=full-retry is the solution. Deadlock can only happen if both threads use normal locking as their second operation. A trylock as the second operation doesn't deadlock. > Or maybe we would need the same info but with an annotation > of the lock operation and/or of the lock itself ? Sounds like an annotation of the trylock operation, unless we agree on my next statement: > I am quite sure the simple logic "trylock can be ignored" > is not ok for all cases. Right. Let me refine that: "trylock can be ignored as the second operation, i.e. helgrind shouldn't issue the out-of-order-locking-warning at the precise moment it's seeing a successful out-of-order trylock." If we can agree on that, then there's no need for an annotation in the code. > > PS: see my previous email about VALGRIND_HG_ENABLE_CHECKING not working. > > Is this known? Should I report a bug? > > Always good to report a bug if you have found a bug :). > Mail will be forgotten, bug entries are less likely to be lost. OK, will do. On the other hand I'm getting more reaction about the tryLock issue here than on bug 243232 :-). Anyway, thanks for your input, much appreciated. -- David Faure, fa...@kd..., http://www.davidfaure.fr Working on KDE, in particular KDE Frameworks 5 |