From: <sv...@va...> - 2013-01-30 23:21:47
philippe 2013-01-30 23:21:34 +0000 (Wed, 30 Jan 2013)
New Revision: 13281
Log:
Better handle and better document the case of multi-lock cycles.
When a lock order violation is detected in a cycle of more than two
locks, the current code cannot produce the set of locks and the
stack traces that established the required order.
However, it can still produce the stack traces of the new lock
and of the other lock between which the cycle was discovered.
Also, add a comment in the code clarifying why the set of locks
establishing the required order cannot (currently) be produced.
Modified files:
trunk/helgrind/docs/hg-manual.xml
trunk/helgrind/hg_main.c
trunk/helgrind/tests/tc14_laog_dinphils.stderr.exp
Modified: trunk/helgrind/docs/hg-manual.xml (+17 -12)
===================================================================
--- trunk/helgrind/docs/hg-manual.xml 2013-01-30 23:18:11 +00:00 (rev 13280)
+++ trunk/helgrind/docs/hg-manual.xml 2013-01-30 23:21:34 +00:00 (rev 13281)
@@ -240,26 +240,31 @@
<para>When there are more than two locks in the cycle, the error is
equally serious. However, at present Helgrind does not show the locks
-involved, sometimes because it that information is not available, but
-also so as to avoid flooding you with information. For example, here
-is an example involving a cycle of five locks from a naive
-implementation the famous Dining Philosophers problem
+involved, sometimes because that information is not available, but
+also so as to avoid flooding you with information. For example, a
+naive implementation of the famous Dining Philosophers problem
+involves a cycle of five locks
(see <computeroutput>helgrind/tests/tc14_laog_dinphils.c</computeroutput>).
In this case Helgrind has detected that all 5 philosophers could
simultaneously pick up their left fork and then deadlock whilst
waiting to pick up their right forks.</para>
<programlisting><![CDATA[
-Thread #6: lock order "0x6010C0 before 0x601160" violated
+Thread #6: lock order "0x80499A0 before 0x8049A00" violated
-Observed (incorrect) order is: acquisition of lock at 0x601160
- (stack unavailable)
+Observed (incorrect) order is: acquisition of lock at 0x8049A00
+ at 0x40085BC: pthread_mutex_lock (hg_intercepts.c:495)
+ by 0x80485B4: dine (tc14_laog_dinphils.c:18)
+ by 0x400BDA4: mythread_wrapper (hg_intercepts.c:219)
+ by 0x39B924: start_thread (pthread_create.c:297)
+ by 0x2F107D: clone (clone.S:130)
- followed by a later acquisition of lock at 0x6010C0
- at 0x4C2BC62: pthread_mutex_lock (hg_intercepts.c:494)
- by 0x4007DE: dine (tc14_laog_dinphils.c:19)
- by 0x4C2CBE7: mythread_wrapper (hg_intercepts.c:219)
- by 0x4E369C9: start_thread (pthread_create.c:300)
+ followed by a later acquisition of lock at 0x80499A0
+ at 0x40085BC: pthread_mutex_lock (hg_intercepts.c:495)
+ by 0x80485CD: dine (tc14_laog_dinphils.c:19)
+ by 0x400BDA4: mythread_wrapper (hg_intercepts.c:219)
+ by 0x39B924: start_thread (pthread_create.c:297)
+ by 0x2F107D: clone (clone.S:130)
]]></programlisting>
</sect1>
Modified: trunk/helgrind/tests/tc14_laog_dinphils.stderr.exp (+4 -1)
===================================================================
--- trunk/helgrind/tests/tc14_laog_dinphils.stderr.exp 2013-01-30 23:18:11 +00:00 (rev 13280)
+++ trunk/helgrind/tests/tc14_laog_dinphils.stderr.exp 2013-01-30 23:21:34 +00:00 (rev 13281)
@@ -12,7 +12,10 @@
Thread #x: lock order "0x........ before 0x........" violated
Observed (incorrect) order is: acquisition of lock at 0x........
- (stack unavailable)
+ at 0x........: pthread_mutex_lock (hg_intercepts.c:...)
+ by 0x........: dine (tc14_laog_dinphils.c:18)
+ by 0x........: mythread_wrapper (hg_intercepts.c:...)
+ ...
followed by a later acquisition of lock at 0x........
at 0x........: pthread_mutex_lock (hg_intercepts.c:...)
Modified: trunk/helgrind/hg_main.c (+46 -1)
===================================================================
--- trunk/helgrind/hg_main.c 2013-01-30 23:18:11 +00:00 (rev 13280)
+++ trunk/helgrind/hg_main.c 2013-01-30 23:21:34 +00:00 (rev 13281)
@@ -3721,9 +3721,54 @@
found->src_ec, found->dst_ec, other->acquired_at );
} else {
/* Hmm. This can't happen (can it?) */
+ /* Yes, it can happen: see tests/tc14_laog_dinphils.
+ Imagine we have 3 philosophers A B C, and the forks
+ between them:
+
+ C
+
+ fCA fBC
+
+ A fAB B
+
+ Let's have the following actions:
+ A takes fCA,fAB
+ A releases fCA,fAB
+ B takes fAB,fBC
+ B releases fAB,fBC
+ C takes fBC,fCA
+ C releases fBC,fCA
+
+ Helgrind will report a lock order error when C takes fCA.
+ Effectively, we have a deadlock if the following
+ sequence is done:
+ A takes fCA
+ B takes fAB
+ C takes fBC
+
+ The error reported is:
+ Observed (incorrect) order fBC followed by fCA
+ but the stack traces that have established the required order
+ are not given.
+
+ This is because there is no pair (fCA, fBC) in laog exposition :
+ the laog_exposition records all pairs of locks between a new lock
+ taken by a thread and all the already taken locks.
+ So, there is no laog_exposition (fCA, fBC) as no thread ever
+ first locked fCA followed by fBC.
+
+ In other words, when the deadlock cycle involves more than
+ two locks, then helgrind does not report the sequence of
+ operations that created the cycle.
+
+ However, we can report the current stack trace (where
+ lk is being taken), and the stack trace where other was acquired:
+ Effectively, the variable 'other' contains a lock currently
+ held by this thread, with its 'acquired_at'. */
+
HG_(record_error_LockOrder)(
thr, lk->guestaddr, other->guestaddr,
- NULL, NULL, NULL );
+ NULL, NULL, other->acquired_at );
}
}
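
The three-philosopher scenario described in the new hg_main.c comment can be replayed with a small standalone model. This is only a sketch, not Helgrind code: the edge matrix, acquire(), reachable() and the Verdict enum are hypothetical names, and the model assumes each thread holds exactly one lock when it takes the next. It shows why a cycle is found even though no direct (fCA, fBC) pair was ever recorded.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* The three forks from the comment; names are illustrative only. */
enum { F_CA, F_AB, F_BC, N_LOCKS };

/* edge[a][b] is true when some thread took b while holding a, i.e. the
   order "a before b" was directly observed. In this toy model the same
   matrix stands in for both the order graph and the recorded pairs. */
static bool edge[N_LOCKS][N_LOCKS];

/* Depth-first search: is there a chain of observed orders from -> to? */
static bool reachable(int from, int to, bool seen[N_LOCKS])
{
    assert(from >= 0 && from < N_LOCKS && to >= 0 && to < N_LOCKS);
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < N_LOCKS; next++)
        if (edge[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

typedef enum {
    ORDER_OK,            /* no conflict, new order recorded            */
    VIOLATION_WITH_PAIR, /* cycle found, direct pair (lk, held) exists */
    VIOLATION_NO_PAIR    /* cycle found only through a longer chain    */
} Verdict;

/* A thread that already holds 'held' now acquires 'lk'. */
Verdict acquire(int held, int lk)
{
    bool seen[N_LOCKS] = { false };

    /* If "lk before held" is already implied, taking lk now closes a cycle. */
    if (reachable(lk, held, seen))
        return edge[lk][held] ? VIOLATION_WITH_PAIR : VIOLATION_NO_PAIR;

    edge[held][lk] = true;
    return ORDER_OK;
}

void reset_graph(void)
{
    memset(edge, 0, sizeof edge);
}

/* Replays: A takes fCA,fAB; B takes fAB,fBC; C takes fBC,fCA.
   Releases are omitted because they do not change the recorded orders. */
Verdict replay_three_philosophers(void)
{
    reset_graph();
    acquire(F_CA, F_AB);
    acquire(F_AB, F_BC);
    return acquire(F_BC, F_CA);
}
```

In this model replay_three_philosophers() ends in VIOLATION_NO_PAIR: the chain fCA, fAB, fBC implies "fCA before fBC", but no thread ever took fBC while holding fCA, so there is no recorded pair to print stacks from. A plain two-lock inversion ends in VIOLATION_WITH_PAIR instead.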