|
From: Tom H. <th...@cy...> - 2007-11-05 03:23:35
|
Nightly build on dellow ( x86_64, Fedora 7 ) started at 2007-11-05 03:10:04 GMT
Results differ from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 320 tests, 4 stderr failures, 2 stdout failures, 0 post failures ==
memcheck/tests/pointer-trace (stderr)
memcheck/tests/vcpu_fnfns (stdout)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/xml1 (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
=================================================
== Results from 24 hours ago ==
=================================================
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 320 tests, 4 stderr failures, 3 stdout failures, 0 post failures ==
memcheck/tests/pointer-trace (stderr)
memcheck/tests/vcpu_fnfns (stdout)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/xml1 (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
none/tests/pth_cvsimple (stdout)
=================================================
== Difference between 24 hours ago and now ==
=================================================
*** old.short Mon Nov 5 03:16:56 2007
--- new.short Mon Nov 5 03:23:36 2007
***************
*** 8,10 ****
! == 320 tests, 4 stderr failures, 3 stdout failures, 0 post failures ==
  memcheck/tests/pointer-trace (stderr)
--- 8,10 ----
! == 320 tests, 4 stderr failures, 2 stdout failures, 0 post failures ==
  memcheck/tests/pointer-trace (stderr)
***************
*** 15,17 ****
  none/tests/mremap2 (stdout)
- none/tests/pth_cvsimple (stdout)
--- 15,16 ---- |
|
From: <sv...@va...> - 2007-11-05 03:17:08
|
Author: sewardj
Date: 2007-11-05 03:17:07 +0000 (Mon, 05 Nov 2007)
New Revision: 7095
Log:
Update expected outputs for glibc25-x86.
Added:
branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stderr.exp-glibc25-x86
branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-x86
Modified:
branches/THRCHECK/thrcheck/tests/Makefile.am
branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-x86
Modified: branches/THRCHECK/thrcheck/tests/Makefile.am
===================================================================
--- branches/THRCHECK/thrcheck/tests/Makefile.am 2007-11-05 03:00:05 UTC (rev 7094)
+++ branches/THRCHECK/thrcheck/tests/Makefile.am 2007-11-05 03:17:07 UTC (rev 7095)
@@ -78,8 +78,10 @@
tc21_pthonce.stderr.exp-glibc25-x86 \
tc22_exit_w_lock.vgtest tc22_exit_w_lock.stdout.exp \
tc22_exit_w_lock.stderr.exp-glibc25-amd64 \
+ tc22_exit_w_lock.stderr.exp-glibc25-x86 \
tc23_bogus_condwait.vgtest tc23_bogus_condwait.stdout.exp \
- tc23_bogus_condwait.stderr.exp-glibc25-amd64
+ tc23_bogus_condwait.stderr.exp-glibc25-amd64 \
+ tc23_bogus_condwait.stderr.exp-glibc25-x86
check_PROGRAMS = \
hg01_all_ok \
Modified: branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-x86
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-x86 2007-11-05 03:00:05 UTC (rev 7094)
+++ branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-x86 2007-11-05 03:17:07 UTC (rev 7095)
@@ -68,12 +68,9 @@
---------------- pthread_cond_wait et al ----------------
-Thread #1 unlocked a not-locked lock at 0x........
+Thread #1: pthread_cond_{timed}wait called with un-held mutex
at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:147)
- Lock at 0x........ was first observed
- at 0x........: pthread_mutex_init (tc_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:145)
Thread #1's call to pthread_cond_wait failed
with error code 1 (EPERM: Operation not permitted)
@@ -86,6 +83,10 @@
FIXME: can't figure out how to verify wrap of pthread_broadcast_signal
+Thread #1: pthread_cond_{timed}wait called with un-held mutex
+ at 0x........: pthread_cond_timedwait@* (tc_intercepts.c:...)
+ by 0x........: main (tc20_verifywrap.c:165)
+
Thread #1's call to pthread_cond_timedwait failed
with error code 22 (EINVAL: Invalid argument)
at 0x........: pthread_cond_timedwait@* (tc_intercepts.c:...)
@@ -146,15 +147,9 @@
Thread #1 deallocated location 0x........ containing a locked lock
- at 0x........: main (tc20_verifywrap.c:262)
+ at 0x........: main (tc20_verifywrap.c:261)
Lock at 0x........ was first observed
at 0x........: pthread_rwlock_init (tc_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:216)
-Thread #1 deallocated location 0x........ containing a locked lock
- at 0x........: main (tc20_verifywrap.c:262)
- Lock at 0x........ was first observed
- at 0x........: pthread_mutex_init (tc_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:145)
-
ERROR SUMMARY: 20 errors from 20 contexts (suppressed: 0 from 0)
Added: branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stderr.exp-glibc25-x86
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stderr.exp-glibc25-x86 (rev 0)
+++ branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stderr.exp-glibc25-x86 2007-11-05 03:17:07 UTC (rev 7095)
@@ -0,0 +1,46 @@
+
+Thread #2 was created
+ at 0x........: clone (in /...libc...)
+ by 0x........: pthread_create@GLIBC_ (in /lib/libpthread...)
+ by 0x........: pthread_create@* (tc_intercepts.c:...)
+ by 0x........: main (tc22_exit_w_lock.c:39)
+
+Thread #2: Exiting thread still holds 2 locks
+ at 0x........: start_thread (in /lib/libpthread...)
+ by 0x........: ...
+
+Thread #1 is the program's root thread
+
+Possible data race during write of size 4 at 0x........
+ at 0x........: mempcpy (in /lib/ld-2.5.so)
+ by 0x........: pthread_create@GLIBC_ (in /lib/libpthread...)
+ by 0x........: pthread_create@* (tc_intercepts.c:...)
+ by 0x........: main (tc22_exit_w_lock.c:42)
+ Old state: owned exclusively by thread #2
+ New state: shared-modified by threads #1, #2
+ Reason: this thread, #1, holds no locks at all
+
+Possible data race during write of size 4 at 0x........
+ at 0x........: memset (in /lib/ld-2.5.so)
+ by 0x........: pthread_create@GLIBC_ (in /lib/libpthread...)
+ by 0x........: pthread_create@* (tc_intercepts.c:...)
+ by 0x........: main (tc22_exit_w_lock.c:42)
+ Old state: owned exclusively by thread #2
+ New state: shared-modified by threads #1, #2
+ Reason: this thread, #1, holds no locks at all
+
+Thread #3 was created
+ at 0x........: clone (in /...libc...)
+ by 0x........: pthread_create@GLIBC_ (in /lib/libpthread...)
+ by 0x........: pthread_create@* (tc_intercepts.c:...)
+ by 0x........: main (tc22_exit_w_lock.c:42)
+
+Thread #3: Exiting thread still holds 1 lock
+ at 0x........: start_thread (in /lib/libpthread...)
+ by 0x........: ...
+
+Thread #1: Exiting thread still holds 1 lock
+ at 0x........: kill (in /...libc...)
+ by 0x........: (below main) (in /...libc...)
+
+ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 0 from 0)
Added: branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-x86
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-x86 (rev 0)
+++ branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-x86 2007-11-05 03:17:07 UTC (rev 7095)
@@ -0,0 +1,30 @@
+
+Thread #1 is the program's root thread
+
+Thread #1: pthread_cond_{timed}wait called with invalid mutex
+ at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:62)
+
+Thread #1: pthread_cond_{timed}wait called with un-held mutex
+ at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:65)
+
+Thread #1: pthread_cond_{timed}wait called with mutex of type pthread_rwlock_t*
+ at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:68)
+
+Thread #3 was created
+ at 0x........: clone (in /...libc...)
+ by 0x........: pthread_create@GLIBC_ (in /lib/libpthread...)
+ by 0x........: pthread_create@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:71)
+
+Thread #3: Exiting thread still holds 1 lock
+ at 0x........: start_thread (in /lib/libpthread...)
+ by 0x........: ...
+
+Thread #1: pthread_cond_{timed}wait called with mutex held by a different thread
+ at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:73)
+
+ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 0 from 0)
|
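The expected outputs above cover pthread_cond_timedwait as well as pthread_cond_wait. As background, POSIX requires pthread_cond_timedwait to re-acquire the mutex before it returns, even when it times out. The following is a minimal sketch of that behaviour, not code from the test suite: timed_wait_demo is a hypothetical helper, and it uses an error-checking mutex only so that the final unlock can confirm ownership.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: wait on a condition variable with a deadline
   that has already passed.  Returns the pthread_cond_timedwait result
   and sets *still_held to 1 if the calling thread owns the mutex
   after the wait returns. */
int timed_wait_demo(int *still_held)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mx;
    pthread_cond_t cv;
    struct timespec deadline;
    int r, ret;

    r = pthread_mutexattr_init(&attr);                            assert(!r);
    r = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK); assert(!r);
    r = pthread_mutex_init(&mx, &attr);                           assert(!r);
    r = pthread_cond_init(&cv, NULL);                             assert(!r);

    /* Deadline is "now", so nobody can signal in time. */
    r = clock_gettime(CLOCK_REALTIME, &deadline);                 assert(!r);

    r = pthread_mutex_lock(&mx);                                  assert(!r);
    ret = pthread_cond_timedwait(&cv, &mx, &deadline);

    /* With an error-checking mutex, unlock succeeds only if this
       thread is the owner, so r == 0 shows the mutex was re-acquired. */
    r = pthread_mutex_unlock(&mx);
    *still_held = (r == 0);

    pthread_cond_destroy(&cv);
    pthread_mutex_destroy(&mx);
    pthread_mutexattr_destroy(&attr);
    return ret;
}
```

A caller would typically see ETIMEDOUT with the mutex still held, which is why a wrapper has to treat the timeout case like a successful return as far as lock ownership is concerned.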
|
From: Tom H. <th...@cy...> - 2007-11-05 03:11:03
|
Nightly build on gill ( x86_64, Fedora Core 2 ) started at 2007-11-05 03:00:02 GMT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 322 tests, 6 stderr failures, 1 stdout failure, 0 post failures ==
memcheck/tests/pointer-trace (stderr)
memcheck/tests/stack_switch (stderr)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/x86/scalar_supp (stderr)
none/tests/fdleak_fcntl (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout) |
|
From: <sv...@va...> - 2007-11-05 03:00:04
|
Author: sewardj
Date: 2007-11-05 03:00:05 +0000 (Mon, 05 Nov 2007)
New Revision: 7094
Log:
More expected-output wibbling. Sigh.
Modified:
branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-amd64
Modified: branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-amd64
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-amd64 2007-11-05 02:46:08 UTC (rev 7093)
+++ branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-amd64 2007-11-05 03:00:05 UTC (rev 7094)
@@ -148,7 +148,7 @@
Thread #1 deallocated location 0x........ containing a locked lock
- at 0x........: main (tc20_verifywrap.c:263)
+ at 0x........: main (tc20_verifywrap.c:262)
Lock at 0x........ was first observed
at 0x........: pthread_rwlock_init (tc_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:216)
|
|
From: <sv...@va...> - 2007-11-05 02:46:07
|
Author: sewardj
Date: 2007-11-05 02:46:08 +0000 (Mon, 05 Nov 2007)
New Revision: 7093
Log:
More output changes and expected-output changes pertaining to mutex
checking for pthread_cond_wait.
Modified:
branches/THRCHECK/thrcheck/tc_main.c
branches/THRCHECK/thrcheck/tests/tc20_verifywrap.c
branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-amd64
branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-amd64
Modified: branches/THRCHECK/thrcheck/tc_main.c
===================================================================
--- branches/THRCHECK/thrcheck/tc_main.c 2007-11-05 02:26:31 UTC (rev 7092)
+++ branches/THRCHECK/thrcheck/tc_main.c 2007-11-05 02:46:08 UTC (rev 7093)
@@ -6058,25 +6058,25 @@
lk_valid = False;
record_error_Misc(
thr,
- "pthread_cond_wait called with invalid mutex" );
+ "pthread_cond_{timed}wait called with invalid mutex" );
} else {
tl_assert( is_sane_LockN(lk) );
if (lk->kind == LK_rdwr) {
lk_valid = False;
record_error_Misc(
- thr, "pthread_cond_wait called with mutex "
+ thr, "pthread_cond_{timed}wait called with mutex "
"of type pthread_rwlock_t*" );
} else
if (lk->heldBy == NULL) {
lk_valid = False;
record_error_Misc(
- thr, "pthread_cond_wait called with un-held mutex");
+ thr, "pthread_cond_{timed}wait called with un-held mutex");
} else
if (lk->heldBy != NULL
&& TC_(elemBag)( lk->heldBy, (Word)thr ) == 0) {
lk_valid = False;
record_error_Misc(
- thr, "pthread_cond_wait called with mutex "
+ thr, "pthread_cond_{timed}wait called with mutex "
"held by a different thread" );
}
}
Modified: branches/THRCHECK/thrcheck/tests/tc20_verifywrap.c
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc20_verifywrap.c 2007-11-05 02:26:31 UTC (rev 7092)
+++ branches/THRCHECK/thrcheck/tests/tc20_verifywrap.c 2007-11-05 02:46:08 UTC (rev 7093)
@@ -255,7 +255,6 @@
/* At this point it should complain about deallocation
of memory containing locked locks:
- mx4
rwl3
*/
Modified: branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-amd64
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-amd64 2007-11-05 02:26:31 UTC (rev 7092)
+++ branches/THRCHECK/thrcheck/tests/tc20_verifywrap.stderr.exp-glibc25-amd64 2007-11-05 02:46:08 UTC (rev 7093)
@@ -69,12 +69,9 @@
---------------- pthread_cond_wait et al ----------------
-Thread #1 unlocked a not-locked lock at 0x........
+Thread #1: pthread_cond_{timed}wait called with un-held mutex
at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:147)
- Lock at 0x........ was first observed
- at 0x........: pthread_mutex_init (tc_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:145)
Thread #1's call to pthread_cond_wait failed
with error code 1 (EPERM: Operation not permitted)
@@ -87,6 +84,10 @@
FIXME: can't figure out how to verify wrap of pthread_broadcast_signal
+Thread #1: pthread_cond_{timed}wait called with un-held mutex
+ at 0x........: pthread_cond_timedwait@* (tc_intercepts.c:...)
+ by 0x........: main (tc20_verifywrap.c:165)
+
Thread #1's call to pthread_cond_timedwait failed
with error code 22 (EINVAL: Invalid argument)
at 0x........: pthread_cond_timedwait@* (tc_intercepts.c:...)
@@ -152,10 +153,4 @@
at 0x........: pthread_rwlock_init (tc_intercepts.c:...)
by 0x........: main (tc20_verifywrap.c:216)
-Thread #1 deallocated location 0x........ containing a locked lock
- at 0x........: main (tc20_verifywrap.c:263)
- Lock at 0x........ was first observed
- at 0x........: pthread_mutex_init (tc_intercepts.c:...)
- by 0x........: main (tc20_verifywrap.c:145)
-
ERROR SUMMARY: 20 errors from 20 contexts (suppressed: 0 from 0)
Modified: branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-amd64
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-amd64 2007-11-05 02:26:31 UTC (rev 7092)
+++ branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-amd64 2007-11-05 02:46:08 UTC (rev 7093)
@@ -1,15 +1,15 @@
Thread #1 is the program's root thread
-Thread #1: pthread_cond_wait called with invalid mutex
+Thread #1: pthread_cond_{timed}wait called with invalid mutex
at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
by 0x........: main (tc23_bogus_condwait.c:62)
-Thread #1: pthread_cond_wait called with un-held mutex
+Thread #1: pthread_cond_{timed}wait called with un-held mutex
at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
by 0x........: main (tc23_bogus_condwait.c:65)
-Thread #1: pthread_cond_wait called with mutex of type pthread_rwlock_t*
+Thread #1: pthread_cond_{timed}wait called with mutex of type pthread_rwlock_t*
at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
by 0x........: main (tc23_bogus_condwait.c:68)
@@ -24,7 +24,7 @@
at 0x........: start_thread (in /lib/libpthread...)
by 0x........: ...
-Thread #1: pthread_cond_wait called with mutex held by a different thread
+Thread #1: pthread_cond_{timed}wait called with mutex held by a different thread
at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
by 0x........: main (tc23_bogus_condwait.c:73)
|
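The order of checks in the tc_main.c hunk above (invalid mutex, then rwlock, then un-held, then held by another thread) can be modelled outside Valgrind. This is only a sketch under stated assumptions: ModelLock, ModelVerdict and classify_condwait_mutex are hypothetical names, and a single holder id stands in for the tool's bag of holding threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tool's lock bookkeeping. */
typedef enum { MODEL_LK_MUTEX, MODEL_LK_RDWR } ModelLockKind;

typedef struct {
    ModelLockKind kind;
    int held;     /* non-zero if some thread holds the lock */
    int holder;   /* id of the holding thread; meaningful only if held */
} ModelLock;

typedef enum {
    VERDICT_OK,
    VERDICT_INVALID_MUTEX,
    VERDICT_IS_RWLOCK,
    VERDICT_UNHELD,
    VERDICT_HELD_BY_OTHER
} ModelVerdict;

/* Classify the mutex argument of a cond-wait made by thread 'self'.
   'lk' is NULL when no lock is known at the given address.  Only the
   first problem found is reported, mirroring "only complain once". */
ModelVerdict classify_condwait_mutex(const ModelLock *lk, int self)
{
    if (lk == NULL)
        return VERDICT_INVALID_MUTEX;
    if (lk->kind == MODEL_LK_RDWR)
        return VERDICT_IS_RWLOCK;
    assert(lk->kind == MODEL_LK_MUTEX);
    if (!lk->held)
        return VERDICT_UNHELD;
    if (lk->holder != self)
        return VERDICT_HELD_BY_OTHER;
    return VERDICT_OK;
}
```

Each non-OK verdict corresponds to one of the four "pthread_cond_{timed}wait called with ..." messages; only VERDICT_OK corresponds to the handler returning True.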
|
From: <sv...@va...> - 2007-11-05 02:26:29
|
Author: sewardj
Date: 2007-11-05 02:26:31 +0000 (Mon, 05 Nov 2007)
New Revision: 7092
Log:
Add a test for detection of passing bogus mutex values to
pthread_cond_wait.
Added:
branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.c
branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-amd64
branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stdout.exp
branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.vgtest
Modified:
branches/THRCHECK/thrcheck/tests/Makefile.am
Modified: branches/THRCHECK/thrcheck/tests/Makefile.am
===================================================================
--- branches/THRCHECK/thrcheck/tests/Makefile.am 2007-11-05 02:11:57 UTC (rev 7091)
+++ branches/THRCHECK/thrcheck/tests/Makefile.am 2007-11-05 02:26:31 UTC (rev 7092)
@@ -77,7 +77,9 @@
tc21_pthonce.stderr.exp-glibc25-amd64 \
tc21_pthonce.stderr.exp-glibc25-x86 \
tc22_exit_w_lock.vgtest tc22_exit_w_lock.stdout.exp \
- tc22_exit_w_lock.stderr.exp-glibc25-amd64
+ tc22_exit_w_lock.stderr.exp-glibc25-amd64 \
+ tc23_bogus_condwait.vgtest tc23_bogus_condwait.stdout.exp \
+ tc23_bogus_condwait.stderr.exp-glibc25-amd64
check_PROGRAMS = \
hg01_all_ok \
@@ -107,7 +109,8 @@
tc19_shadowmem \
tc20_verifywrap \
tc21_pthonce \
- tc22_exit_w_lock
+ tc22_exit_w_lock \
+ tc23_bogus_condwait
AM_CPPFLAGS = -I$(top_srcdir) -I$(top_srcdir)/include \
-I$(top_srcdir)/coregrind -I$(top_builddir)/include \
Added: branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.c
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.c (rev 0)
+++ branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.c 2007-11-05 02:26:31 UTC (rev 7092)
@@ -0,0 +1,78 @@
+
+/* Expect 5 errors total (4 re cvs, 1 re exiting w/lock.).
+ Tests passing bogus mutexes to pthread_cond_wait. */
+
+#include <pthread.h>
+#include <assert.h>
+#include <unistd.h>
+
+pthread_mutex_t mx[4];
+pthread_cond_t cv;
+pthread_rwlock_t rwl;
+
+void* rescue_me ( void* uu )
+{
+ /* wait for, and unblock, the first wait */
+ sleep(1);
+ pthread_cond_signal( &cv );
+
+ /* wait for, and unblock, the second wait */
+ sleep(1);
+ pthread_cond_signal( &cv );
+
+ /* wait for, and unblock, the third wait */
+ sleep(1);
+ pthread_cond_signal( &cv );
+
+ /* wait for grabber and main, then unblock the fourth wait */
+ sleep(1+1);
+ pthread_cond_signal( &cv );
+
+ return NULL;
+}
+
+void* grab_the_lock ( void* uu )
+{
+ int r= pthread_mutex_lock( &mx[2] ); assert(!r);
+ /* tc correctly complains that the thread is exiting whilst still
+ holding a lock. A bit tricky to fix - we just live with it. */
+ return NULL;
+}
+
+int main ( void )
+{
+ int r;
+ pthread_t my_rescuer, grabber;
+
+ r= pthread_mutex_init(&mx[0], NULL); assert(!r);
+ r= pthread_mutex_init(&mx[1], NULL); assert(!r);
+ r= pthread_mutex_init(&mx[2], NULL); assert(!r);
+ r= pthread_mutex_init(&mx[3], NULL); assert(!r);
+
+ r= pthread_cond_init(&cv, NULL); assert(!r);
+ r= pthread_rwlock_init(&rwl, NULL); assert(!r);
+
+ r= pthread_create( &my_rescuer, NULL, rescue_me, NULL );
+ assert(!r);
+
+ /* Do stupid things and hope that rescue_me gets us out of
+ trouble */
+
+ /* mx is bogus */
+ r= pthread_cond_wait(&cv, (pthread_mutex_t*)(1 + (char*)&mx[0]) );
+
+ /* mx is not locked */
+ r= pthread_cond_wait(&cv, &mx[0]);
+
+ /* wrong flavour of lock */
+ r= pthread_cond_wait(&cv, (pthread_mutex_t*)&rwl );
+
+ /* mx is held by someone else. */
+ r= pthread_create( &grabber, NULL, grab_the_lock, NULL ); assert(!r);
+ sleep(1); /* let the grabber get there first */
+ r= pthread_cond_wait(&cv, &mx[2] );
+
+ r= pthread_join( my_rescuer, NULL ); assert(!r);
+ r= pthread_join( grabber, NULL ); assert(!r);
+ return 0;
+}
Added: branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-amd64
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-amd64 (rev 0)
+++ branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stderr.exp-glibc25-amd64 2007-11-05 02:26:31 UTC (rev 7092)
@@ -0,0 +1,31 @@
+
+Thread #1 is the program's root thread
+
+Thread #1: pthread_cond_wait called with invalid mutex
+ at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:62)
+
+Thread #1: pthread_cond_wait called with un-held mutex
+ at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:65)
+
+Thread #1: pthread_cond_wait called with mutex of type pthread_rwlock_t*
+ at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:68)
+
+Thread #3 was created
+ at 0x........: clone (in /...libc...)
+ by 0x........: ...
+ by 0x........: pthread_create@GLIBC_ (in /lib/libpthread...)
+ by 0x........: pthread_create@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:71)
+
+Thread #3: Exiting thread still holds 1 lock
+ at 0x........: start_thread (in /lib/libpthread...)
+ by 0x........: ...
+
+Thread #1: pthread_cond_wait called with mutex held by a different thread
+ at 0x........: pthread_cond_wait@* (tc_intercepts.c:...)
+ by 0x........: main (tc23_bogus_condwait.c:73)
+
+ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 0 from 0)
Added: branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.stdout.exp
===================================================================
Added: branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.vgtest
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.vgtest (rev 0)
+++ branches/THRCHECK/thrcheck/tests/tc23_bogus_condwait.vgtest 2007-11-05 02:26:31 UTC (rev 7092)
@@ -0,0 +1 @@
+prog: tc23_bogus_condwait
|
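For contrast with the deliberately bogus calls in tc23_bogus_condwait.c, here is a minimal sketch of a well-formed wait: the caller holds the mutex, and the wait sits in a loop that re-checks a predicate. It is not part of the commit; wait_for_ready, signaller and ready are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t mx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static int ready = 0;   /* predicate protected by mx */

/* Set the predicate under the mutex, then signal the waiter. */
static void *signaller(void *unused)
{
    int r;
    (void)unused;
    r = pthread_mutex_lock(&mx);    assert(!r);
    ready = 1;
    r = pthread_cond_signal(&cv);   assert(!r);
    r = pthread_mutex_unlock(&mx);  assert(!r);
    return NULL;
}

/* Start the signaller and wait for it.  The mutex is held across the
   pthread_cond_wait call, which releases it while blocked and
   re-acquires it before returning.  Returns the predicate value seen. */
int wait_for_ready(void)
{
    pthread_t t;
    int r, seen;

    r = pthread_mutex_lock(&mx);                     assert(!r);
    ready = 0;
    r = pthread_create(&t, NULL, signaller, NULL);   assert(!r);
    while (!ready) {                 /* loop guards against spurious wakeups */
        r = pthread_cond_wait(&cv, &mx);             assert(!r);
    }
    seen = ready;
    r = pthread_mutex_unlock(&mx);                   assert(!r);
    r = pthread_join(t, NULL);                       assert(!r);
    return seen;
}
```

Because the signaller can only set the predicate while it holds the mutex, the waiter cannot miss the signal, and no sleep() calls are needed to order the threads.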
|
From: Julian S. <js...@ac...> - 2007-11-05 02:24:05
|
> The last line is the problem, it shouldn't look like that. I'm using the
> address 0 specially to refer to all the allocation functions. Could
> VG_(get_StackTrace)() be returning 0 as an address at the bottom of the
> stack trace?
Well, it's plausible. The PowerPC ELF ABI specification says that
"The back chain word of the first stack frame contains a null pointer (0)."
So whether a zero back chain word can somehow give rise to a zero
return address, I dunno. Maybe. (abi_elf_ppc64_1.7.pdf page 29).
Presumably the ppc32 ELF ABI is similar, at least in this aspect.
> If so, I'll have to find a different way to encode the
> allocation functions... maybe "(Addr)(-1L)".
Is it possible/easy to encode them some way that doesn't involve any
magic Addr values?
J |
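One way to avoid magic Addr values, as the question above asks, is to carry an explicit tag next to the address. The following is only an illustrative sketch and not Massif's actual data structure: ModelAddr, NodeKind, NodeKey and the helper functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t ModelAddr;   /* hypothetical stand-in for Addr */

/* The tag says what the node represents, so no address value
   (0, -1, ...) needs to be reserved as a sentinel. */
typedef enum { NODE_ALLOC_FNS, NODE_CODE_ADDR } NodeKind;

typedef struct {
    NodeKind  kind;
    ModelAddr ip;   /* meaningful only when kind == NODE_CODE_ADDR */
} NodeKey;

/* Key for the "(heap allocation functions)" node. */
NodeKey key_for_alloc_fns(void)
{
    NodeKey k = { NODE_ALLOC_FNS, 0 };
    return k;
}

/* Key for an ordinary stack-trace entry; any address is allowed. */
NodeKey key_for_addr(ModelAddr ip)
{
    NodeKey k = { NODE_CODE_ADDR, ip };
    return k;
}

int key_is_alloc_fns(NodeKey k)
{
    return k.kind == NODE_ALLOC_FNS;
}

/* Two keys match if the tags agree and, for code addresses, the
   addresses agree as well. */
int key_equal(NodeKey a, NodeKey b)
{
    if (a.kind != b.kind)
        return 0;
    return a.kind == NODE_ALLOC_FNS || a.ip == b.ip;
}
```

With this encoding, a stack trace that really does end in address 0 produces a NODE_CODE_ADDR key and can no longer be mistaken for the allocation-functions node. The cost is one extra field per node.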
|
From: Nicholas N. <nj...@cs...> - 2007-11-05 02:15:34
|
On Sun, 4 Nov 2007, Julian Seward wrote:
>> But the result for deep-D looks strange. Can you send me the massif.out
>> file for it? Thanks.
>
> Attached.
This snapshot is strange:
#-----------
snapshot=9
#-----------
time=972
mem_heap_B=900
mem_heap_admin_B=72
mem_stacks_B=0
heap_tree=detailed
n1: 900 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 n1: 900 0xFE6737B: (within /lib/libc-2.6.1.so)
  n1: 900 0xFE6759F: (below main) (in /lib/libc-2.6.1.so)
   n0: 900 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
The last line is the problem, it shouldn't look like that. I'm using the
address 0 specially to refer to all the allocation functions. Could
VG_(get_StackTrace)() be returning 0 as an address at the bottom of the
stack trace? If so, I'll have to find a different way to encode the
allocation functions... maybe "(Addr)(-1L)".
Nick
|
|
From: <sv...@va...> - 2007-11-05 02:11:58
|
Author: sewardj
Date: 2007-11-05 02:11:57 +0000 (Mon, 05 Nov 2007)
New Revision: 7091
Log:
Add machinery to check for bogus mutex arguments to pthread_cond_wait,
since the documentation claims that functionality is provided :-)
Modified:
branches/THRCHECK/thrcheck/tc_intercepts.c
branches/THRCHECK/thrcheck/tc_main.c
Modified: branches/THRCHECK/thrcheck/tc_intercepts.c
===================================================================
--- branches/THRCHECK/thrcheck/tc_intercepts.c 2007-11-05 02:10:33 UTC (rev 7090)
+++ branches/THRCHECK/thrcheck/tc_intercepts.c 2007-11-05 02:11:57 UTC (rev 7091)
@@ -86,6 +86,19 @@
_arg1,_arg2,0,0,0); \
} while (0)
+#define DO_CREQ_W_WW(_resF, _creqF, _ty1F,_arg1F, _ty2F,_arg2F) \
+ do { \
+ Word _res, _arg1, _arg2; \
+ assert(sizeof(_ty1F) == sizeof(Word)); \
+ assert(sizeof(_ty2F) == sizeof(Word)); \
+ _arg1 = (Word)(_arg1F); \
+ _arg2 = (Word)(_arg2F); \
+ VALGRIND_DO_CLIENT_REQUEST(_res, 2, \
+ (_creqF), \
+ _arg1,_arg2,0,0,0); \
+ _resF = _res; \
+ } while (0)
+
#define DO_CREQ_v_WWW(_creqF, _ty1F,_arg1F, \
_ty2F,_arg2F, _ty3F, _arg3F) \
do { \
@@ -525,6 +538,8 @@
{
int ret;
OrigFn fn;
+ unsigned long mutex_is_valid;
+
VALGRIND_GET_ORIG_FN(fn);
if (TRACE_PTH_FNS) {
@@ -532,26 +547,39 @@
fflush(stderr);
}
+ /* Tell the tool a cond-wait is about to happen, so it can check
+ for bogus argument values. In return it tells us whether it
+ thinks the mutex is valid or not. */
+ DO_CREQ_W_WW(mutex_is_valid,
+ _VG_USERREQ__TC_PTHREAD_COND_WAIT_PRE,
+ pthread_cond_t*,cond, pthread_mutex_t*,mutex);
+ assert(mutex_is_valid == 1 || mutex_is_valid == 0);
+
/* Tell the tool we're about to drop the mutex. This reflects the
fact that in a cond_wait, we show up holding the mutex, and the
call atomically drops the mutex and waits for the cv to be
signalled. */
- DO_CREQ_v_W(_VG_USERREQ__TC_PTHREAD_MUTEX_UNLOCK_PRE,
- pthread_mutex_t*,mutex);
+ if (mutex_is_valid) {
+ DO_CREQ_v_W(_VG_USERREQ__TC_PTHREAD_MUTEX_UNLOCK_PRE,
+ pthread_mutex_t*,mutex);
+ }
CALL_FN_W_WW(ret, fn, cond,mutex);
- /* And now we have the mutex again, regardless of the error code
- returned. */
- // FIXME: but only if we actually had it before the call
- DO_CREQ_v_W(_VG_USERREQ__TC_PTHREAD_MUTEX_LOCK_POST,
- pthread_mutex_t*,mutex);
+ /* these conditionals look stupid, but compare w/ same logic for
+ pthread_cond_timedwait below */
+ if (ret == 0 && mutex_is_valid) {
+ /* and now we have the mutex again */
+ DO_CREQ_v_W(_VG_USERREQ__TC_PTHREAD_MUTEX_LOCK_POST,
+ pthread_mutex_t*,mutex);
+ }
- if (ret == 0) {
+ if (ret == 0 && mutex_is_valid) {
DO_CREQ_v_WW(_VG_USERREQ__TC_PTHREAD_COND_WAIT_POST,
pthread_cond_t*,cond, pthread_mutex_t*,mutex);
+ }
- } else {
+ if (ret != 0) {
DO_PthAPIerror( "pthread_cond_wait", ret );
}
@@ -570,6 +598,7 @@
{
int ret;
OrigFn fn;
+ unsigned long mutex_is_valid;
VALGRIND_GET_ORIG_FN(fn);
if (TRACE_PTH_FNS) {
@@ -578,29 +607,38 @@
fflush(stderr);
}
+ /* Tell the tool a cond-wait is about to happen, so it can check
+ for bogus argument values. In return it tells us whether it
+ thinks the mutex is valid or not. */
+ DO_CREQ_W_WW(mutex_is_valid,
+ _VG_USERREQ__TC_PTHREAD_COND_WAIT_PRE,
+ pthread_cond_t*,cond, pthread_mutex_t*,mutex);
+ assert(mutex_is_valid == 1 || mutex_is_valid == 0);
+
/* Tell the tool we're about to drop the mutex. This reflects the
fact that in a cond_wait, we show up holding the mutex, and the
call atomically drops the mutex and waits for the cv to be
signalled. */
- DO_CREQ_v_W(_VG_USERREQ__TC_PTHREAD_MUTEX_UNLOCK_PRE,
- pthread_mutex_t*,mutex);
+ if (mutex_is_valid) {
+ DO_CREQ_v_W(_VG_USERREQ__TC_PTHREAD_MUTEX_UNLOCK_PRE,
+ pthread_mutex_t*,mutex);
+ }
CALL_FN_W_WWW(ret, fn, cond,mutex,abstime);
- /* And now we have the mutex again, regardless of the error code
- returned. In particular we still have it even if
- ret==ETIMEDOUT. */
- // FIXME: but only if we actually had it before the call
- DO_CREQ_v_W(_VG_USERREQ__TC_PTHREAD_MUTEX_LOCK_POST,
- pthread_mutex_t*,mutex);
+ if ((ret == 0 || ret == ETIMEDOUT) && mutex_is_valid) {
+ /* and now we have the mutex again */
+ DO_CREQ_v_W(_VG_USERREQ__TC_PTHREAD_MUTEX_LOCK_POST,
+ pthread_mutex_t*,mutex);
+ }
- if (ret == 0) {
+ if (ret == 0 && mutex_is_valid) {
DO_CREQ_v_WW(_VG_USERREQ__TC_PTHREAD_COND_WAIT_POST,
pthread_cond_t*,cond, pthread_mutex_t*,mutex);
+ }
- } else {
- if (ret != ETIMEDOUT)
- DO_PthAPIerror( "pthread_cond_timedwait", ret );
+ if (ret != 0 && ret != ETIMEDOUT) {
+ DO_PthAPIerror( "pthread_cond_timedwait", ret );
}
if (TRACE_PTH_FNS) {
Modified: branches/THRCHECK/thrcheck/tc_main.c
===================================================================
--- branches/THRCHECK/thrcheck/tc_main.c 2007-11-05 02:10:33 UTC (rev 7090)
+++ branches/THRCHECK/thrcheck/tc_main.c 2007-11-05 02:11:57 UTC (rev 7091)
@@ -5918,7 +5918,7 @@
&& TC_(elemBag)( lk->heldBy, (Word)thr ) > 0 ) {
/* uh, it's a non-recursive lock and we already w-hold it, and
this is a real lock operation (not a speculative "tryLock"
- kind of thing. Duh. Deadlock coming up; but at least
+ kind of thing). Duh. Deadlock coming up; but at least
produce an error message. */
record_error_Misc( thr, "Attempt to re-lock a "
"non-recursive lock I already hold" );
@@ -6031,10 +6031,15 @@
}
}
-static void evh__TC_PTHREAD_COND_WAIT_PRE ( ThreadId tid,
+/* returns True if it reckons 'mutex' is valid and held by this
+ thread, else False */
+static Bool evh__TC_PTHREAD_COND_WAIT_PRE ( ThreadId tid,
void* cond, void* mutex )
{
Thread* thr;
+ Lock* lk;
+ Bool lk_valid = True;
+
if (SHOW_EVENTS >= 1)
VG_(printf)("evh__tc_PTHREAD_COND_WAIT_PRE"
"(ctid=%d, cond=%p, mutex=%p)\n",
@@ -6044,7 +6049,41 @@
thr = map_threads_maybe_lookup( tid );
tl_assert(thr); /* cannot fail - Thread* must already exist */
+ lk = map_locks_maybe_lookup( (Addr)mutex );
+
+ /* Check for stupid mutex arguments. There are various ways to be
+ a bozo. Only complain once, though, even if more than one thing
+ is wrong. */
+ if (lk == NULL) {
+ lk_valid = False;
+ record_error_Misc(
+ thr,
+ "pthread_cond_wait called with invalid mutex" );
+ } else {
+ tl_assert( is_sane_LockN(lk) );
+ if (lk->kind == LK_rdwr) {
+ lk_valid = False;
+ record_error_Misc(
+ thr, "pthread_cond_wait called with mutex "
+ "of type pthread_rwlock_t*" );
+ } else
+ if (lk->heldBy == NULL) {
+ lk_valid = False;
+ record_error_Misc(
+ thr, "pthread_cond_wait called with un-held mutex");
+ } else
+ if (lk->heldBy != NULL
+ && TC_(elemBag)( lk->heldBy, (Word)thr ) == 0) {
+ lk_valid = False;
+ record_error_Misc(
+ thr, "pthread_cond_wait called with mutex "
+ "held by a different thread" );
+ }
+ }
+
// error-if: cond is also associated with a different mutex
+
+ return lk_valid;
}
static void evh__TC_PTHREAD_COND_WAIT_POST ( ThreadId tid,
@@ -7477,17 +7516,22 @@
evh__TC_PTHREAD_COND_SIGNAL_PRE( tid, (void*)args[1] );
break;
- /* Entry into pthread_cond_wait, cond=arg[1], mutex=arg[2] */
- case _VG_USERREQ__TC_PTHREAD_COND_WAIT_PRE:
- evh__TC_PTHREAD_COND_WAIT_PRE( tid,
- (void*)args[1], (void*)args[2] );
+ /* Entry into pthread_cond_wait, cond=arg[1], mutex=arg[2].
+ Returns a flag indicating whether or not the mutex is believed to be
+ valid for this operation. */
+ case _VG_USERREQ__TC_PTHREAD_COND_WAIT_PRE: {
+ Bool mutex_is_valid
+ = evh__TC_PTHREAD_COND_WAIT_PRE( tid, (void*)args[1],
+ (void*)args[2] );
+ *ret = mutex_is_valid ? 1 : 0;
break;
+ }
/* Thread successfully completed pthread_cond_wait, cond=arg[1],
mutex=arg[2] */
case _VG_USERREQ__TC_PTHREAD_COND_WAIT_POST:
evh__TC_PTHREAD_COND_WAIT_POST( tid,
- (void*)args[1], (void*)args[2] );
+ (void*)args[1], (void*)args[2] );
break;
case _VG_USERREQ__TC_PTHREAD_RWLOCK_INIT_POST:
@@ -8371,7 +8415,7 @@
clo_happens_before = 0;
else if (VG_CLO_STREQ(arg, "--happens-before=threads"))
clo_happens_before = 1;
- else if (VG_CLO_STREQ(arg, "--happens-before=condvars"))
+ else if (VG_CLO_STREQ(arg, "--happens-before=all"))
clo_happens_before = 2;
else if (VG_CLO_STREQ(arg, "--gen-vcg=no"))
@@ -8424,8 +8468,8 @@
static void tc_print_usage ( void )
{
VG_(printf)(
-" --happens-before=none|threads|condvars [condvars] consider no events,\n"
-" thread create/join, thread create/join/cvsignal/cvwait as sync points\n"
+" --happens-before=none|threads|all [all] consider no events, thread\n"
+" create/join, create/join/cvsignal/cvwait/semwait/post as sync points\n"
" --trace-addr=0xXXYYZZ show all state changes for address 0xXXYYZZ\n"
" --trace-level=0|1|2 verbosity level of --trace-addr [1]\n"
);
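The wrapper change above reports LOCK_POST only when the wait returned 0 or ETIMEDOUT, on the basis that the caller holds the mutex again in both cases. The following is a minimal standalone sketch of that POSIX behaviour, not part of the commit. The helper name timedwait_then_unlock is made up for illustration, and an error-checking mutex is assumed so that the unlock result shows whether the caller really owns the mutex after the wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: wait on a condition variable that nobody
   signals, using a deadline that has already passed, then unlock.
   The wait result goes to *wait_ret, the unlock result to *unlock_ret. */
static void timedwait_then_unlock(int *wait_ret, int *unlock_ret)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mx;
    pthread_cond_t cv;
    struct timespec deadline;
    int r;

    /* An error-checking mutex makes unlock fail with EPERM if the
       calling thread does not own the mutex. */
    r = pthread_mutexattr_init(&attr);
    assert(r == 0);
    r = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(r == 0);
    r = pthread_mutex_init(&mx, &attr);
    assert(r == 0);
    r = pthread_cond_init(&cv, NULL);
    assert(r == 0);

    /* Absolute deadline one second in the past. */
    r = clock_gettime(CLOCK_REALTIME, &deadline);
    assert(r == 0);
    deadline.tv_sec -= 1;

    r = pthread_mutex_lock(&mx);
    assert(r == 0);

    /* Expected to time out; the mutex is re-acquired before return. */
    *wait_ret = pthread_cond_timedwait(&cv, &mx, &deadline);

    /* Succeeds only if this thread holds mx again. */
    *unlock_ret = pthread_mutex_unlock(&mx);

    pthread_cond_destroy(&cv);
    pthread_mutex_destroy(&mx);
    pthread_mutexattr_destroy(&attr);
}
```

If the unlock result is 0 after a timed-out wait, the thread did own the mutex at that point, which is the case the new `ret == 0 || ret == ETIMEDOUT` test covers.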
|
|
From: <sv...@va...> - 2007-11-05 02:10:32
|
Author: sewardj
Date: 2007-11-05 02:10:33 +0000 (Mon, 05 Nov 2007)
New Revision: 7090
Log:
Last minute mods to the manual.
Modified:
branches/THRCHECK/thrcheck/docs/tc-manual.xml
Modified: branches/THRCHECK/thrcheck/docs/tc-manual.xml
===================================================================
--- branches/THRCHECK/thrcheck/docs/tc-manual.xml 2007-11-05 00:25:53 UTC (rev 7089)
+++ branches/THRCHECK/thrcheck/docs/tc-manual.xml 2007-11-05 02:10:33 UTC (rev 7090)
@@ -327,7 +327,7 @@
</itemizedlist>
<para>Understanding the memory state machine is central to
-understanding Thrcheck's race-detection algorithm. The next two
+understanding Thrcheck's race-detection algorithm. The next three
subsections explain this.</para>
</sect2>
@@ -1227,6 +1227,14 @@
and <computeroutput>http://lkml.org/lkml/2007/10/24/673</computeroutput>.
</para>
</listitem>
+ <listitem><para>Don't update the lock-order graph, and don't check
+ for errors, when a "try"-style lock operation happens (eg
+ pthread_mutex_trylock). Such calls do not add any real
+ restrictions to the locking order, since they can always fail to
+ acquire the lock, resulting in the caller going off and doing Plan
+ B (presumably it will have a Plan B). Doing such checks could
+ generate false lock-order errors and confuse users.</para>
+ </listitem>
</itemizedlist>
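The manual text added above argues that a "try"-style lock cannot impose a lock ordering, because the caller can always fall back to a Plan B. A minimal sketch of such a caller follows. It is an illustration only, not code from Thrcheck, and the function name try_increment is hypothetical.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical example: bump *counter under mx if the lock is free.
   Returns 1 if the counter was updated, 0 if the lock was busy and
   the caller should go off and do its Plan B, -1 on any other error. */
static int try_increment(pthread_mutex_t *mx, int *counter)
{
    int r = pthread_mutex_trylock(mx);
    if (r == EBUSY)
        return 0;              /* lock busy: no blocking, Plan B */
    if (r != 0)
        return -1;             /* unexpected pthread error */

    ++*counter;                /* lock held: do the protected work */

    if (pthread_mutex_unlock(mx) != 0)
        return -1;
    return 1;
}
```

Because the call never blocks, it cannot take part in a deadlock cycle by itself, which is why adding an edge to the lock-order graph for it could produce false reports.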
|
|
From: <js...@ac...> - 2007-11-05 01:17:08
|
Nightly build on g5 ( SuSE 10.1, ppc970 ) started at 2007-11-05 02:00:01 CET
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 255 tests, 11 stderr failures, 2 stdout failures, 0 post failures ==
memcheck/tests/deep_templates (stdout)
memcheck/tests/leak-cycle (stderr)
memcheck/tests/leak-tree (stderr)
memcheck/tests/pointer-trace (stderr)
massif/tests/culling1 (stderr)
massif/tests/culling2 (stderr)
massif/tests/deep-C (stderr)
massif/tests/peak2 (stderr)
massif/tests/realloc (stderr)
none/tests/faultstatus (stderr)
none/tests/fdleak_cmsg (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
|
From: <sv...@va...> - 2007-11-05 00:25:52
|
Author: sewardj
Date: 2007-11-05 00:25:53 +0000 (Mon, 05 Nov 2007)
New Revision: 7089
Log:
Finish first version of the manual. Gaaaah!
Modified:
branches/THRCHECK/thrcheck/docs/tc-manual.xml
Modified: branches/THRCHECK/thrcheck/docs/tc-manual.xml
===================================================================
--- branches/THRCHECK/thrcheck/docs/tc-manual.xml 2007-11-04 17:11:19 UTC (rev 7088)
+++ branches/THRCHECK/thrcheck/docs/tc-manual.xml 2007-11-05 00:25:53 UTC (rev 7089)
@@ -38,26 +38,35 @@
<orderedlist>
<listitem>
- <para>Section FIXME: Misuses of the POSIX pthreads API.</para>
+ <para><link linkend="tc-manual.api-checks">
+ Misuses of the POSIX pthreads API.</link></para>
</listitem>
<listitem>
- <para>Section FIXME: Potential deadlocks arising from lock ordering
- problems.</para>
+ <para><link linkend="tc-manual.lock-orders">
+ Potential deadlocks arising from lock
+ ordering problems.</link></para>
</listitem>
<listitem>
- <para>Section FIXME: Data races -- accessing memory without adequate
- locking.</para>
+ <para><link linkend="tc-manual.data-races">
+ Data races -- accessing memory without adequate locking.
+ </link></para>
</listitem>
</orderedlist>
-<para>Section FIXME contains guidance on how to get the best out of Thrcheck.
-This is really a discussion about debugging strategies and how to
-organise your program to enhance its verifiability.</para>
+<para>Following those is a section containing
+<link linkend="tc-manual.effective-use">
+hints and tips on how to get the best out of Thrcheck.</link>
+</para>
-<para>Section FIXME contains a summary of command-line options.</para>
+<para>Then there is a
+<link linkend="tc-manual.options">summary of command-line
+options.</link>
+</para>
-<para>Finally, Section FIXME contains a brief list of areas in which Thrcheck
-could be improved.</para>
+<para>Finally, there is
+<link linkend="tc-manual.todolist">a brief summary of areas in which Thrcheck
+could be improved.</link>
+</para>
</sect1>
@@ -71,7 +80,7 @@
is therefore able to report on various common problems. Although
these are unglamorous errors, their presence can lead to undefined
program behaviour and hard-to-find bugs later in execution. The
-detected errors include:</para>
+detected errors are:</para>
<itemizedlist>
<listitem><para>unlocking an invalid mutex</para></listitem>
@@ -87,11 +96,20 @@
versa</para></listitem>
<listitem><para>when a POSIX pthread function fails with an
error code that must be handled</para></listitem>
+ <listitem><para>when a thread exits whilst still holding locked
+ locks</para></listitem>
+ <listitem><para>calling <computeroutput>pthread_cond_wait</computeroutput>
+ with a not-locked mutex, or one locked by a different
+ thread</para></listitem>
</itemizedlist>
<para>Checks pertaining to the validity of mutexes are generally also
performed for reader-writer locks.</para>
+<para>Various kinds of this-can't-possibly-happen events are also
+reported. These usually indicate bugs in the system threading
+library.</para>
+
<para>Reported errors always contain a primary stack trace indicating
where the error was detected. They may also contain auxiliary stack
traces giving additional information. In particular, most errors
@@ -114,8 +132,9 @@
<para>Thrcheck has a way of summarising thread identities, as
evidenced here by the text "<computeroutput>Thread
#1</computeroutput>". This is so that it can speak about threads and
-sets of threads without overwhelming you with details. See FIXME
-below for details.</para>
+sets of threads without overwhelming you with details. See
+<link linkend="tc-manual.data-races.errmsgs">below</link>
+for more information on interpreting error messages.</para>
</sect1>
@@ -219,7 +238,7 @@
<sect2 id="tc-manual.data-races.example" xreflabel="Simple Race">
-<title>A simple data race</title>
+<title>A Simple Data Race</title>
<para>About the simplest possible example of a race is as follows. In
this program, it is impossible to know what the value
@@ -371,13 +390,20 @@
attempts to write to it, a race is then reported.</para>
<para>This technique is known as "lockset inference" and was
-introduced in FIXME. It has been widely implemented since then.
-Thrcheck incorporates several refinements aimed at reducing the false
-error rate generated by a naive version of the algorithm. In section
-FIXME a summary of the complete algorithm used by Thrcheck is
-presented. First, however, it is important to understand details of
-transitions pertaining to the Exclusive-ownership state.</para>
+introduced in: "Eraser: A Dynamic Data Race Detector for Multithreaded
+Programs" (Stefan Savage, Michael Burrows, Greg Nelson, Patrick
+Sobalvarro and Thomas Anderson, ACM Transactions on Computer Systems,
+15(4):391-411, November 1997).</para>
+<para>Lockset inference has since been widely implemented, studied and
+extended. Thrcheck incorporates several refinements aimed at avoiding
+the high false error rate that naive versions of the algorithm suffer
+from. A
+<link linkend="tc-manual.data-races.summary">summary of the complete
+algorithm used by Thrcheck</link> is presented below. First, however,
+it is important to understand details of transitions pertaining to the
+Exclusive-ownership state.</para>
+
</sect2>
@@ -449,10 +475,13 @@
pthread_create and pthread_join calls, an error is still
reported.</para>
-<para>This technique was introduced in FIXME with the name Thread
-Lifetime Segments. Thrcheck implements an extended version of it.
-Specifically, Thrcheck allows transfer of exclusive ownership in the
-following situations:</para>
+<para>This technique was introduced with the name "thread lifetime
+segments" in "Runtime Checking of Multithreaded Applications with
+Visual Threads" (Jerry J. Harrow, Jr, Proceedings of the 7th
+International SPIN Workshop on Model Checking of Software, Stanford,
+California, USA, August 2000, LNCS 1885, pp331--342). Thrcheck
+implements an extended version of it. Specifically, Thrcheck allows
+transfer of exclusive ownership in the following situations:</para>
<itemizedlist>
<listitem><para>At thread creation: a child can acquire ownership of
@@ -464,16 +493,16 @@
(thread that is exiting) at the point it exited.</para>
</listitem>
<listitem><para>At condition variable signallings and broadcasts. A
- thread Twait which completes a pthread_cond_wait call as a result of
+ thread Tw which completes a pthread_cond_wait call as a result of
a signal or broadcast on the same condition variable by some other
- thread Tsig, may acquire ownership of memory held exclusively by
- Tsig prior to the pthread_cond_signal/broadcast
+ thread Ts, may acquire ownership of memory held exclusively by
+ Ts prior to the pthread_cond_signal/broadcast
call.</para>
</listitem>
- <listitem><para>At semaphore posts (sem_post) calls. A thread Twait
+ <listitem><para>At semaphore posts (sem_post) calls. A thread Tw
which completes a sem_wait call as a result of a sem_post call
- on the same semaphore by some other thread Tpost, may acquire
- ownership of memory held exclusively by Tpost prior to the sem_post
+ on the same semaphore by some other thread Tp, may acquire
+ ownership of memory held exclusively by Tp prior to the sem_post
call.</para>
</listitem>
</itemizedlist>
@@ -809,8 +838,9 @@
<para>Also, this message implies that Thrcheck did not see any
synchronisation event between threads #4 and #5 that would have
-allowed #5 to acquire exclusive ownership from #4. See FIXME for a
-discussion of transfers of exclusive ownership states between
+allowed #5 to acquire exclusive ownership from #4. See
+<link linkend="tc-manual.data-races.exclusive">above</link>
+for a discussion of transfers of exclusive ownership states between
threads.</para>
</sect2>
@@ -936,9 +966,10 @@
<para>The root cause of this synchronisation lossage is
particularly hard to understand, so an example is helpful. It was
- discussed at length by Arndt Muehlenfeldt [FIXME]. The canonical
- POSIX-recommended usage scheme for condition variables is as
- follows:</para>
+ discussed at length by Arndt Muehlenfeld ("Runtime Race Detection
+ in Multi-Threaded Programs", Dissertation, TU Graz, Austria). The
+ canonical POSIX-recommended usage scheme for condition variables
+ is as follows:</para>
<programlisting><![CDATA[
b is a Boolean condition, which is False most of the time
@@ -981,9 +1012,9 @@
<listitem>
<para>Make sure you are using a supported Linux distribution. At
present, Thrcheck only properly supports x86-linux and amd64-linux
- with glibc-2.3 or later. The latter restriction really says that
- we only support the NPTL threading library. The old LinuxThreads
- library is not supported.</para>
+ with glibc-2.3 or later. The latter restriction means we only
+ support the NPTL threading library. The old LinuxThreads library
+ is not supported.</para>
<para>Unsupported targets may work to varying degrees. In
particular ppc32-linux and ppc64-linux running NPTL should work,
@@ -992,6 +1023,19 @@
the lwarx/stwcx instructions.</para>
</listitem>
+ <listitem>
+ <para>POSIX requires that implementations of standard I/O (printf,
+ fprintf, fwrite, fread, etc) are thread safe. Unfortunately GNU
+ libc implements this by using internal locking primitives that
+ Thrcheck is unable to intercept. Consequently Thrcheck generates
+ many false race reports when you use these functions.</para>
+
+ <para>Thrcheck attempts to hide these errors using the standard
+ Valgrind error-suppression mechanism. So, at least for simple
+ test cases, you don't see any. Nevertheless, some may slip
+ through. Just something to be aware of.</para>
+ </listitem>
+
</orderedlist>
</sect1>
@@ -1002,48 +1046,39 @@
<sect1 id="tc-manual.options" xreflabel="Thrcheck Options">
<title>Thrcheck Options</title>
-<para>Currently there is only one Thrcheck-specific option:</para>
+<para>The following end-user options are available:</para>
<!-- start of xi:include in the manpage -->
<variablelist id="tc.opts.list">
<varlistentry id="opt.happens-before" xreflabel="--happens-before">
<term>
- <option><![CDATA[--happens-before=none|threads|condvars
- [default: condvars] ]]></option>
+ <option><![CDATA[--happens-before=none|threads|all
+ [default: all] ]]></option>
</term>
<listitem>
- <para>This option is mostly useful for debugging Thrcheck
- itself. It isn't much use to end users and is a bit difficult
- to explain.
- </para>
<para>Thrcheck always regards locks as the basis for
inter-thread synchronisation. However, by default, before
reporting a race error, Thrcheck will also check whether
certain other kinds of inter-thread synchronisation events
happened. It may be that if such events took place, then no
race really occurred, and so no error needs to be reported.
- This enables Thrcheck to correctly handle the
- worker-thread and worker-thread-pool idioms.
+ See <link linkend="tc-manual.data-races.exclusive">above</link>
+ for a discussion of transfers of exclusive ownership states
+ between threads.
</para>
- <para>With <varname>--happens-before=condvars</varname>, both
- thread creation/joinage, and condition variable
- signal/broadcast/waits are regarded as sources of
- synchronisation, and so both the worker-thread and
- worker-thread-pool idioms are correctly handled. "Correctly
- handled" means that Thrcheck will not falsely report race
- errors for correct uses of these idioms.
+ <para>With <varname>--happens-before=all</varname>, the
+ following events are regarded as sources of synchronisation:
+ thread creation/joinage, condition variable
+ signal/broadcast/waits, and semaphore posts/waits.
</para>
<para>With <varname>--happens-before=threads</varname>, only
thread creation/joinage events are regarded as sources of
- synchronisation, and so only the worker-thread idiom is
- correctly handled. The worker-thread-pool is not correctly
- handled.
+ synchronisation.
</para>
<para>With <varname>--happens-before=none</varname>, no events
(apart, of course, from locking) are regarded as sources of
- synchronisation. And so neither the worker-thread nor
- worker-thread-pool idioms are correctly handled.
+ synchronisation.
</para>
<para>Changing this setting from the default will increase your
false-error rate but give little or no gain. The only advantage
@@ -1055,34 +1090,146 @@
</listitem>
</varlistentry>
+ <varlistentry id="opt.trace-addr" xreflabel="--trace-addr">
+ <term>
+ <option><![CDATA[--trace-addr=0xXXYYZZ
+ ]]></option> and
+ <option><![CDATA[--trace-level=0|1|2 [default: 1]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>Requests that Thrcheck produces a log of all state changes
+ to location 0xXXYYZZ. This can be helpful in tracking down
+ tricky races. <varname>--trace-level</varname> controls the
+ verbosity of the log. At the default setting (1), a one-line
+ summary is printed for each state change. At level 2 a
+ complete stack trace is printed for each state change.</para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
<!-- end of xi:include in the manpage -->
-</sect1>
+<!-- start of xi:include in the manpage -->
+<para>In addition, the following debugging options are available for
+Thrcheck:</para>
+<variablelist id="tc.debugopts.list">
-<sect1 id="tc-manual.otherstuff" xreflabel="Other Stuff">
-<title>Other Stuff</title>
+ <varlistentry id="opt.trace-malloc" xreflabel="--trace-malloc">
+ <term>
+ <option><![CDATA[--trace-malloc=no|yes [no]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>Show all client malloc (etc) and free (etc) requests.</para>
+ </listitem>
+ </varlistentry>
-<para>FIXME: this section will contain other stuff that it is
-important to document:</para>
+ <varlistentry id="opt.gen-vcg" xreflabel="--gen-vcg">
+ <term>
+ <option><![CDATA[--gen-vcg=no|yes|yes-w-vts [no]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>At exit, write to stderr a dump of the happens-before
+ graph computed by Thrcheck, in a format suitable for the VCG
+ graph visualisation tool. A suitable command line is:</para>
+ <para><computeroutput>valgrind --tool=thrcheck
+ --gen-vcg=yes my_app 2>&1
+ | grep xxxxxx | sed "s/xxxxxx//g"
+ | xvcg -</computeroutput></para>
+ <para>With <varname>--gen-vcg=yes</varname>, the basic
+ happens-before graph is shown. With
+ <varname>--gen-vcg=yes-w-vts</varname>, the vector timestamp
+ for each node is also shown.</para>
+ </listitem>
+ </varlistentry>
-<itemizedlist>
- <listitem><para>LOCK prefixes on x86/amd64
- instructions</para></listitem>
- <listitem><para>Reader-writer locks, and semaphores?
- </para></listitem>
- <listitem><para>Other stuff I forgot?
- </para></listitem>
-</itemizedlist>
+ <varlistentry id="opt.cmp-race-err-addrs"
+ xreflabel="--cmp-race-err-addrs">
+ <term>
+ <option><![CDATA[--cmp-race-err-addrs=no|yes [no]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>Controls whether or not race (data) addresses should be
+ taken into account when removing duplicates of race errors.
+ With <varname>--cmp-race-err-addrs=no</varname>, two otherwise
+ identical race errors will be considered to be the same even if
+ their race addresses differ. With
+ <varname>--cmp-race-err-addrs=yes</varname> they will be
+ considered different. This is provided to help make certain
+ regression tests work reliably.</para>
+ </listitem>
+ </varlistentry>
-Inconsistent cv/mx associations
-Thread exiting whilst holding locks
+ <varlistentry id="opt.tc-sanity-flags" xreflabel="--tc-sanity-flags">
+ <term>
+ <option><![CDATA[--tc-sanity-flags=<XXXXX> (X = 0|1) [00000]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>Run extensive sanity checks on Thrcheck's internal
+ data structures at events defined by the bitstring, as
+ follows:</para>
+ <para><computeroutput>10000 </computeroutput>after changes to
+ the lock order acquisition graph</para>
+ <para><computeroutput>01000 </computeroutput>after every client
+ memory access (NB: not currently used)</para>
+ <para><computeroutput>00100 </computeroutput>after every client
+ memory range permission setting of 256 bytes or greater</para>
+ <para><computeroutput>00010 </computeroutput>after every client
+ lock or unlock event</para>
+ <para><computeroutput>00001 </computeroutput>after every client
+ thread creation or joinage event</para>
+ <para>Note these will make Thrcheck run very slowly, often to
+ the point of being completely unusable.</para>
+ </listitem>
+ </varlistentry>
-waiting for a condition variable without holding
- the associated mutex
+</variablelist>
+<!-- end of xi:include in the manpage -->
-better printing of lock cycles
+
</sect1>
+<sect1 id="tc-manual.todolist" xreflabel="To Do List">
+<title>A To-Do List for Thrcheck</title>
+
+<para>The following is a list of loose ends which should be tidied up
+some time.</para>
+
+<itemizedlist>
+ <listitem><para>Track which mutexes are associated with which
+ condition variables, and emit a warning if this becomes
+ inconsistent.</para>
+ </listitem>
+ <listitem><para>For lock order errors, print the complete lock
+ cycle, rather than only doing so for size-2 cycles as at
+ present.</para>
+ </listitem>
+ <listitem><para>Document the VALGRIND_HG_CLEAN_MEMORY client
+ request.</para>
+ </listitem>
+ <listitem><para>Possibly a client request to forcibly transfer
+ ownership of memory from one thread to another. Requires further
+ consideration.</para>
+ </listitem>
+ <listitem><para>Add a new client request that marks an address range
+ as being "shared-modified with empty lockset" (the error state),
+ and describe how to use it.</para>
+ </listitem>
+ <listitem><para>Document races caused by gcc's thread-unsafe code
+ generation for speculative stores. In the interim see
+ <computeroutput>http://gcc.gnu.org/ml/gcc/2007-10/msg00266.html
+ </computeroutput>
+ and <computeroutput>http://lkml.org/lkml/2007/10/24/673</computeroutput>.
+ </para>
+ </listitem>
+
+</itemizedlist>
+
+</sect1>
+
</chapter>
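The manual text in this revision names "lockset inference" (from the Eraser paper) as the basis of the race detector and notes that naive versions suffer a high false error rate. As a rough illustration of the naive form only, here is a small sketch. It is not Thrcheck's implementation: the LockSet bitmask, the Location struct and record_access are all invented for this example, and thread identity and ownership states are deliberately ignored.

```c
#include <stdint.h>

/* Hypothetical representation: bit i set means "lock i is held". */
typedef uint32_t LockSet;

#define LOCKSET_ALL ((LockSet)0xFFFFFFFFu)

/* Per-location state for the naive algorithm: the set of locks that
   have been held on every access seen so far. */
typedef struct {
    LockSet candidates;
} Location;

/* Start with every lock as a candidate. */
static void location_init(Location *loc)
{
    loc->candidates = LOCKSET_ALL;
}

/* Record an access made while 'held' locks were held. Intersects the
   candidate set with 'held'. Returns 1 if no common lock remains,
   i.e. the naive algorithm would report a possible race, else 0. */
static int record_access(Location *loc, LockSet held)
{
    loc->candidates &= held;
    return loc->candidates == 0;
}
```

Since this version reports an error as soon as the candidate set becomes empty, even for memory only ever touched by one thread, it shows why refinements such as the Exclusive-ownership state are needed in practice.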
|
|
From: Julian S. <js...@ac...> - 2007-11-04 22:29:29
|
On Sunday 04 November 2007 22:45, Nicholas Nethercote wrote:
> On Sun, 4 Nov 2007, Julian Seward wrote:
> >> Tom, can you send me diffs for some of the Massif ones? Any idea why so
> >> many tests fail on this machine?
> >
> > I have a vanilla Red Hat 7.3 installation on QEMU. Attached are the
> > results from that.
>
> The single cause of all of these appears to be that each program is doing a
> heap allocation of size 24 bytes very early on, ie. before any of the
> allocations in the main part of the C program are done. We have to
> understand why that's happening before we can fix these.

Perhaps glibc, or ld.so and its myriad function calls, etc, are doing an
allocation before we get to main. Do your tests assume that there are
zero allocations prior to main being called? If so, how about having
a massif-specific client request which you call at the start of main,
to say "forget everything and start with a clean slate" ?

> It may explain
> why so many Memcheck results are also going wrong for this machine.

Ah, good point.

J
|
From: Julian S. <js...@ac...> - 2007-11-04 22:24:24
|
> But the result for deep-D looks strange. Can you send me the massif.out
> file for it? Thanks.

Attached.

J
|
From: Nicholas N. <nj...@cs...> - 2007-11-04 21:45:58
|
On Sun, 4 Nov 2007, Julian Seward wrote:
>> Tom, can you send me diffs for some of the Massif ones? Any idea why so
>> many tests fail on this machine?
>
> I have a vanilla Red Hat 7.3 installation on QEMU. Attached are the
> results from that.

The single cause of all of these appears to be that each program is doing a
heap allocation of size 24 bytes very early on, ie. before any of the
allocations in the main part of the C program are done. We have to
understand why that's happening before we can fix these. It may explain
why so many Memcheck results are also going wrong for this machine.

N
|
From: Nicholas N. <nj...@cs...> - 2007-11-04 21:42:33
|
On Sun, 4 Nov 2007, Julian Seward wrote:
>>> massif/tests/culling1 (stderr)
>>> massif/tests/culling2 (stderr)
>>> massif/tests/deep-C (stderr)
>>> massif/tests/peak2 (stderr)
>>> massif/tests/realloc (stderr)
>
> Hi. I don't have reliable access to the 970, but in fact I get these
> + 1 more failing when run on our 32-bit ppc box. So I attach those
> instead.

Thanks. These differences mostly don't look too bad. I think that the
parts of the stack trace below "main" go a little deeper than on my
machine, so when I print detailed stats about how many XPts are created,
there are differences. I guess I'll have to filter those out.

But the result for deep-D looks strange. Can you send me the massif.out
file for it? Thanks.

N
|
From: <sv...@va...> - 2007-11-04 17:11:19
|
Author: sewardj
Date: 2007-11-04 17:11:19 +0000 (Sun, 04 Nov 2007)
New Revision: 7088
Log:
Loads more stuff.
Modified:
branches/THRCHECK/thrcheck/docs/tc-manual.xml
Modified: branches/THRCHECK/thrcheck/docs/tc-manual.xml
===================================================================
--- branches/THRCHECK/thrcheck/docs/tc-manual.xml 2007-11-04 01:51:04 UTC (rev 7087)
+++ branches/THRCHECK/thrcheck/docs/tc-manual.xml 2007-11-04 17:11:19 UTC (rev 7088)
@@ -478,12 +478,12 @@
</listitem>
</itemizedlist>
+</sect2>
-</sect2>
<sect2 id="tc-manual.data-races.re-excl" xreflabel="Re-Excl Transfers">
-<title>Reacquisition of Exclusive States</title>
+<title>Restoration of Exclusive Ownership</title>
<para>Another common idiom is to partition the lifetime of the program
as a whole into several distinct phases. In some of those phases, a
@@ -539,35 +539,34 @@
thread via a cascade of pthread_join calls, any memory shared by the
group (or a subset of it) ends up being owned exclusively by the sole
surviving thread. This significantly enhances Thrcheck's flexibility,
-since it means that memory can transition arbitrarily many times
-between exclusive and shared states over the lifetime of the program.
-Moreover, locations may be protected by different locks during
-different phases of shared ownership.</para>
+since it means that each memory location may make arbitrarily many
+transitions between exclusive and shared ownership. Furthermore, a
+different lock may protect the location during each period of shared
+ownership.</para>
+</sect2>
+<sect2 id="tc-manual.data-races.summary" xreflabel="Race Det Summary">
+<title>A Summary of the Race Detection Algorithm</title>
-</sect2>
+<para>Thrcheck looks for memory locations which are accessed by more
+than one thread. For each such location, Thrcheck records which of
+the program's locks were held by the accessing thread at the time of
+each access. The hope is to discover that there is indeed at least
+one lock which is consistently used by all threads to protect that
+location. If no such lock can be found, then there is apparently no
+consistent locking strategy being applied for that location, and so a
+possible data race might result. Thrcheck accordingly reports an
+error.</para>
-<para>-------------------------------------------------</para>
+<para>In practice this discipline is far too simplistic, and is
+unusable since it reports many races in some widely used and
+known-correct programming disciplines. Thrcheck's checking therefore
+incorporates many refinements to this basic idea, and can be
+summarised as follows:</para>
-<para>In short, what Thrcheck does is to look for memory locations
-which are accessed by more than one thread. For each such location,
-Thrcheck records which of the program's (pthread_mutex_)locks were
-held by the accessing thread at the time of each access. The hope is
-to discover that there is indeed at least one lock which is
-consistently used by all threads to protect that location. If no such
-lock can be found, then there is apparently no consistent locking
-strategy being applied for that location, and so a possible data race
-might result.</para>
-
-<para>In practice this discipline is far too simplistic,
-and is unusable since it reports many races in some widely used
-and known-correct programming disciplines. Thrcheck's checking
-therefore incorporates many refinements to this basic idea, and
-can be summarised as follows:</para>
-
<para>The following thread events are intercepted and monitored:</para>
<itemizedlist>
@@ -576,11 +575,14 @@
</listitem>
<listitem>
<para>lock acquisition and release (pthread_mutex_lock,
- pthread_mutex_unlock, and variants)</para>
+ pthread_mutex_unlock, pthread_rwlock_rdlock,
+ pthread_rwlock_wrlock,
+ pthread_rwlock_unlock)</para>
</listitem>
<listitem>
<para>inter-thread event notifications (pthread_cond_wait,
- pthread_cond_signal, pthread_cond_broadcast)</para>
+ pthread_cond_signal, pthread_cond_broadcast,
+ sem_wait, sem_post)</para>
</listitem>
</itemizedlist>
@@ -600,7 +602,7 @@
<para>By observing the above events, Thrcheck can infer certain
aspects of the program's locking discipline. Programs which adhere to
-the are considered to be acceptable:
+the following rules are considered to be acceptable:
</para>
<itemizedlist>
@@ -633,61 +635,365 @@
<itemizedlist>
<listitem>
<para>A thread Y can acquire exclusive ownership of memory
- previously owned exclusively by a different thread X providing the
+ previously owned exclusively by a different thread X providing
X's last access and Y's first access are separated by one of the
- following synchronization events: X creates thread Y, or X uses a
- condition-variable to signal at Y, and Y is waiting for that event.
- </para>
+ following synchronization events:</para>
+ <itemizedlist>
+ <listitem><para>X creates thread Y</para></listitem>
+ <listitem><para>X joins back to Y</para></listitem>
+ <listitem><para>X uses a condition-variable to signal at Y, and Y is
+ waiting for that event</para></listitem>
+ <listitem><para>Y completes a semaphore wait as a result of X signalling
+ on that same semaphore</para></listitem>
+ </itemizedlist>
<para>
This refinement allows Thrcheck to correctly track the ownership
state of inter-thread buffers used in the worker-thread and
- worker-thread-pool concurrent programming idioms (styles).
-</para>
+ worker-thread-pool concurrent programming idioms (styles).</para>
</listitem>
<listitem>
- <para>Similarly, if Y later joins back to X, memory exclusively
- owned by Y becomes exclusively owned by X instead. Also, memory
- that has been shared only by X and Y becomes exclusively owned by X.
- More generally, memory that has been shared by X, Y and some
- arbitrary other set S of threads is re-marked as shared by X and S.
- Hence, under the right circumstances, memory shared amongst multiple
- threads, all of which join into just one, can revert to the
- exclusive ownership state.</para>
+ <para>Similarly, if thread Y joins back to thread X, memory
+ exclusively owned by Y becomes exclusively owned by X instead.
+ Also, memory that has been shared only by X and Y becomes
+ exclusively owned by X. More generally, memory that has been shared
+ by X, Y and some arbitrary other set S of threads is re-marked as
+ shared by X and S. Hence, under the right circumstances, memory
+ shared amongst multiple threads, all of which join into just one,
+ can revert to the exclusive ownership state.</para>
<para>
In effect, each memory location may make arbitrarily many
transitions between exclusive and shared ownership. Furthermore, a
different lock may protect the location during each period of shared
ownership. This significantly enhances the flexibility of the
- algorithm.
- </para>
+ algorithm.</para>
</listitem>
</itemizedlist>
<para>The ownership state, accessing thread-set and related lock-set
-for each memory location are tracked at 32-bit granularity. This keeps
-the memory overhead tolerable, but it means the algorithm is imprecise
-for 16- and 8-bit memory accesses. Future work may lead to an
-implementation capable of tracking memory at 8-bit granularity
-without excessive space and time overheads.</para>
+for each memory location are tracked at 8-bit granularity. This means
+the algorithm is precise even for 16- and 8-bit memory
+accesses.</para>
-</sect1>
+<para>Thrcheck correctly handles reader-writer locks in this
+framework. Locations shared between multiple threads can be protected
+during reads by locks held in either read-mode or write-mode, but can
+only be protected during writes by locks held in write-mode. Normal
+POSIX mutexes are treated as if they are reader-writer locks which are
+only ever held in write-mode.</para>
+<para>Thrcheck correctly handles POSIX mutexes for which recursive
+locking is allowed.</para>
+<para>Thrcheck partially correctly handles x86 and amd64 memory access
+instructions preceded by a LOCK prefix. Writes are correctly handled,
+by pretending that the LOCK prefix implies acquisition and release of
+a magic "bus hardware lock" mutex before and after the instruction.
+This unfortunately requires subsequent reads from such locations to
+also use a LOCK prefix, which is not required by the real hardware.
+Thrcheck does not offer any equivalent handling for atomic sequences
+on PowerPC/POWER platforms created by the use of lwarx/stwcx
+instructions.</para>
+</sect2>
+
+
+
+<sect2 id="tc-manual.data-races.errmsgs" xreflabel="Race Error Messages">
+<title>Interpreting Race Error Messages</title>
+
+<para>Thrcheck's race detection algorithm collects a lot of
+information, and tries to present it in a helpful way when a race is
+detected. Here's an example:</para>
+
+<programlisting><![CDATA[
+Thread #2 was created
+ at 0x510548E: clone (in /lib64/libc-2.5.so)
+ by 0x4E2F305: do_clone (in /lib64/libpthread-2.5.so)
+ by 0x4E2F7C5: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.5.so)
+ by 0x4C23870: pthread_create@* (tc_intercepts.c:198)
+ by 0x400CEF: main (tc17_sembar.c:195)
+
+// And the same for threads #3, #4 and #5 -- omitted for conciseness
+
+Possible data race during read of size 4 at 0x602174
+ at 0x400BE5: gomp_barrier_wait (tc17_sembar.c:122)
+ by 0x400C44: child (tc17_sembar.c:161)
+ by 0x4C25DF7: mythread_wrapper (tc_intercepts.c:178)
+ by 0x4E2F09D: start_thread (in /lib64/libpthread-2.5.so)
+ by 0x51054CC: clone (in /lib64/libc-2.5.so)
+ Old state: shared-modified by threads #2, #3, #4, #5
+ New state: shared-modified by threads #2, #3, #4, #5
+ Reason: this thread, #2, holds no consistent locks
+ Last consistently used lock for 0x602174 was first observed
+ at 0x4C25D01: pthread_mutex_init (tc_intercepts.c:326)
+ by 0x4009E4: gomp_barrier_init (tc17_sembar.c:46)
+ by 0x400CBC: main (tc17_sembar.c:192)
+]]></programlisting>
+
+<para>Thrcheck first announces the creation points of any threads
+referenced in the error message. This is so it can speak concisely
+about threads and sets of threads without repeatedly printing their
+creation point call stacks. Each thread is only ever announced once,
+the first time it appears in any Thrcheck error message.</para>
+
+<para>The main error message begins at the text
+"<computeroutput>Possible data race during read</computeroutput>".
+At the start is information you would expect to see -- address and
+size of the racing access, whether a read or a write, and the call
+stack at the point it was detected.</para>
+
+<para>More interesting is the state transition caused by this access.
+This memory is already in the shared-modified state, and up to now has
+been consistently protected by at least one lock. However, the thread
+making the access in question (thread #2, here) does not hold any
+locks in common with those held during all previous accesses to the
+location -- "no consistent locks", in other words.</para>
+
+<para>Finally, Thrcheck shows the lock which has protected this
+location in all previous accesses. (If there is more than one, only
+one is shown.) This can be a useful hint, because it typically shows
+the lock that the programmers intended to use to protect the location,
+but in this case forgot.</para>
+
+<para>Here are some more examples of race reports. This is not an
+exhaustive list of combinations, but should give you some insight into
+how to interpret the output.</para>
+
+<programlisting><![CDATA[
+Possible data race during write ...
+ Old state: shared-readonly by threads #1, #2, #3
+ New state: shared-modified by threads #1, #2, #3
+ Reason: this thread, #3, holds no consistent locks
+ Location ... has never been protected by any lock
+]]></programlisting>
+
+<para>The location is shared by 3 threads, all of which have been
+reading it without locking ("has never been protected by any lock").
+Now one of them is writing it. Regardless of whether the writer has a
+lock or not, this is still an error, because the write races against
+the previously observed reads.</para>
+
+<programlisting><![CDATA[
+Possible data race during read ...
+ Old state: shared-modified by threads #1, #2, #3
+ New state: shared-modified by threads #1, #2, #3
+ Reason: this thread, #3, holds no consistent locks
+ Last consistently used lock for ... was first observed ...
+]]></programlisting>
+
+<para>The location is shared by 3 threads, all of which have been
+reading and writing it while (as required) holding at least one lock
+in common. Now it is being read without that lock being held. In the
+"Last consistently used lock" part, Thrcheck offers its best guess as
+to the identity of the lock that should have been used.</para>
+
+<programlisting><![CDATA[
+Possible data race during write ...
+ Old state: owned exclusively by thread #4
+ New state: shared-modified by threads #4, #5
+ Reason: this thread, #5, holds no locks at all
+]]></programlisting>
+
+<para>A location that has so far been accessed exclusively by thread
+#4 has now been written by thread #5, without use of any lock. This
+can be a sign that the programmer did not consider the possibility of
+the location being shared between threads, or, alternatively, forgot
+to use the appropriate lock.</para>
+
+<para>Note that thread #4 exclusively owns the location, and so has
+the right to access it without holding a lock. However, this message
+does not say that thread #4 is not using a lock for this location.
+Indeed, it could be using a lock for the location because it intends
+to make it available to other threads, one of which is thread #5 --
+and thread #5 has forgotten to use the lock.</para>
+
+<para>Also, this message implies that Thrcheck did not see any
+synchronisation event between threads #4 and #5 that would have
+allowed #5 to acquire exclusive ownership from #4. See FIXME for a
+discussion of transfers of exclusive ownership states between
+threads.</para>
+
+</sect2>
+
+
+</sect1>
+
<sect1 id="tc-manual.effective-use" xreflabel="Thrcheck Effective Use">
<title>Hints and Tips for Effective Use of Thrcheck</title>
-
<para>Thrcheck can be very helpful in finding and resolving
threading-related problems. Like all sophisticated tools, it is most
-effective when you have some level of understanding of what the tool
-is doing. Thrcheck will be less effective when you merely throw an
+effective when you understand how to play to its strengths.</para>
+
+<para>Thrcheck will be less effective when you merely throw an
existing threaded program at it and try to make sense of any reported
errors. It will be more effective if you design threaded programs
from the start in a way that helps Thrcheck verify correctness. The
same is true for finding memory errors with Memcheck, but applies more
-here, because thread checking is a harder problem.</para>
+here, because thread checking is a harder problem. Consequently it is
+much easier to write a correct program for which Thrcheck falsely
+reports (threading) errors than it is to write a correct program for
+which Memcheck falsely reports (memory) errors.</para>
+<para>With that in mind, here are some tips, listed most important first,
+for getting reliable results and avoiding false errors. The first two
+are critical. Any violations of them will swamp you with huge numbers
+of false data-race errors.</para>
+
+
+<orderedlist>
+
+ <listitem>
+ <para>Make sure your application, and all the libraries it uses,
+ use the POSIX threading primitives. Thrcheck needs to be able to
+ see all events pertaining to thread creation, exit, locking and
+ other synchronisation events. To do so it intercepts many POSIX
+ pthread_ functions.</para>
+
+ <para>Do not roll your own threading primitives (mutexes, etc)
+ from combinations of the Linux futex syscall, counters and wotnot.
+ These throw Thrcheck's internal what's-going-on models way off
+ course and will give bogus results.</para>
+
+ <para>Also, do not reimplement existing POSIX abstractions using
+ other POSIX abstractions. For example, don't build your own
+ semaphore routines or reader-writer locks from POSIX mutexes and
+ condition variables. Instead use POSIX reader-writer locks and
+ semaphores directly, since Thrcheck supports them directly.</para>
+
+ <para>Thrcheck directly supports the following POSIX threading
+ abstractions: mutexes, reader-writer locks, condition variables
+ (but see below), and semaphores. Currently spinlocks and barriers
+ are not supported, although they could be in future. See below
+ for a "safe" alternative implementation of barriers.</para>
+
+ <para>At the time of writing, the following popular Linux packages
+ are known to implement their own threading primitives:</para>
+
+ <itemizedlist>
+ <listitem><para>Qt version 4.X. Qt 3.X is fine, but not 4.X.
+ Thrcheck contains partial direct support for Qt 4.X threading,
+ but this is not yet in a usable state. Assistance from folks
+ knowledgeable in Qt 4 threading internals would be
+ appreciated.</para></listitem>
+
+ <listitem><para>Runtime support library for GNU OpenMP (part of
+ GCC), at least GCC versions 4.2 and 4.3. With some minor effort
+ of modifying the GNU OpenMP runtime support sources, it is
+ possible to use Thrcheck on GNU OpenMP compiled codes. Please
+ contact the Valgrind authors for details.</para></listitem>
+ </itemizedlist>
+ </listitem>
+
+ <listitem>
+ <para>Avoid memory recycling. If you can't avoid it, you must
+ tell Thrcheck what is going on via the VALGRIND_HG_CLEAN_MEMORY
+ client request
+ (in <computeroutput>thrcheck.h</computeroutput>).</para>
+
+ <para>Thrcheck is aware of standard memory allocation and
+ deallocation that occurs via malloc/free/new/delete and from entry
+ and exit of stack frames. In particular, when memory is
+ deallocated via free, delete, or function exit, Thrcheck considers
+ that memory clean, so when it is eventually reallocated, its
+ history is irrelevant.</para>
+
+ <para>However, it is common practice to implement memory recycling
+ schemes. In these, memory to be freed is not handed to
+ free/delete, but instead put into a pool of free buffers to be
+ handed out again as required. The problem is that Thrcheck has no
+ way to know that such memory is logically no longer in use, and
+ its history is irrelevant. Hence you must make that explicit,
+ using the VALGRIND_HG_CLEAN_MEMORY client request to specify the
+ relevant address ranges. It's easiest to put these requests into
+ the pool manager code, and use them either when memory is returned
+ to the pool, or is allocated from it.</para>
+ </listitem>
+
+ <listitem>
+ <para>Avoid POSIX condition variables. If you can, use POSIX
+ semaphores (sem_t, sem_post, sem_wait) to do inter-thread event
+ signalling. Semaphores with an initial value of zero are
+ particularly useful for this.</para>
+
+ <para>Thrcheck only partially correctly handles POSIX condition
+ variables. This is because Thrcheck can see inter-thread
+ dependencies between a pthread_cond_wait call and a
+ pthread_cond_signal/broadcast call only if the waiting thread
+ actually gets to the rendezvous first (so that it actually calls
+ pthread_cond_wait). It can't see dependencies between the threads
+ if the signaller arrives first. In the latter case, POSIX
+ guidelines imply that the associated boolean condition still
+ provides an inter-thread synchronisation event, but one which is
+ invisible to Thrcheck.</para>
+
+ <para>The result of Thrcheck missing some inter-thread
+ synchronisation events is to cause it to report false positives.
+ That's because missing such events reduces the extent to which it
+ can transfer exclusive memory ownership between threads. So
+ memory may end up in a shared-modified state when that was not
+ intended by the application programmers.</para>
+
+ <para>The root cause of this synchronisation lossage is
+ particularly hard to understand, so an example is helpful. It was
+ discussed at length by Arndt Muehlenfeldt [FIXME]. The canonical
+ POSIX-recommended usage scheme for condition variables is as
+ follows:</para>
+
+<programlisting><![CDATA[
+b is a Boolean condition, which is False most of the time
+cv is a condition variable
+mx is its associated mutex
+
+Signaller: Waiter:
+
+lock(mx) lock(mx)
+b = True while (b == False)
+signal(cv) wait(cv,mx)
+unlock(mx) unlock(mx)
+]]></programlisting>
+
+ <para>Assume <computeroutput>b</computeroutput> is False most of
+ the time. If the waiter arrives at the rendezvous first, it
+ enters its while-loop, waits for the signaller to signal, and
+ eventually proceeds. Thrcheck sees the signal, notes the
+ dependency, and all is well.</para>
+
+ <para>If the signaller arrives
+ first, <computeroutput>b</computeroutput> is set to true, and the
+ signal disappears into nowhere. When the waiter later arrives, it
+ does not enter its while-loop and simply carries on. But even in
+ this case, the waiter code following the while-loop cannot execute
+ until the signaller sets <computeroutput>b</computeroutput> to
+ True. Hence there is still the same inter-thread dependency, but
+ this time it is through an arbitrary in-memory condition, and
+ Thrcheck cannot see it.</para>
+
+ <para>By comparison, Thrcheck's detection of inter-thread
+ dependencies caused by semaphore operations is believed to be
+ exactly correct.</para>
+
+ <para>As far as I know, a solution to this problem that does not
+ require source-level annotation of condition-variable wait loops
+ is beyond the current state of the art.</para>
+ </listitem>
+
+ <listitem>
+ <para>Make sure you are using a supported Linux distribution. At
+ present, Thrcheck only properly supports x86-linux and amd64-linux
+ with glibc-2.3 or later. The latter restriction really says that
+ we only support the NPTL threading library. The old LinuxThreads
+ library is not supported.</para>
+
+ <para>Unsupported targets may work to varying degrees. In
+ particular ppc32-linux and ppc64-linux running NPTL should work,
+ but you will get false race errors because Thrcheck does not know
+ how to properly handle atomic instruction sequences created using
+ the lwarx/stwcx instructions.</para>
+ </listitem>
+
+</orderedlist>
+
</sect1>
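
Editorial illustration, not part of the commit: the tip above about preferring semaphores to condition variables can be sketched in C roughly as follows. This is a minimal sketch, assuming a one-shot hand-off of a single buffer. The names handoff_t, producer and run_handoff are hypothetical and do not appear in the Thrcheck sources. The semaphore starts at zero, so the sem_post/sem_wait pair is the only thing ordering the producer's write before the consumer's read, and that pair is visible to the tool whichever thread reaches the rendezvous first.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical names throughout: handoff_t, producer, run_handoff. */
typedef struct {
    sem_t ready;    /* initial value 0: nothing has been handed over yet */
    int   payload;  /* written by the producer, then read by the consumer */
} handoff_t;

static void *producer(void *arg)
{
    handoff_t *h = arg;
    h->payload = 42;      /* last access by the producing thread */
    sem_post(&h->ready);  /* synchronisation event seen by the tool */
    return NULL;
}

/* Creates a producer thread, waits for its hand-off, returns the payload. */
int run_handoff(void)
{
    handoff_t h;
    pthread_t tid;
    int rc, value;

    rc = sem_init(&h.ready, 0, 0);  /* process-private, initial value zero */
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, producer, &h);
    assert(rc == 0);

    /* Completes only as a result of the producer's sem_post; retry if the
       wait is interrupted by a signal. */
    while (sem_wait(&h.ready) != 0) {
    }
    value = h.payload;    /* first access by the consuming thread */

    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    sem_destroy(&h.ready);
    return value;
}
```

Calling run_handoff() from main and linking with -pthread is enough to try it. Unlike the lock/while/wait scheme shown in the patch, there is no separate Boolean condition through which the dependency could pass unseen.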
|
|
From: Nicholas N. <nj...@cs...> - 2007-11-04 05:25:13
|
On Sun, 4 Nov 2007, Tom Hughes wrote: > Nightly build on alvis ( i686, Red Hat 7.3 ) started at 2007-11-04 03:15:02 GMT > Results differ from 24 hours ago > > Checking out valgrind source tree ... done > Configuring valgrind ... done > Building valgrind ... done > Running regression tests ... failed > > Regression test results follow > > == 287 tests, 33 stderr failures, 1 stdout failure, 27 post failures == > memcheck/tests/addressable (stderr) > memcheck/tests/badjump (stderr) > memcheck/tests/describe-block (stderr) > memcheck/tests/erringfds (stderr) > memcheck/tests/leak-0 (stderr) > memcheck/tests/leak-cycle (stderr) > memcheck/tests/leak-pool-0 (stderr) > memcheck/tests/leak-pool-1 (stderr) > memcheck/tests/leak-pool-2 (stderr) > memcheck/tests/leak-pool-3 (stderr) > memcheck/tests/leak-pool-4 (stderr) > memcheck/tests/leak-pool-5 (stderr) > memcheck/tests/leak-regroot (stderr) > memcheck/tests/leak-tree (stderr) > memcheck/tests/long_namespace_xml (stderr) > memcheck/tests/match-overrun (stderr) > memcheck/tests/partial_load_dflt (stderr) > memcheck/tests/partial_load_ok (stderr) > memcheck/tests/partiallydefinedeq (stderr) > memcheck/tests/pointer-trace (stderr) > memcheck/tests/sigkill (stderr) > memcheck/tests/stack_changes (stderr) > memcheck/tests/x86/scalar (stderr) > memcheck/tests/x86/scalar_supp (stderr) > memcheck/tests/x86/xor-undef-x86 (stderr) > memcheck/tests/xml1 (stderr) > massif/tests/alloc-fns-A (post) > massif/tests/alloc-fns-B (post) > massif/tests/basic (post) > massif/tests/big-alloc (post) > massif/tests/culling1 (stderr) > massif/tests/culling2 (stderr) > massif/tests/custom_alloc (post) > massif/tests/deep-A (post) > massif/tests/deep-B (stderr) > massif/tests/deep-B (post) > massif/tests/deep-C (stderr) > massif/tests/deep-C (post) > massif/tests/deep-D (post) > massif/tests/ignoring (post) > massif/tests/insig (post) > massif/tests/long-time (post) > massif/tests/new-cpp (post) > massif/tests/null (post) > massif/tests/one (post) 
> massif/tests/overloaded-new (post) > massif/tests/peak (post) > massif/tests/peak2 (stderr) > massif/tests/peak2 (post) > massif/tests/realloc (stderr) > massif/tests/realloc (post) > massif/tests/thresholds_0_0 (post) > massif/tests/thresholds_0_10 (post) > massif/tests/thresholds_10_0 (post) > massif/tests/thresholds_10_10 (post) > massif/tests/thresholds_5_0 (post) > massif/tests/thresholds_5_10 (post) > massif/tests/zero1 (post) > massif/tests/zero2 (post) > none/tests/mremap (stderr) > none/tests/mremap2 (stdout) Tom, can you send me diffs for some of the Massif ones? Any idea why so many tests fail on this machine? Nick |
|
From: Nicholas N. <nj...@cs...> - 2007-11-04 05:23:18
|
On Sun, 4 Nov 2007 js...@ac... wrote: > Nightly build on g5 ( SuSE 10.1, ppc970 ) started at 2007-11-04 02:00:01 CET > Results unchanged from 24 hours ago > > Checking out valgrind source tree ... done > Configuring valgrind ... done > Building valgrind ... done > Running regression tests ... failed > > Regression test results follow > > == 255 tests, 11 stderr failures, 2 stdout failures, 0 post failures == > memcheck/tests/deep_templates (stdout) > memcheck/tests/leak-cycle (stderr) > memcheck/tests/leak-tree (stderr) > memcheck/tests/pointer-trace (stderr) > massif/tests/culling1 (stderr) > massif/tests/culling2 (stderr) > massif/tests/deep-C (stderr) > massif/tests/peak2 (stderr) > massif/tests/realloc (stderr) > none/tests/faultstatus (stderr) > none/tests/fdleak_cmsg (stderr) > none/tests/mremap (stderr) > none/tests/mremap2 (stdout) Julian, can you send me diffs for one or all of the Massif ones? Nick |
|
From: Tom H. <th...@cy...> - 2007-11-04 03:28:05
|
Nightly build on alvis ( i686, Red Hat 7.3 ) started at 2007-11-04 03:15:02 GMT Results differ from 24 hours ago Checking out valgrind source tree ... done Configuring valgrind ... done Building valgrind ... done Running regression tests ... failed Regression test results follow == 287 tests, 33 stderr failures, 1 stdout failure, 27 post failures == memcheck/tests/addressable (stderr) memcheck/tests/badjump (stderr) memcheck/tests/describe-block (stderr) memcheck/tests/erringfds (stderr) memcheck/tests/leak-0 (stderr) memcheck/tests/leak-cycle (stderr) memcheck/tests/leak-pool-0 (stderr) memcheck/tests/leak-pool-1 (stderr) memcheck/tests/leak-pool-2 (stderr) memcheck/tests/leak-pool-3 (stderr) memcheck/tests/leak-pool-4 (stderr) memcheck/tests/leak-pool-5 (stderr) memcheck/tests/leak-regroot (stderr) memcheck/tests/leak-tree (stderr) memcheck/tests/long_namespace_xml (stderr) memcheck/tests/match-overrun (stderr) memcheck/tests/partial_load_dflt (stderr) memcheck/tests/partial_load_ok (stderr) memcheck/tests/partiallydefinedeq (stderr) memcheck/tests/pointer-trace (stderr) memcheck/tests/sigkill (stderr) memcheck/tests/stack_changes (stderr) memcheck/tests/x86/scalar (stderr) memcheck/tests/x86/scalar_supp (stderr) memcheck/tests/x86/xor-undef-x86 (stderr) memcheck/tests/xml1 (stderr) massif/tests/alloc-fns-A (post) massif/tests/alloc-fns-B (post) massif/tests/basic (post) massif/tests/big-alloc (post) massif/tests/culling1 (stderr) massif/tests/culling2 (stderr) massif/tests/custom_alloc (post) massif/tests/deep-A (post) massif/tests/deep-B (stderr) massif/tests/deep-B (post) massif/tests/deep-C (stderr) massif/tests/deep-C (post) massif/tests/deep-D (post) massif/tests/ignoring (post) massif/tests/insig (post) massif/tests/long-time (post) massif/tests/new-cpp (post) massif/tests/null (post) massif/tests/one (post) massif/tests/overloaded-new (post) massif/tests/peak (post) massif/tests/peak2 (stderr) massif/tests/peak2 (post) massif/tests/realloc (stderr) 
massif/tests/realloc (post) massif/tests/thresholds_0_0 (post) massif/tests/thresholds_0_10 (post) massif/tests/thresholds_10_0 (post) massif/tests/thresholds_10_10 (post) massif/tests/thresholds_5_0 (post) massif/tests/thresholds_5_10 (post) massif/tests/zero1 (post) massif/tests/zero2 (post) none/tests/mremap (stderr) none/tests/mremap2 (stdout) ================================================= == Results from 24 hours ago == ================================================= Checking out valgrind source tree ... done Configuring valgrind ... done Building valgrind ... done Running regression tests ... failed Last 20 lines of verbose log follow echo fi gcc -Winline -Wall -Wshadow -g -m32 -Wno-long-long -o long-time long-time.o if g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I../.. -I../../include -I../../coregrind -I../../include -I../../VEX/pub -g -O2 -MT new-cpp.o -MD -MP -MF ".deps/new-cpp.Tpo" \ -c -o new-cpp.o `test -f 'new-cpp.cpp' || echo './'`new-cpp.cpp; \ then mv -f ".deps/new-cpp.Tpo" ".deps/new-cpp.Po"; \ else rm -f ".deps/new-cpp.Tpo"; exit 1; \ fi new-cpp.cpp: In function `int main()': new-cpp.cpp:20: invalid use of undefined type `struct main()::s' new-cpp.cpp:20: forward declaration of `struct main()::s' new-cpp.cpp:20: ISO C++ forbids defining types within new make[4]: *** [new-cpp.o] Error 1 make[4]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif/tests' make[3]: *** [check-am] Error 2 make[3]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif/tests' make[2]: *** [check-recursive] Error 1 make[2]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif' make[1]: *** [check-recursive] Error 1 make[1]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind' make: *** [check] Error 2 ================================================= == Difference between 24 hours ago and now == ================================================= *** old.short Sun Nov 4 03:20:17 2007 --- new.short Sun Nov 4 03:28:04 2007 *************** *** 6,27 **** ! 
Last 20 lines of verbose log follow echo ! fi ! gcc -Winline -Wall -Wshadow -g -m32 -Wno-long-long -o long-time long-time.o ! if g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I../.. -I../../include -I../../coregrind -I../../include -I../../VEX/pub -g -O2 -MT new-cpp.o -MD -MP -MF ".deps/new-cpp.Tpo" \ ! -c -o new-cpp.o `test -f 'new-cpp.cpp' || echo './'`new-cpp.cpp; \ ! then mv -f ".deps/new-cpp.Tpo" ".deps/new-cpp.Po"; \ ! else rm -f ".deps/new-cpp.Tpo"; exit 1; \ ! fi ! new-cpp.cpp: In function `int main()': ! new-cpp.cpp:20: invalid use of undefined type `struct main()::s' ! new-cpp.cpp:20: forward declaration of `struct main()::s' ! new-cpp.cpp:20: ISO C++ forbids defining types within new ! make[4]: *** [new-cpp.o] Error 1 ! make[4]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif/tests' ! make[3]: *** [check-am] Error 2 ! make[3]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif/tests' ! make[2]: *** [check-recursive] Error 1 ! make[2]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif' ! make[1]: *** [check-recursive] Error 1 ! make[1]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind' ! make: *** [check] Error 2 --- 6,71 ---- ! Regression test results follow ! ! == 287 tests, 33 stderr failures, 1 stdout failure, 27 post failures == ! memcheck/tests/addressable (stderr) ! memcheck/tests/badjump (stderr) ! memcheck/tests/describe-block (stderr) ! memcheck/tests/erringfds (stderr) ! memcheck/tests/leak-0 (stderr) ! memcheck/tests/leak-cycle (stderr) ! memcheck/tests/leak-pool-0 (stderr) ! memcheck/tests/leak-pool-1 (stderr) ! memcheck/tests/leak-pool-2 (stderr) ! memcheck/tests/leak-pool-3 (stderr) ! memcheck/tests/leak-pool-4 (stderr) ! memcheck/tests/leak-pool-5 (stderr) ! memcheck/tests/leak-regroot (stderr) ! memcheck/tests/leak-tree (stderr) ! memcheck/tests/long_namespace_xml (stderr) ! memcheck/tests/match-overrun (stderr) ! memcheck/tests/partial_load_dflt (stderr) ! memcheck/tests/partial_load_ok (stderr) ! 
memcheck/tests/partiallydefinedeq (stderr) ! memcheck/tests/pointer-trace (stderr) ! memcheck/tests/sigkill (stderr) ! memcheck/tests/stack_changes (stderr) ! memcheck/tests/x86/scalar (stderr) ! memcheck/tests/x86/scalar_supp (stderr) ! memcheck/tests/x86/xor-undef-x86 (stderr) ! memcheck/tests/xml1 (stderr) ! massif/tests/alloc-fns-A (post) ! massif/tests/alloc-fns-B (post) ! massif/tests/basic (post) ! massif/tests/big-alloc (post) ! massif/tests/culling1 (stderr) ! massif/tests/culling2 (stderr) ! massif/tests/custom_alloc (post) ! massif/tests/deep-A (post) ! massif/tests/deep-B (stderr) ! massif/tests/deep-B (post) ! massif/tests/deep-C (stderr) ! massif/tests/deep-C (post) ! massif/tests/deep-D (post) ! massif/tests/ignoring (post) ! massif/tests/insig (post) ! massif/tests/long-time (post) ! massif/tests/new-cpp (post) ! massif/tests/null (post) ! massif/tests/one (post) ! massif/tests/overloaded-new (post) ! massif/tests/peak (post) ! massif/tests/peak2 (stderr) ! massif/tests/peak2 (post) ! massif/tests/realloc (stderr) ! massif/tests/realloc (post) ! massif/tests/thresholds_0_0 (post) ! massif/tests/thresholds_0_10 (post) ! massif/tests/thresholds_10_0 (post) ! massif/tests/thresholds_10_10 (post) ! massif/tests/thresholds_5_0 (post) ! massif/tests/thresholds_5_10 (post) ! massif/tests/zero1 (post) ! massif/tests/zero2 (post) ! none/tests/mremap (stderr) ! none/tests/mremap2 (stdout) ! |
|
From: Tom H. <th...@cy...> - 2007-11-04 03:24:27
|
Nightly build on lloyd ( x86_64, Fedora 7 ) started at 2007-11-04 03:05:06 GMT Results unchanged from 24 hours ago Checking out valgrind source tree ... done Configuring valgrind ... done Building valgrind ... done Running regression tests ... failed Regression test results follow == 320 tests, 4 stderr failures, 2 stdout failures, 0 post failures == memcheck/tests/pointer-trace (stderr) memcheck/tests/vcpu_fnfns (stdout) memcheck/tests/x86/scalar (stderr) memcheck/tests/xml1 (stderr) none/tests/mremap (stderr) none/tests/mremap2 (stdout) |
|
From: Tom H. <th...@cy...> - 2007-11-04 03:23:31
|
Nightly build on dellow ( x86_64, Fedora 7 ) started at 2007-11-04 03:10:04 GMT Results differ from 24 hours ago Checking out valgrind source tree ... done Configuring valgrind ... done Building valgrind ... done Running regression tests ... failed Regression test results follow == 320 tests, 4 stderr failures, 3 stdout failures, 0 post failures == memcheck/tests/pointer-trace (stderr) memcheck/tests/vcpu_fnfns (stdout) memcheck/tests/x86/scalar (stderr) memcheck/tests/xml1 (stderr) none/tests/mremap (stderr) none/tests/mremap2 (stdout) none/tests/pth_detached (stdout) ================================================= == Results from 24 hours ago == ================================================= Checking out valgrind source tree ... done Configuring valgrind ... done Building valgrind ... done Running regression tests ... failed Regression test results follow == 320 tests, 4 stderr failures, 2 stdout failures, 0 post failures == memcheck/tests/pointer-trace (stderr) memcheck/tests/vcpu_fnfns (stdout) memcheck/tests/x86/scalar (stderr) memcheck/tests/xml1 (stderr) none/tests/mremap (stderr) none/tests/mremap2 (stdout) ================================================= == Difference between 24 hours ago and now == ================================================= *** old.short Sun Nov 4 03:16:53 2007 --- new.short Sun Nov 4 03:23:31 2007 *************** *** 8,10 **** ! == 320 tests, 4 stderr failures, 2 stdout failures, 0 post failures == memcheck/tests/pointer-trace (stderr) --- 8,10 ---- ! == 320 tests, 4 stderr failures, 3 stdout failures, 0 post failures == memcheck/tests/pointer-trace (stderr) *************** *** 15,16 **** --- 15,17 ---- none/tests/mremap2 (stdout) + none/tests/pth_detached (stdout) |
|
From: Tom H. <th...@cy...> - 2007-11-04 03:08:08
|
Nightly build on gill ( x86_64, Fedora Core 2 ) started at 2007-11-04 03:00:02 GMT
Results differ from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 322 tests, 6 stderr failures, 1 stdout failure, 0 post failures ==
memcheck/tests/pointer-trace (stderr)
memcheck/tests/stack_switch (stderr)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/x86/scalar_supp (stderr)
none/tests/fdleak_fcntl (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)

=================================================
== Results from 24 hours ago ==
=================================================

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Last 20 lines of verbose log follow echo
then mv -f ".deps/long-time.Tpo" ".deps/long-time.Po"; else rm -f ".deps/long-time.Tpo"; exit 1; fi
long-time.c: In function `main':
long-time.c:15: warning: ISO C90 forbids mixed declarations and code
long-time.c:18: warning: ISO C90 forbids mixed declarations and code
gcc -Winline -Wall -Wshadow -g -m64 -Wno-long-long -Wdeclaration-after-statement -o long-time long-time.o
if g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I../.. -I../../include -I../../coregrind -I../../include -I../../VEX/pub -g -O2 -MT new-cpp.o -MD -MP -MF ".deps/new-cpp.Tpo" -c -o new-cpp.o new-cpp.cpp; \
then mv -f ".deps/new-cpp.Tpo" ".deps/new-cpp.Po"; else rm -f ".deps/new-cpp.Tpo"; exit 1; fi
new-cpp.cpp: In function `int main()':
new-cpp.cpp:20: error: invalid use of undefined type `struct main()::s'
new-cpp.cpp:20: error: forward declaration of `struct main()::s'
new-cpp.cpp:20: error: ISO C++ forbids defining types within new
make[4]: *** [new-cpp.o] Error 1
make[4]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif/tests'
make[3]: *** [check-am] Error 2
make[3]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif/tests'
make[2]: *** [check-recursive] Error 1
make[2]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif'
make[1]: *** [check-recursive] Error 1
make[1]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind'
make: *** [check] Error 2

=================================================
== Difference between 24 hours ago and now ==
=================================================

*** old.short Sun Nov 4 03:02:34 2007
--- new.short Sun Nov 4 03:08:06 2007
***************
*** 6,27 ****
! Last 20 lines of verbose log follow echo
! then mv -f ".deps/long-time.Tpo" ".deps/long-time.Po"; else rm -f ".deps/long-time.Tpo"; exit 1; fi
! long-time.c: In function `main':
! long-time.c:15: warning: ISO C90 forbids mixed declarations and code
! long-time.c:18: warning: ISO C90 forbids mixed declarations and code
! gcc -Winline -Wall -Wshadow -g -m64 -Wno-long-long -Wdeclaration-after-statement -o long-time long-time.o
! if g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I../.. -I../../include -I../../coregrind -I../../include -I../../VEX/pub -g -O2 -MT new-cpp.o -MD -MP -MF ".deps/new-cpp.Tpo" -c -o new-cpp.o new-cpp.cpp; \
! then mv -f ".deps/new-cpp.Tpo" ".deps/new-cpp.Po"; else rm -f ".deps/new-cpp.Tpo"; exit 1; fi
! new-cpp.cpp: In function `int main()':
! new-cpp.cpp:20: error: invalid use of undefined type `struct main()::s'
! new-cpp.cpp:20: error: forward declaration of `struct main()::s'
! new-cpp.cpp:20: error: ISO C++ forbids defining types within new
! make[4]: *** [new-cpp.o] Error 1
! make[4]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif/tests'
! make[3]: *** [check-am] Error 2
! make[3]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif/tests'
! make[2]: *** [check-recursive] Error 1
! make[2]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind/massif'
! make[1]: *** [check-recursive] Error 1
! make[1]: Leaving directory `/tmp/vgtest/2007-11-04/valgrind'
! make: *** [check] Error 2
--- 6,17 ----
! Regression test results follow
!
! == 322 tests, 6 stderr failures, 1 stdout failure, 0 post failures ==
! memcheck/tests/pointer-trace (stderr)
! memcheck/tests/stack_switch (stderr)
! memcheck/tests/x86/scalar (stderr)
! memcheck/tests/x86/scalar_supp (stderr)
! none/tests/fdleak_fcntl (stderr)
! none/tests/mremap (stderr)
! none/tests/mremap2 (stdout)
!
|
|
From: <sv...@va...> - 2007-11-04 01:51:04
|
Author: sewardj
Date: 2007-11-04 01:51:04 +0000 (Sun, 04 Nov 2007)
New Revision: 7087
Log:
Issue an error message when an exiting thread holds a lock. This is
an obviously unsafe thing to do and is very easy to detect.
Added:
branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.c
branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stderr.exp-glibc25-amd64
branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stdout.exp
branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.vgtest
Modified:
branches/THRCHECK/glibc-2.X-thrcheck.supp
branches/THRCHECK/thrcheck/tc_main.c
branches/THRCHECK/thrcheck/tests/Makefile.am
Modified: branches/THRCHECK/glibc-2.X-thrcheck.supp
===================================================================
--- branches/THRCHECK/glibc-2.X-thrcheck.supp 2007-11-03 23:26:57 UTC (rev 7086)
+++ branches/THRCHECK/glibc-2.X-thrcheck.supp 2007-11-04 01:51:04 UTC (rev 7087)
@@ -68,6 +68,26 @@
fun:*
obj:/lib*/libc-2.5.so
}
+{
+ thrcheck-glibc25-011
+ Thrcheck:Race
+ obj:/lib*/libc-2.5.so
+ obj:/lib*/libpthread-2.5.so
+}
+{
+ thrcheck-glibc25-013
+ Thrcheck:Race
+ obj:/lib*/ld-2.5.so
+ fun:*
+ obj:/lib*/ld-2.5.so
+}
+{
+ thrcheck-glibc25-014
+ Thrcheck:Race
+ obj:/lib*/ld-2.5.so
+ obj:/lib*/ld-2.5.so
+ obj:/lib*/libpthread-2.5.so
+}
# These are very ugly. They are needed to suppress errors inside (eg)
# NPTL's pthread_cond_signal. Why only one stack frame -- at least we
Modified: branches/THRCHECK/thrcheck/tc_main.c
===================================================================
--- branches/THRCHECK/thrcheck/tc_main.c 2007-11-03 23:26:57 UTC (rev 7086)
+++ branches/THRCHECK/thrcheck/tc_main.c 2007-11-04 01:51:04 UTC (rev 7087)
@@ -5515,6 +5515,7 @@
static
void evh__pre_thread_ll_exit ( ThreadId quit_tid )
{
+ Int nHeld;
Thread* thr_q;
if (SHOW_EVENTS >= 1)
VG_(printf)("evh__pre_thread_ll_exit(thr=%d)\n",
@@ -5529,10 +5530,25 @@
finished, and so we need to consider the possibility that it
lingers indefinitely and continues to interact with other
threads. */
+ /* However, it might have rendezvous'd with a thread that called
+ pthread_join with this one as arg, prior to this point (that's
+ how NPTL works). In which case there has already been a prior
+ sync event. So in any case, just let the thread exit. On NPTL,
+ all thread exits go through here. */
tl_assert(is_sane_ThreadId(quit_tid));
thr_q = map_threads_maybe_lookup( quit_tid );
tl_assert(thr_q != NULL);
- // FIXME: error-if: exiting thread holds any locks
+
+ /* Complain if this thread holds any locks. */
+ nHeld = TC_(cardinalityWS)( univ_lsets, thr_q->locksetA );
+ tl_assert(nHeld >= 0);
+ if (nHeld > 0) {
+ HChar buf[80];
+ VG_(sprintf)(buf, "Exiting thread still holds %d lock%s",
+ nHeld, nHeld > 1 ? "s" : "");
+ record_error_Misc( thr_q, buf );
+ }
+
/* About the only thing we do need to do is clear the map_threads
entry, in order that the Valgrind core can re-use it. */
map_threads_delete( quit_tid );
Modified: branches/THRCHECK/thrcheck/tests/Makefile.am
===================================================================
--- branches/THRCHECK/thrcheck/tests/Makefile.am 2007-11-03 23:26:57 UTC (rev 7086)
+++ branches/THRCHECK/thrcheck/tests/Makefile.am 2007-11-04 01:51:04 UTC (rev 7087)
@@ -75,7 +75,9 @@
tc20_verifywrap.stderr.exp-glibc25-x86 \
tc21_pthonce.vgtest tc21_pthonce.stdout.exp \
tc21_pthonce.stderr.exp-glibc25-amd64 \
- tc21_pthonce.stderr.exp-glibc25-x86
+ tc21_pthonce.stderr.exp-glibc25-x86 \
+ tc22_exit_w_lock.vgtest tc22_exit_w_lock.stdout.exp \
+ tc22_exit_w_lock.stderr.exp-glibc25-amd64
check_PROGRAMS = \
hg01_all_ok \
@@ -104,7 +106,8 @@
tc18_semabuse \
tc19_shadowmem \
tc20_verifywrap \
- tc21_pthonce
+ tc21_pthonce \
+ tc22_exit_w_lock
AM_CPPFLAGS = -I$(top_srcdir) -I$(top_srcdir)/include \
-I$(top_srcdir)/coregrind -I$(top_builddir)/include \
Added: branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.c
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.c (rev 0)
+++ branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.c 2007-11-04 01:51:04 UTC (rev 7087)
@@ -0,0 +1,50 @@
+
+#include <pthread.h>
+#include <unistd.h>
+#include <assert.h>
+#include <signal.h>
+
+/* Should see 3 threads exiting in different ways, all holding one (or
+ two) locks. */
+
+pthread_mutex_t mxC1 = PTHREAD_MUTEX_INITIALIZER;
+pthread_mutex_t mxC2 = PTHREAD_MUTEX_INITIALIZER;
+pthread_mutex_t mxC2b = PTHREAD_MUTEX_INITIALIZER;
+pthread_mutex_t mxP = PTHREAD_MUTEX_INITIALIZER;
+
+/* This one exits in the normal way, by joining back */
+void* child_fn1 ( void* arg )
+{
+ int r= pthread_mutex_lock( &mxC1 ); assert(!r);
+ return NULL;
+}
+
+/* This one detaches, does its own thing. */
+void* child_fn2 ( void* arg )
+{
+ int r;
+ r= pthread_mutex_lock( &mxC2 ); assert(!r);
+ r= pthread_mutex_lock( &mxC2b ); assert(!r);
+ r= pthread_detach( pthread_self() ); assert(!r);
+ return NULL;
+}
+
+/* Parent creates 2 children, takes a lock, waits, segfaults. Use
+ sleeps to enforce exit ordering, for repeatable regtesting. */
+int main ( void )
+{
+ int r;
+ pthread_t child1, child2;
+
+ r= pthread_create(&child2, NULL, child_fn2, NULL); assert(!r);
+ sleep(1);
+
+ r= pthread_create(&child1, NULL, child_fn1, NULL); assert(!r);
+ r= pthread_join(child1, NULL); assert(!r);
+ sleep(1);
+
+ r= pthread_mutex_lock( &mxP );
+
+ kill( getpid(), SIGABRT );
+ return 0;
+}
Added: branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stderr.exp-glibc25-amd64
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stderr.exp-glibc25-amd64 (rev 0)
+++ branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stderr.exp-glibc25-amd64 2007-11-04 01:51:04 UTC (rev 7087)
@@ -0,0 +1,30 @@
+
+Thread #2 was created
+ at 0x........: clone (in /...libc...)
+ by 0x........: ...
+ by 0x........: pthread_create@GLIBC_ (in /lib/libpthread...)
+ by 0x........: pthread_create@* (tc_intercepts.c:...)
+ by 0x........: main (tc22_exit_w_lock.c:39)
+
+Thread #2: Exiting thread still holds 2 locks
+ at 0x........: start_thread (in /lib/libpthread...)
+ by 0x........: ...
+
+Thread #3 was created
+ at 0x........: clone (in /...libc...)
+ by 0x........: ...
+ by 0x........: pthread_create@GLIBC_ (in /lib/libpthread...)
+ by 0x........: pthread_create@* (tc_intercepts.c:...)
+ by 0x........: main (tc22_exit_w_lock.c:42)
+
+Thread #3: Exiting thread still holds 1 lock
+ at 0x........: start_thread (in /lib/libpthread...)
+ by 0x........: ...
+
+Thread #1 is the program's root thread
+
+Thread #1: Exiting thread still holds 1 lock
+ at 0x........: kill (in /...libc...)
+ by 0x........: main (tc22_exit_w_lock.c:48)
+
+ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
Added: branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.stdout.exp
===================================================================
Added: branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.vgtest
===================================================================
--- branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.vgtest (rev 0)
+++ branches/THRCHECK/thrcheck/tests/tc22_exit_w_lock.vgtest 2007-11-04 01:51:04 UTC (rev 7087)
@@ -0,0 +1 @@
+prog: tc22_exit_w_lock
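
For readers who want to see the idea behind r7087 outside the Valgrind tree, here is a minimal standalone C sketch of the same kind of check: count the locks a thread holds and complain when it exits with a non-zero count. ThreadInfo, note_lock, note_unlock and check_exit are hypothetical names made up for this sketch. Thrcheck itself obtains the count with TC_(cardinalityWS) on the thread's lockset and reports through record_error_Misc, and neither is reproduced here. The sketch also folds the "Thread #N:" prefix into the message for brevity, which is an assumption about presentation rather than how the tool builds its output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-thread bookkeeping, not Thrcheck's real structure. */
typedef struct {
    int id;      /* thread number as shown in reports */
    int n_held;  /* number of locks currently held    */
} ThreadInfo;

/* Record that the thread acquired one more lock. */
void note_lock(ThreadInfo *t)
{
    t->n_held++;
}

/* Record that the thread released a lock; never goes below zero. */
void note_unlock(ThreadInfo *t)
{
    if (t->n_held > 0)
        t->n_held--;
}

/* Called when a thread is about to exit.  Returns 1 and writes a
   complaint into buf if the thread still holds locks; returns 0 and
   leaves buf as an empty string otherwise.  The wording and the
   singular/plural handling follow the message added in r7087. */
int check_exit(const ThreadInfo *t, char *buf, size_t buflen)
{
    assert(buflen > 0);
    assert(t->n_held >= 0);
    buf[0] = '\0';
    if (t->n_held == 0)
        return 0;
    snprintf(buf, buflen,
             "Thread #%d: Exiting thread still holds %d lock%s",
             t->id, t->n_held, t->n_held > 1 ? "s" : "");
    return 1;
}
```

A caller would invoke note_lock and note_unlock around its mutex operations and call check_exit with a buffer (80 bytes is enough for small counts) just before the thread terminates, printing buf when the return value is 1.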
|
|
From: <js...@ac...> - 2007-11-04 01:19:00
|
Nightly build on g5 ( SuSE 10.1, ppc970 ) started at 2007-11-04 02:00:01 CET
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 255 tests, 11 stderr failures, 2 stdout failures, 0 post failures ==
memcheck/tests/deep_templates (stdout)
memcheck/tests/leak-cycle (stderr)
memcheck/tests/leak-tree (stderr)
memcheck/tests/pointer-trace (stderr)
massif/tests/culling1 (stderr)
massif/tests/culling2 (stderr)
massif/tests/deep-C (stderr)
massif/tests/peak2 (stderr)
massif/tests/realloc (stderr)
none/tests/faultstatus (stderr)
none/tests/fdleak_cmsg (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
|