From: Konstantin S. <kon...@gm...> - 2010-02-07 18:53:20
|
On Fri, Feb 5, 2010 at 8:00 PM, Julian Seward <js...@ac...> wrote:
>
> The log is quite useful. It might be that there is a race
> between the handling for sys_clone and for sys_exit_group. I'm not
> sure I understand the details though.
>
> sys_exit_group happens when the main thread exits. It marks
> all other threads in the same thread group as "to be forced
> to exit". If any of these threads are blocked in syscalls
> then they are hit on the head with sigvgkill to get them out
> of the syscall. Or something like that. (see function
> PRE_(sys_exit_group)).
>
> So, I suspect the problem is, there is a child thread
> that has just been created by clone
> (by a call to do_syscall_clone_amd64_linux)
> but which is not yet marked
> as being in the same thread group as its parent
> (which happens a few hundred instructions after the child's
> startup, in thread_wrapper (called by run_a_thread_NORETURN called
> by ML_(start_thread_NORETURN), which is the start point
> for the child on the host cpu).
>
> Then the parent exits, but the child is not marked as also-to-exit
> because it is not marked as in the same thread group as
> its parent. So it stays alive. This is I think what happened
> to tid=281 in the logfile you sent.
>
> It would be best to mark the child's thread group before
> creating it. But I don't understand the meaning of thread groups,
> and how these relate to what VG_(gettid) and VG_(getpid) return.
>
> I could chase this if you can refine the test case into something
> that reliably hangs every time -- the current 5% failure rate is going to
> make it impossible to investigate.
>
> One thing you could do is to insert a spin-wait loop in
>
Indeed, the patch below makes the bug manifest itself every time.
The process either hangs (top shows it as a zombie) or continues to print
stuff forever.
--kcc
--- coregrind/m_syswrap/syswrap-linux.c (revision 11037)
+++ coregrind/m_syswrap/syswrap-linux.c (working copy)
@@ -214,11 +214,20 @@
    vg_assert(0);
 }
 
+static void spin_loop(int c, int tid) {
+   static volatile int z;
+   VG_(printf)("spinning: %d\n", tid);
+   while(c--) {
+      z++;
+   }
+   VG_(printf)("done: %d\n", tid);
+}
+
 Word ML_(start_thread_NORETURN) ( void* arg )
 {
    ThreadState* tst = (ThreadState*)arg;
    ThreadId tid = tst->tid;
-
+   spin_loop(1 << 25, tid);
    run_a_thread_NORETURN ( (Word)tid );
    /*NOTREACHED*/
    vg_assert(0);
> ML_(start_thread_noreturn) [make sure gcc doesn't just optimise it
> away] to delay the point where the child sets up its .threadgroup
> field. This might make the hang happen more often. Can you try that?
>
> J
>
> On Tuesday 02 February 2010, Konstantin Serebryany wrote:
> > Hi Julian,
> >
> > Any luck with this hang?
> > Anything I can help with?
> >
> > --kcc
> >
> > On Thu, Jan 28, 2010 at 10:37 AM, Konstantin Serebryany <
> >
> > kon...@gm...> wrote:
> > > Sent a log off list
> > > With logging on it does not really want to hang.
> > > Instead (with ~5% probability) it loops forever.
> > > I think this is the same bug -- the process misses its own death
> time...
> > >
> > > --kcc
> > >
> > > On Thu, Jan 28, 2010 at 10:40 AM, Julian Seward <js...@ac...>
> wrote:
> > >> On Wednesday 27 January 2010, Julian Seward wrote:
> > >> > On Wednesday 27 January 2010, Konstantin Serebryany wrote:
> > >> > > I've minimized the problem to a small test (below).
> > >> > > It spawns many threads and doesn't join them before exiting.
> > >> > > It will hang (or loop forever) one out of 40-100 runs:
> > >> > > % g++ -g -lpthread hang.cc
> > >> > > % for((i=10;i<=99;i++)); do date; time
> > >>
> > >> ~/valgrind/trunk/inst/bin/valgrind
> > >>
> > >> > > --tool=none --trace-syscalls=yes --trace-signals=yes -q ./a.out
> 2>
> > >> > > $i.log ; done
> > >> >
> > >> > Ok; managed to reproduce it. 2 threads were still stuck in some
> > >> > syscall (don't know which yet). Investigating.
> > >>
> > >> I can reproduce it, but only in the case where there is no logging,
> > >> which isn't useful. If you have a logfile where it hangs for
> > >> --trace-syscalls=yes --trace-signals=yes, can you compress it and
> > >> send it to me? afaics the log is about 40MB long, but it should
> > >> bzip2 nicely.
> > >>
> > >> J
>
>
>
|
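The spin-wait delay used in the patch in this message can be tried outside Valgrind. What follows is only a minimal sketch in plain C: `spin_delay` and `spin_counter` are hypothetical names, not Valgrind functions. The counter is declared `volatile` so that the compiler has to perform every increment and cannot simply delete the loop, which is the concern raised in the quoted text.

```c
#include <assert.h>

/* Hypothetical stand-alone version of the spin loop in the patch.
   The volatile qualifier forces the compiler to perform every read
   and write of the counter, so the loop is not optimised away. */
static volatile unsigned long spin_counter;

/* Spin for 'iterations' steps and return how many were performed. */
static unsigned long spin_delay(unsigned long iterations)
{
   unsigned long start = spin_counter;
   assert(iterations <= (1UL << 30));   /* keep the sketch's delay bounded */
   while (iterations--) {
      spin_counter++;
   }
   return spin_counter - start;
}
```

Calling `spin_delay(1UL << 25)` at the start of a thread's entry function would give a delay comparable to the one in the patch; how long that takes depends on the machine.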
|
From: Julian S. <js...@ac...> - 2010-02-05 16:42:05
|
The log is quite useful. It might be that there is a race
between the handling for sys_clone and for sys_exit_group. I'm not
sure I understand the details though.

sys_exit_group happens when the main thread exits. It marks
all other threads in the same thread group as "to be forced
to exit". If any of these threads are blocked in syscalls
then they are hit on the head with sigvgkill to get them out
of the syscall. Or something like that. (see function
PRE_(sys_exit_group)).

So, I suspect the problem is, there is a child thread
that has just been created by clone
(by a call to do_syscall_clone_amd64_linux)
but which is not yet marked
as being in the same thread group as its parent
(which happens a few hundred instructions after the child's
startup, in thread_wrapper (called by run_a_thread_NORETURN called
by ML_(start_thread_NORETURN), which is the start point
for the child on the host cpu).

Then the parent exits, but the child is not marked as also-to-exit
because it is not marked as in the same thread group as
its parent. So it stays alive. This is I think what happened
to tid=281 in the logfile you sent.

It would be best to mark the child's thread group before
creating it. But I don't understand the meaning of thread groups,
and how these relate to what VG_(gettid) and VG_(getpid) return.

I could chase this if you can refine the test case into something
that reliably hangs every time -- the current 5% failure rate is going to
make it impossible to investigate.

One thing you could do is to insert a spin-wait loop in
ML_(start_thread_noreturn) [make sure gcc doesn't just optimise it
away] to delay the point where the child sets up its .threadgroup
field. This might make the hang happen more often. Can you try that?

J

On Tuesday 02 February 2010, Konstantin Serebryany wrote:
> Hi Julian,
>
> Any luck with this hang?
> Anything I can help with?
>
> --kcc
>
> On Thu, Jan 28, 2010 at 10:37 AM, Konstantin Serebryany <
>
> kon...@gm...> wrote:
> > Sent a log off list
> > With logging on it does not really want to hang.
> > Instead (with ~5% probability) it loops forever.
> > I think this is the same bug -- the process misses its own death time...
> >
> > --kcc
> >
> > On Thu, Jan 28, 2010 at 10:40 AM, Julian Seward <js...@ac...> wrote:
> >> On Wednesday 27 January 2010, Julian Seward wrote:
> >> > On Wednesday 27 January 2010, Konstantin Serebryany wrote:
> >> > > I've minimized the problem to a small test (below).
> >> > > It spawns many threads and doesn't join them before exiting.
> >> > > It will hang (or loop forever) one out of 40-100 runs:
> >> > > % g++ -g -lpthread hang.cc
> >> > > % for((i=10;i<=99;i++)); do date; time
> >>
> >> ~/valgrind/trunk/inst/bin/valgrind
> >>
> >> > > --tool=none --trace-syscalls=yes --trace-signals=yes -q ./a.out 2>
> >> > > $i.log ; done
> >> >
> >> > Ok; managed to reproduce it. 2 threads were still stuck in some
> >> > syscall (don't know which yet). Investigating.
> >>
> >> I can reproduce it, but only in the case where there is no logging,
> >> which isn't useful. If you have a logfile where it hangs for
> >> --trace-syscalls=yes --trace-signals=yes, can you compress it and
> >> send it to me? afaics the log is about 40MB long, but it should
> >> bzip2 nicely.
> >>
> >> J
|
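The suspected ordering problem in this message can be illustrated with a deterministic toy model. This is a sketch under stated assumptions, not Valgrind code: `ToyThread`, `toy_exit_group` and the two scenario functions are hypothetical names, and the real scheduler state is far richer. The model only shows why a child that is alive but has not yet recorded its thread group is skipped when the parent's exit_group marks the group for exit, and why setting the group before creating the child avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types and names are hypothetical and are not
   Valgrind's real ThreadState or scheduler code. */
typedef struct {
   int  threadgroup;    /* 0 means "not set yet" */
   bool alive;
   bool exit_pending;   /* set when the thread has been told to exit */
} ToyThread;

/* Model of exit_group: flag every live thread in 'group' for exit.
   Returns the number of live threads that were left unflagged. */
static int toy_exit_group(ToyThread *threads, int n, int group)
{
   int missed = 0;
   assert(n >= 0);
   for (int i = 0; i < n; i++) {
      if (!threads[i].alive)
         continue;
      if (threads[i].threadgroup == group)
         threads[i].exit_pending = true;
      else
         missed++;
   }
   return missed;
}

/* Racy order: the child becomes alive first and fills in its own
   threadgroup later, so an exit_group in between misses it. */
static int toy_racy_clone_then_exit(void)
{
   ToyThread t[2] = { { 1, true, false }, { 0, false, false } };
   t[1].alive = true;                      /* child created by clone */
   int missed = toy_exit_group(t, 2, 1);   /* parent exits here */
   t[1].threadgroup = 1;                   /* too late */
   return missed;
}

/* Suggested order: the child's group is set before it is created. */
static int toy_fixed_clone_then_exit(void)
{
   ToyThread t[2] = { { 1, true, false }, { 0, false, false } };
   t[1].threadgroup = 1;
   t[1].alive = true;
   return toy_exit_group(t, 2, 1);
}
```

In the racy scenario one live thread is left unflagged; in the reordered scenario none is. The real fix would of course have to be made where the child's ThreadState is set up, which this model does not attempt to show.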
|
From: Julian S. <js...@ac...> - 2010-02-05 15:36:02
|
Looks ok to me.

J

On Tuesday 02 February 2010, Konstantin Serebryany wrote:
> I submitted the new barrier annotations, please check.
> http://code.google.com/p/data-race-test/source/browse/trunk/dynamic_annotations/dynamic_annotations.h#276
>
> --kcc
>
> On Fri, Jan 29, 2010 at 2:23 PM, Konstantin Serebryany <
>
> kon...@gm...> wrote:
> > PTAL (==please take a[nother] look)
> > http://codereview.appspot.com/196059/patch/11/1005
> >
> > On Fri, Jan 29, 2010 at 2:12 PM, Bart Van Assche <bva...@ac...> wrote:
> >> On Fri, Jan 29, 2010 at 12:06 PM, Konstantin Serebryany
> >>
> >> <kon...@gm...> wrote:
> >> > On Fri, Jan 29, 2010 at 2:09 PM, Julian Seward <js...@ac...> wrote:
> >> >> > * Some barrier implementations (e.g. the one in libgomp) allow barrier
> >> >> > reinitialization while others (e.g. POSIX threads) do not allow
> >> >> > this. If we want threading tools to be able to complain about
> >> >> > barrier reinitialization for barrier types for which this is not
> >> >> > allowed we need a third argument for ANNOTATE_BARRIER_INIT() that
> >> >> > tells the tool whether or not reinitialization is allowed.
> >> >>
> >> >> Yes, +1 for that.
> >> >
> >> > So, what would be the code like?
> >> > /* Report that the "barrier" has been initialized with initial "count".
> >> >    If allow_reinitialization is true, barrier_init() is allowed to be called
> >> >    multiple times w/o calling barrier_destroy() */
> >> > #define ANNOTATE_BARRIER_INIT(barrier, count, allow_reinitialization)
> >> > ?
> >>
> >> Maybe "reinitialization_allowed" instead of "allow_reinitialization" ?
> >>
> >> Bart.
|
|
From: Alexander P. <gl...@go...> - 2010-02-05 10:13:55
|
Nightly build on mcgrind ( Darwin 9.7.0 i386 )
Started at 2010-02-05 09:06:02 MSK
Ended at 2010-02-05 09:25:40 MSK
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 433 tests, 22 stderr failures, 1 stdout failure, 0 post failures ==
memcheck/tests/null_socket (stdout)
memcheck/tests/origin5-bz2 (stderr)
memcheck/tests/varinfo1 (stderr)
memcheck/tests/varinfo2 (stderr)
memcheck/tests/varinfo3 (stderr)
memcheck/tests/varinfo4 (stderr)
memcheck/tests/varinfo5 (stderr)
memcheck/tests/varinfo6 (stderr)
none/tests/async-sigs (stderr)
none/tests/faultstatus (stderr)
none/tests/pth_blockedsig (stderr)
helgrind/tests/hg03_inherit (stderr)
helgrind/tests/hg04_race (stderr)
helgrind/tests/hg05_race2 (stderr)
helgrind/tests/rwlock_race (stderr)
helgrind/tests/tc01_simple_race (stderr)
helgrind/tests/tc05_simple_race (stderr)
helgrind/tests/tc06_two_races (stderr)
helgrind/tests/tc06_two_races_xml (stderr)
helgrind/tests/tc16_byterace (stderr)
helgrind/tests/tc18_semabuse (stderr)
helgrind/tests/tc21_pthonce (stderr)
helgrind/tests/tc23_bogus_condwait (stderr)

--
Alexander Potapenko
Software Engineer
Google Moscow
|
|
From: Bart V. A. <bar...@gm...> - 2010-02-05 07:09:36
|
Nightly build on cellbuzz-native ( cellbuzz, ppc64, Fedora 7, native )
Started at 2010-02-05 02:00:05 EST
Ended at 2010-02-05 02:09:22 EST
Results differ from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... done

Last 20 lines of verbose log follow echo

Makefile.vex.am: installing `./depcomp'
running: autoconf
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no
checking whether ln -s works... yes
checking for gcc... gcc
checking for C compiler default output file name...
configure: error: in `/home/bart/software/valgrind/nightly/valgrind-new':
configure: error: C compiler cannot create executables
See `config.log' for more details.
Building valgrind ... cd valgrind-new && make -j 2 && make -j 2 check && make install
Job ID = 386.cell-user.cell.buzz
make: *** No targets specified and no makefile found. Stop.
Running regression tests ... cd valgrind-new && make regtest
Job ID = 387.cell-user.cell.buzz
make: *** No rule to make target `regtest'. Stop.

=================================================
== Results from 24 hours ago ==
=================================================

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... done

Last 20 lines of verbose log follow echo

Makefile.vex.am: installing `./depcomp'
running: autoconf
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no
checking whether ln -s works... yes
checking for gcc... gcc
checking for C compiler default output file name...
configure: error: in `/home/bart/software/valgrind/nightly/valgrind-old':
configure: error: C compiler cannot create executables
See `config.log' for more details.
Building valgrind ... cd valgrind-old && make -j 2 && make -j 2 check && make install
Job ID = 382.cell-user.cell.buzz
make: *** No targets specified and no makefile found. Stop.
Running regression tests ... cd valgrind-old && make regtest
Job ID = 383.cell-user.cell.buzz
make: *** No rule to make target `regtest'. Stop.

=================================================
== Difference between 24 hours ago and now ==
=================================================

*** old.short Fri Feb 5 02:04:30 2010
--- new.short Fri Feb 5 02:09:22 2010
***************
*** 18,27 ****
  checking for C compiler default output file name...
! configure: error: in `/home/bart/software/valgrind/nightly/valgrind-old':
  configure: error: C compiler cannot create executables
  See `config.log' for more details.
! Building valgrind ... cd valgrind-old && make -j 2 && make -j 2 check && make install
! Job ID = 382.cell-user.cell.buzz
  make: *** No targets specified and no makefile found. Stop.
! Running regression tests ... cd valgrind-old && make regtest
! Job ID = 383.cell-user.cell.buzz
  make: *** No rule to make target `regtest'. Stop.
--- 18,27 ----
  checking for C compiler default output file name...
! configure: error: in `/home/bart/software/valgrind/nightly/valgrind-new':
  configure: error: C compiler cannot create executables
  See `config.log' for more details.
! Building valgrind ... cd valgrind-new && make -j 2 && make -j 2 check && make install
! Job ID = 386.cell-user.cell.buzz
  make: *** No targets specified and no makefile found. Stop.
! Running regression tests ... cd valgrind-new && make regtest
! Job ID = 387.cell-user.cell.buzz
  make: *** No rule to make target `regtest'. Stop.
|
|
From: Tom H. <th...@cy...> - 2010-02-05 03:49:55
|
Nightly build on lloyd ( x86_64, Fedora 7 )
Started at 2010-02-05 03:05:04 GMT
Ended at 2010-02-05 03:49:42 GMT
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 531 tests, 1 stderr failure, 0 stdout failures, 0 post failures ==
helgrind/tests/tc06_two_races_xml (stderr)
|
|
From: Tom H. <th...@cy...> - 2010-02-05 03:36:10
|
Nightly build on mg ( x86_64, Fedora 9 )
Started at 2010-02-05 03:10:05 GMT
Ended at 2010-02-05 03:35:54 GMT
Results unchanged from 24 hours ago

Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed

Regression test results follow

== 538 tests, 1 stderr failure, 0 stdout failures, 0 post failures ==
helgrind/tests/tc06_two_races_xml (stderr)
|
|
From: <sv...@va...> - 2010-02-04 12:04:23
|
Author: sewardj
Date: 2010-02-04 12:04:14 +0000 (Thu, 04 Feb 2010)
New Revision: 11037
Log:
Generalise a suppression w.r.t. __setenv on Darwin.
Modified:
trunk/darwin9.supp
Modified: trunk/darwin9.supp
===================================================================
--- trunk/darwin9.supp 2010-01-30 13:40:27 UTC (rev 11036)
+++ trunk/darwin9.supp 2010-02-04 12:04:14 UTC (rev 11037)
@@ -103,7 +103,7 @@
    macos-Cond-7
    Memcheck:Cond
    fun:__setenv
-   fun:putenv
+   fun:putenv*
 }
 {
|
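The effect of changing `fun:putenv` to `fun:putenv*` in this commit can be illustrated with a small wildcard matcher. This is a sketch only: `toy_wild_match` is a hypothetical helper, not Valgrind's own suppression matcher, and it assumes the usual meaning of `*` (any run of characters, possibly empty) and `?` (exactly one character). The suffixed name `putenv$UNIX2003` in the usage note is an assumed example of a symbol variant; the commit log does not say which names motivated the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not Valgrind's matcher: returns true if 'name'
   matches 'pat', where '*' matches any run of characters (including
   none) and '?' matches exactly one character. */
static bool toy_wild_match(const char *pat, const char *name)
{
   assert(pat != NULL && name != NULL);
   if (*pat == '\0')
      return *name == '\0';
   if (*pat == '*') {
      /* Either '*' matches nothing, or it swallows one more character. */
      return toy_wild_match(pat + 1, name)
          || (*name != '\0' && toy_wild_match(pat, name + 1));
   }
   if (*name != '\0' && (*pat == '?' || *pat == *name))
      return toy_wild_match(pat + 1, name + 1);
   return false;
}
```

With this matcher the old pattern `putenv` matches only the exact name, while `putenv*` also matches a suffixed name such as `putenv$UNIX2003` and still matches plain `putenv`.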
|
From: Konstantin S. <kon...@gm...> - 2010-02-02 13:24:43
|
Hi Julian,

Any luck with this hang?
Anything I can help with?

--kcc

On Thu, Jan 28, 2010 at 10:37 AM, Konstantin Serebryany <
kon...@gm...> wrote:

> Sent a log off list
> With logging on it does not really want to hang.
> Instead (with ~5% probability) it loops forever.
> I think this is the same bug -- the process misses its own death time...
>
> --kcc
>
> On Thu, Jan 28, 2010 at 10:40 AM, Julian Seward <js...@ac...> wrote:
>
>> On Wednesday 27 January 2010, Julian Seward wrote:
>> > On Wednesday 27 January 2010, Konstantin Serebryany wrote:
>> > > I've minimized the problem to a small test (below).
>> > > It spawns many threads and doesn't join them before exiting.
>> > > It will hang (or loop forever) one out of 40-100 runs:
>> > > % g++ -g -lpthread hang.cc
>> > > % for((i=10;i<=99;i++)); do date; time
>> ~/valgrind/trunk/inst/bin/valgrind
>> > > --tool=none --trace-syscalls=yes --trace-signals=yes -q ./a.out 2>
>> > > $i.log ; done
>> >
>> > Ok; managed to reproduce it. 2 threads were still stuck in some syscall
>> > (don't know which yet). Investigating.
>>
>> I can reproduce it, but only in the case where there is no logging,
>> which isn't useful. If you have a logfile where it hangs for
>> --trace-syscalls=yes --trace-signals=yes, can you compress it and
>> send it to me? afaics the log is about 40MB long, but it should
>> bzip2 nicely.
>>
>> J
>>
>
>
|
|
From: Konstantin S. <kon...@gm...> - 2010-02-02 13:15:56
|
I submitted the new barrier annotations, please check.
http://code.google.com/p/data-race-test/source/browse/trunk/dynamic_annotations/dynamic_annotations.h#276

--kcc

On Fri, Jan 29, 2010 at 2:23 PM, Konstantin Serebryany <
kon...@gm...> wrote:

> PTAL (==please take a[nother] look)
> http://codereview.appspot.com/196059/patch/11/1005
>
>
> On Fri, Jan 29, 2010 at 2:12 PM, Bart Van Assche <bva...@ac...> wrote:
>
>> On Fri, Jan 29, 2010 at 12:06 PM, Konstantin Serebryany
>> <kon...@gm...> wrote:
>> >
>> >
>> > On Fri, Jan 29, 2010 at 2:09 PM, Julian Seward <js...@ac...> wrote:
>> >>
>> >> > * Some barrier implementations (e.g. the one in libgomp) allow barrier
>> >> > reinitialization while others (e.g. POSIX threads) do not allow this.
>> >> > If we want threading tools to be able to complain about barrier
>> >> > reinitialization for barrier types for which this is not allowed we
>> >> > need a third argument for ANNOTATE_BARRIER_INIT() that tells the tool
>> >> > whether or not reinitialization is allowed.
>> >>
>> >> Yes, +1 for that.
>> >
>> > So, what would be the code like?
>> > /* Report that the "barrier" has been initialized with initial "count".
>> >    If allow_reinitialization is true, barrier_init() is allowed to be called
>> >    multiple times w/o calling barrier_destroy() */
>> > #define ANNOTATE_BARRIER_INIT(barrier, count, allow_reinitialization)
>> > ?
>>
>> Maybe "reinitialization_allowed" instead of "allow_reinitialization" ?
>>
>> Bart.
>>
>
>
|
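The three-argument form discussed in the quoted thread could be consumed by a tool roughly as follows. This is a hedged sketch, not the contents of dynamic_annotations.h or of any real race detector: `ToyBarrierInfo`, `toy_on_barrier_init`, `toy_on_barrier_destroy` and `TOY_ANNOTATE_BARRIER_INIT` are all hypothetical names. One design choice is assumed here: the tool checks the flag recorded at the previous initialization, on the assumption that a given barrier type always passes the same value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tool-side state for one barrier; not taken from any
   real race detector. */
typedef struct {
   bool     initialized;
   bool     reinitialization_allowed;
   unsigned count;
   unsigned complaints;   /* number of disallowed reinitializations seen */
} ToyBarrierInfo;

/* What a tool might do when it sees the init annotation. */
static void toy_on_barrier_init(ToyBarrierInfo *b, unsigned count,
                                bool reinitialization_allowed)
{
   assert(b != NULL);
   /* Complain only if the barrier is still live and its previous
      initialization did not permit reinitialization. */
   if (b->initialized && !b->reinitialization_allowed)
      b->complaints++;
   b->initialized = true;
   b->count = count;
   b->reinitialization_allowed = reinitialization_allowed;
}

/* Destroying the barrier makes a later init a fresh one. */
static void toy_on_barrier_destroy(ToyBarrierInfo *b)
{
   assert(b != NULL);
   b->initialized = false;
}

/* Three-argument form from the thread, under a hypothetical name. */
#define TOY_ANNOTATE_BARRIER_INIT(barrier, count, reinitialization_allowed) \
   toy_on_barrier_init((barrier), (count), (reinitialization_allowed))
```

A POSIX-style barrier would pass `false` and be reported if initialized twice without a destroy in between; a libgomp-style barrier would pass `true` and never be reported.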
|
From: Kirill B. <bat...@is...> - 2010-02-01 17:17:18
|
Hi all,

while trying to make Valgrind runnable on v5 and v6 ARM processors I've
encountered a bug in set_tls handling which I do not know how to fix properly.

The TLS value may be stored in a CP15 register or at address 0xffff0ff0. The
second case has been handled incorrectly by Valgrind since revision r10973 in
branches/ARM, later merged into trunk (
http://sourceforge.net/mailarchive/message.php?msg_name=20091229170034.6CD73108845%40jail0086.vps.exonetric.net
).

Here is this syscall's code from the 2.6 kernel.

arch/arm/kernel/traps.c:

        case NR(set_tls):
                thread->tp_value = regs->ARM_r0;
#if defined(CONFIG_HAS_TLS_REG)
                asm ("mcr p15, 0, %0, c13, c0, 3" : : "r" (regs->ARM_r0) );
#elif !defined(CONFIG_TLS_REG_EMUL)
                /*
                 * User space must never try to access this directly.
                 * Expect your app to break eventually if you do so.
                 * The user helper at 0xffff0fe0 must be used instead.
                 * (see entry-armv.S for details)
                 */
                *((unsigned int *)0xffff0ff0) = regs->ARM_r0;
#endif
                return 0;

And in arch/arm/kernel/entry-armv.S:

__kuser_get_tls:                                @ 0xffff0fe0

#if !defined(CONFIG_HAS_TLS_REG) && !defined(CONFIG_TLS_REG_EMUL)
        ldr     r0, [pc, #(16 - 8)]             @ TLS stored at 0xffff0ff0
#else
        mrc     p15, 0, r0, c13, c0, 3          @ read TLS register
#endif
        usr_ret lr

The reason for the incorrect behavior is that the set_tls syscall is never passed
to the kernel. It is handled by Valgrind internally. As a result the memory at
address 0xffff0ff0 is never written, ldr r0, [pc, #(16 - 8)] reads an incorrect
value, and the program crashes when it tries to dereference the obtained pointer.

Since writing blindly to 0xffff0ff0 does not seem to be a good idea, this bug
can be fixed by passing the set_tls syscall to the kernel. I created a small patch
doing this but I'm not sure about two things:

1. Should Valgrind be told about the possible memory write in the PRE wrapper
   for set_tls?
2. TLS can also be set in the clone syscall. As I understand it, it is not passed
   to the kernel either. Should the set_tls syscall be passed to the kernel from
   the do_clone function in this case? Or is there a better way of setting TLS
   in this case?

Currently neither of these two things is done. As a result, pth_cancel1 and
pth_cancel2 from Nullgrind's regression tests hang.

Can anybody help me to deal with these two problems?

Thanks,
Kirill.
|
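The failure mode described in this message can be shown with a small model that runs on any host. It is a sketch under stated assumptions, not kernel or Valgrind code: `ToyArmState` and the three functions are hypothetical, and `tls_slot` merely stands in for the word at 0xffff0ff0 that the user helper loads on kernels built without a TLS register. The point is only that a set_tls handled purely inside the tool leaves that word stale, while one that is also passed through updates it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not kernel or Valgrind code. 'tls_slot' stands in for the
   word at 0xffff0ff0 that the user helper reads on kernels built
   without a TLS register. */
typedef struct {
   uint32_t tls_slot;      /* what the kernel helper would return */
   uint32_t tool_shadow;   /* the tool's private copy of the TLS value */
} ToyArmState;

/* set_tls handled entirely inside the tool: the slot is never written. */
static void toy_set_tls_internal(ToyArmState *s, uint32_t value)
{
   assert(s != NULL);
   s->tool_shadow = value;
}

/* set_tls also passed on to the kernel, which updates the slot. */
static void toy_set_tls_passthrough(ToyArmState *s, uint32_t value)
{
   assert(s != NULL);
   s->tool_shadow = value;
   s->tls_slot = value;
}

/* Stand-in for the helper at 0xffff0fe0: it just loads the slot. */
static uint32_t toy_kuser_get_tls(const ToyArmState *s)
{
   assert(s != NULL);
   return s->tls_slot;
}
```

After `toy_set_tls_internal` the helper still returns the old slot contents, which corresponds to the stale pointer the program later dereferences; after `toy_set_tls_passthrough` it returns the new value.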