|
From: Julian S. <js...@ac...> - 2007-12-07 23:58:04
|
> I think it is a problem with the fast comparison.
> I've reproduced the bug,
Coolness. How did you manage that?
J
|
|
From: Nicholas N. <nj...@cs...> - 2007-12-07 23:52:53
|
On Fri, 7 Dec 2007, Julian Seward wrote:
> I had a brief check through m_oset.c, looking for word size and signedness
> issues to do with fast-case comparisons of keys (as is used here), but
> saw nothing suspicious.
I think it is a problem with the fast comparison. I've reproduced the bug,
and when I added an explicit slow comparison function, it behaves correctly.
I'll keep looking...
N
|
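As an illustration only: the sketch below is hypothetical, is not the code in m_oset.c, and is not a confirmed diagnosis of this bug. It shows one way a "fast" key comparison can misorder word-sized keys on a 64-bit host, namely by keeping only the low 32 bits of the difference. The names `cmp_truncated` and `cmp_safe` are invented for the example; the keys used in the note afterwards come from the tree dump quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "fast" comparison: subtract the keys and keep only the
   low 32 bits of the difference. Invented for illustration. */
int cmp_truncated(uint64_t a, uint64_t b)
{
    uint32_t low = (uint32_t)(a - b);   /* the high 32 bits are discarded */
    if (low == 0)
        return 0;
    return (low & 0x80000000u) ? -1 : 1;
}

/* Comparison that never narrows the keys: returns -1, 0 or 1. */
int cmp_safe(uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}
```

With the keys 0x75D0EA0 and 0xFEC7CD70, `cmp_safe` orders the first before the second, while `cmp_truncated` reports the opposite, which would put the new key after both existing nodes. Whether the real fast path fails in this way is an assumption.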
|
From: Julian S. <js...@ac...> - 2007-12-07 18:20:45
|
> I've now made set_sec_vbits8 dump the tree before and after setting
> the line and it looks very simple:
>
> 0xFEC7CD30
> 0xFEC7CD70
> --11681-- setting line 0x75D0EA0
> 0xFEC7CD30
> 0xFEC7CD70
> 0x75D0EA0
>
> So we have a tree with 2 nodes, insert a third and get an unordered tree
> out.
Sheesh. That's very strange, because the AVL stuff has been intensively
hammered on since it was installed, well over a year ago.
Can you send your tree-check routine? I am curious to use it in
mc_expensive_sanity_check to see if it shows the problem happening
but undetected on any other programs.
I had a brief check through m_oset.c, looking for word size and signedness
issues to do with fast-case comparisons of keys (as is used here), but
saw nothing suspicious.
J
|
|
From: Tom H. <to...@co...> - 2007-12-07 15:15:36
|
On Dec 7, 2007 3:01 PM, Tom Hughes <to...@co...> wrote:
> It seems that the problem is that the AVL tree is getting out of
> order. I made get_sec_vbits8 walk the oset when it detects the problem
> and dump the addresses to the log and this is what I get:
>
> 0x28D448D0
> 0x28D44950
> ...
> 0x28E81BF0
> 0x28E8C910
> 0x7F22ECE0
> 0x7F22ED30
> ...
> 0x7F22F930
> 0x7F22F960
> 0xFEDD6D30
> 0xFEDD6D70
> 0x75D0EA0
>
> At that point it stopped as it noticed the address going backwards.
I've now made set_sec_vbits8 dump the tree before and after setting
the line and it looks very simple:
0xFEC7CD30
0xFEC7CD70
--11681-- setting line 0x75D0EA0
0xFEC7CD30
0xFEC7CD70
0x75D0EA0
So we have a tree with 2 nodes, insert a third and get an unordered tree
out.
Somehow VG_(OSetGen_Lookup) still works at that point though...
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
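The ordering check described in this message can be sketched roughly as follows. This is not Tom's routine and does not use the OSet API: it assumes the keys have already been collected into an array by an in-order walk, and the function name `first_out_of_order` is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Given keys in the order an in-order tree walk produced them, return the
   index of the first key that is not strictly greater than its
   predecessor, or n if the whole sequence is strictly ascending. */
size_t first_out_of_order(const uint64_t *keys, size_t n)
{
    assert(keys != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (keys[i] <= keys[i - 1])
            return i;   /* the walk went backwards (or repeated a key) here */
    }
    return n;
}
```

Applied to the three keys dumped after the insert (0xFEC7CD30, 0xFEC7CD70, 0x75D0EA0), the check would flag index 2, the newly inserted key.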
|
From: Tom H. <to...@co...> - 2007-12-07 15:01:29
|
On Dec 7, 2007 2:06 PM, Tom Hughes <to...@co...> wrote:
> On Dec 7, 2007 12:53 AM, Julian Seward <js...@ac...> wrote:
>
> > That means, either:
> >
> > 1. no entry was ever made for "a"
> > (really, for VG_ROUNDDN(a, BYTES_PER_SEC_VBIT_NODE)), or
> >
> > 2. there was an entry, but it has since been deleted, or
> >
> > 3. some other snafu.
> >
> > Let's chase (1) first: in set_sec_vbits8 I'd add
> > VG_(printf)("setting line %p\n", aAligned)
> > let it run, presumably accumulating a large log file. When it borks,
> > have a look in the log file, to see if the aAligned causing the assertion in
> > get_sec_vbits8 was actually entered in the first place. Yell if that
> > don't make sense.
>
> Done that, and it looks like it is being created - first we get this:
>
> --31740-- setting line 0x75D0EA0
>
> and then a bit later this:
>
> Memcheck: mc_main.c:959 (get_sec_vbits8): Assertion 'n' failed.
> Memcheck: get_sec_vbits8: no node for address 0x75D0EA0 (0x75D0EAC)
>
> ==31740== at 0x3801444D: report_and_quit (m_libcassert.c:140)
>
> > Hmm, on rereading previous messages, all of (2) is irrelevant if
> > you disabled gcSecVBitTable and the problem still exists. So
> > it's either 1. or 3. Can you at least try 1. ?
>
> The GC is disabled, so it shouldn't be that..
>
> It's getting odder and odder really.
It seems that the problem is that the AVL tree is getting out of
order. I made get_sec_vbits8 walk the oset when it detects the problem
and dump the addresses to the log and this is what I get:
0x28D448D0
0x28D44950
...
0x28E81BF0
0x28E8C910
0x7F22ECE0
0x7F22ED30
...
0x7F22F930
0x7F22F960
0xFEDD6D30
0xFEDD6D70
0x75D0EA0
At that point it stopped as it noticed the address going backwards.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Christoph B. <bar...@or...> - 2007-12-07 14:15:30
|
On Friday, 7 December 2007, Tom Hughes wrote:
>
> Done that, and it looks like it is being created - first we get this:
>
> --31740-- setting line 0x75D0EA0
>
> and then a bit later this:
>
> Memcheck: mc_main.c:959 (get_sec_vbits8): Assertion 'n' failed.
> Memcheck: get_sec_vbits8: no node for address 0x75D0EA0 (0x75D0EAC)
>
> ==31740== at 0x3801444D: report_and_quit (m_libcassert.c:140)
>
> > Hmm, on rereading previous messages, all of (2) is irrelevant if
> > you disabled gcSecVBitTable and the problem still exists. So
> > it's either 1. or 3. Can you at least try 1. ?
>
> The GC is disabled, so it shouldn't be that..
>
> It's getting odder and odder really.
You could run valgrind in the debugger and set a watchpoint on the node that
holds the address 0x75D0EA0 and on its parent in the AVL tree. At some point
the link from the root node to the node seems to get lost.
This only works if the address is constant between two runs of valgrind. If
this is the case you could also search for the address in the AVL tree after
all modification steps and use binary search over time to find the location
where it gets lost.
Christoph
|
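The "binary search over time" idea can be sketched as below, under the assumptions the message states: the run is deterministic, the node is present after some early modification step, and once lost it stays lost. The names `first_bad_step` and `present_before`, and the callback shape, are invented for this example; in practice the predicate would re-run the program and look the key up after the given step.

```c
#include <assert.h>

/* Predicate type: nonzero if the key is still found after `step`. */
typedef int (*present_fn)(long step, void *ctx);

/* Return the earliest step after which the key is no longer found.
   Precondition: present after `lo`, absent after `hi`. */
long first_bad_step(long lo, long hi, present_fn present_after, void *ctx)
{
    assert(lo < hi);
    while (hi - lo > 1) {
        long mid = lo + (hi - lo) / 2;
        if (present_after(mid, ctx))
            lo = mid;   /* still there: the loss happens later */
        else
            hi = mid;   /* already gone: the loss is at mid or earlier */
    }
    return hi;
}

/* Stand-in predicate for demonstration: the key is present for every
   step below the threshold that ctx points to. */
int present_before(long step, void *ctx)
{
    return step < *(const long *)ctx;
}
```

Each probe costs one full re-run, so the number of runs grows with the logarithm of the number of modification steps rather than linearly.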
|
From: Tom H. <to...@co...> - 2007-12-07 14:06:29
|
On Dec 7, 2007 12:53 AM, Julian Seward <js...@ac...> wrote:
> That means, either:
>
> 1. no entry was ever made for "a"
> (really, for VG_ROUNDDN(a, BYTES_PER_SEC_VBIT_NODE)), or
>
> 2. there was an entry, but it has since been deleted, or
>
> 3. some other snafu.
>
> Let's chase (1) first: in set_sec_vbits8 I'd add
> VG_(printf)("setting line %p\n", aAligned)
> let it run, presumably accumulating a large log file. When it borks,
> have a look in the log file, to see if the aAligned causing the assertion in
> get_sec_vbits8 was actually entered in the first place. Yell if that
> don't make sense.
Done that, and it looks like it is being created - first we get this:
--31740-- setting line 0x75D0EA0
and then a bit later this:
Memcheck: mc_main.c:959 (get_sec_vbits8): Assertion 'n' failed.
Memcheck: get_sec_vbits8: no node for address 0x75D0EA0 (0x75D0EAC)
==31740== at 0x3801444D: report_and_quit (m_libcassert.c:140)
> Hmm, on rereading previous messages, all of (2) is irrelevant if
> you disabled gcSecVBitTable and the problem still exists. So
> it's either 1. or 3. Can you at least try 1. ?
The GC is disabled, so it shouldn't be that..
It's getting odder and odder really.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: Julian S. <js...@ac...> - 2007-12-07 11:40:34
|
On Thursday 06 December 2007 09:53, Konstantin Serebryany wrote:
> I've modified my test (attached, q2.cc), hope it will be helpful :)
> It now has N worker threads. If N >= 2 the race is reported even for GLOB1.
Last night's patch (happens-before-cvhack.patch) also makes this, q2.cc,
run without race warnings.
> I did not find VALGRIND_HG_POST_WAIT anywhere in valgrind nor in the net.
> Is it supposed to be used like this?
>
> #include "helgrind.h"
> ...
> pthread_mutex_lock(&MU);
> while (COND != n_tasks) {
> pthread_cond_wait(&CV, &MU);
> }
> VALGRIND_HG_POST_WAIT(&CV)
> pthread_mutex_unlock(&MU);
What is the behaviour of VALGRIND_HG_POST_WAIT supposed to be?
Are you referring to the VALGRIND_HG_POST_WAIT that is introduced
at the bottom of page 31 of Arndt Muehlenfeld's PhD thesis, or some
other thing?
J
|
|
From: Konstantin S. <kon...@gm...> - 2007-12-07 10:30:43
|
>> Last night's patch (happens-before-cvhack.patch) also makes this, q2.cc,
>> run without race warnings.
Amazing! --happens-before=cvhack does help with this patch!
From the comments it does look 'insanely inefficient', but it's better than
nothing. :)
I'll try other tests with cond vars.
>> Are you referring to the VALGRIND_HG_POST_WAIT that is introduced
>> at the bottom of page 31 of Arndt Muehlenfeld's PhD thesis, or some
>> other thing?
Can you give me a link to Arndt's PhD? I can't find it :(
--kcc
On Dec 7, 2007 12:44 PM, Julian Seward <js...@ac...> wrote:
>
> On Thursday 06 December 2007 09:53, Konstantin Serebryany wrote:
>
> > I've modified my test (attached, q2.cc), hope it will be helpful :)
> > It now has N worker threads. If N >= 2 the race is reported even for GLOB1.
>
> Last night's patch (happens-before-cvhack.patch) also makes this, q2.cc,
> run without race warnings.
>
>
> > I did not find VALGRIND_HG_POST_WAIT anywhere in valgrind nor in the net.
> > Is it supposed to be used like this?
> >
> > #include "helgrind.h"
> > ...
> > pthread_mutex_lock(&MU);
> > while (COND != n_tasks) {
> > pthread_cond_wait(&CV, &MU);
> > }
> > VALGRIND_HG_POST_WAIT(&CV)
> > pthread_mutex_unlock(&MU);
>
> What is the behaviour of VALGRIND_HG_POST_WAIT supposed to be?
>
> Are you referring to the VALGRIND_HG_POST_WAIT that is introduced
> at the bottom of page 31 of Arndt Muehlenfeld's PhD thesis, or some
> other thing?
>
> J
>
|
|
From: Tom H. <th...@cy...> - 2007-12-07 03:59:20
|
Nightly build on alvis ( i686, Red Hat 7.3 ) started at 2007-12-07 03:15:03 GMT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 321 tests, 62 stderr failures, 1 stdout failure, 28 post failures ==
memcheck/tests/addressable (stderr)
memcheck/tests/badjump (stderr)
memcheck/tests/describe-block (stderr)
memcheck/tests/erringfds (stderr)
memcheck/tests/leak-0 (stderr)
memcheck/tests/leak-cycle (stderr)
memcheck/tests/leak-pool-0 (stderr)
memcheck/tests/leak-pool-1 (stderr)
memcheck/tests/leak-pool-2 (stderr)
memcheck/tests/leak-pool-3 (stderr)
memcheck/tests/leak-pool-4 (stderr)
memcheck/tests/leak-pool-5 (stderr)
memcheck/tests/leak-regroot (stderr)
memcheck/tests/leak-tree (stderr)
memcheck/tests/long_namespace_xml (stderr)
memcheck/tests/malloc_free_fill (stderr)
memcheck/tests/match-overrun (stderr)
memcheck/tests/noisy_child (stderr)
memcheck/tests/partial_load_dflt (stderr)
memcheck/tests/partial_load_ok (stderr)
memcheck/tests/partiallydefinedeq (stderr)
memcheck/tests/pointer-trace (stderr)
memcheck/tests/sigkill (stderr)
memcheck/tests/stack_changes (stderr)
memcheck/tests/x86/bug152022 (stderr)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/x86/scalar_supp (stderr)
memcheck/tests/x86/xor-undef-x86 (stderr)
memcheck/tests/xml1 (stderr)
massif/tests/alloc-fns-A (post)
massif/tests/alloc-fns-B (post)
massif/tests/basic (post)
massif/tests/basic2 (post)
massif/tests/big-alloc (post)
massif/tests/culling1 (stderr)
massif/tests/culling2 (stderr)
massif/tests/custom_alloc (post)
massif/tests/deep-A (post)
massif/tests/deep-B (stderr)
massif/tests/deep-B (post)
massif/tests/deep-C (stderr)
massif/tests/deep-C (post)
massif/tests/deep-D (post)
massif/tests/ignoring (post)
massif/tests/insig (post)
massif/tests/long-time (post)
massif/tests/new-cpp (post)
massif/tests/null (post)
massif/tests/one (post)
massif/tests/overloaded-new (post)
massif/tests/peak (post)
massif/tests/peak2 (stderr)
massif/tests/peak2 (post)
massif/tests/realloc (stderr)
massif/tests/realloc (post)
massif/tests/thresholds_0_0 (post)
massif/tests/thresholds_0_10 (post)
massif/tests/thresholds_10_0 (post)
massif/tests/thresholds_10_10 (post)
massif/tests/thresholds_5_0 (post)
massif/tests/thresholds_5_10 (post)
massif/tests/zero1 (post)
massif/tests/zero2 (post)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
helgrind/tests/hg01_all_ok (stderr)
helgrind/tests/hg02_deadlock (stderr)
helgrind/tests/hg03_inherit (stderr)
helgrind/tests/hg04_race (stderr)
helgrind/tests/hg05_race2 (stderr)
helgrind/tests/hg06_readshared (stderr)
helgrind/tests/tc01_simple_race (stderr)
helgrind/tests/tc02_simple_tls (stderr)
helgrind/tests/tc03_re_excl (stderr)
helgrind/tests/tc05_simple_race (stderr)
helgrind/tests/tc06_two_races (stderr)
helgrind/tests/tc07_hbl1 (stderr)
helgrind/tests/tc08_hbl2 (stderr)
helgrind/tests/tc09_bad_unlock (stderr)
helgrind/tests/tc11_XCHG (stderr)
helgrind/tests/tc12_rwl_trivial (stderr)
helgrind/tests/tc14_laog_dinphils (stderr)
helgrind/tests/tc16_byterace (stderr)
helgrind/tests/tc17_sembar (stderr)
helgrind/tests/tc18_semabuse (stderr)
helgrind/tests/tc19_shadowmem (stderr)
helgrind/tests/tc20_verifywrap (stderr)
helgrind/tests/tc21_pthonce (stderr)
helgrind/tests/tc22_exit_w_lock (stderr)
helgrind/tests/tc23_bogus_condwait (stderr)
helgrind/tests/tc24_nonzero_sem (stderr)
|
|
From: Tom H. <th...@cy...> - 2007-12-07 03:35:47
|
Nightly build on lloyd ( x86_64, Fedora 7 ) started at 2007-12-07 03:05:10 GMT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 355 tests, 7 stderr failures, 2 stdout failures, 0 post failures ==
memcheck/tests/malloc_free_fill (stderr)
memcheck/tests/pointer-trace (stderr)
memcheck/tests/vcpu_fnfns (stdout)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/xml1 (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
helgrind/tests/tc20_verifywrap (stderr)
helgrind/tests/tc22_exit_w_lock (stderr)
|
|
From: Tom H. <th...@cy...> - 2007-12-07 03:27:37
|
Nightly build on dellow ( x86_64, Fedora 8 ) started at 2007-12-07 03:10:05 GMT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 355 tests, 10 stderr failures, 3 stdout failures, 0 post failures ==
memcheck/tests/malloc_free_fill (stderr)
memcheck/tests/pointer-trace (stderr)
memcheck/tests/vcpu_fnfns (stdout)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/xml1 (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
none/tests/pth_detached (stdout)
helgrind/tests/tc17_sembar (stderr)
helgrind/tests/tc18_semabuse (stderr)
helgrind/tests/tc20_verifywrap (stderr)
helgrind/tests/tc22_exit_w_lock (stderr)
helgrind/tests/tc23_bogus_condwait (stderr)
|
|
From: Tom H. <th...@cy...> - 2007-12-07 03:14:28
|
Nightly build on gill ( x86_64, Fedora Core 2 ) started at 2007-12-07 03:00:02 GMT
Results unchanged from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 357 tests, 24 stderr failures, 1 stdout failure, 0 post failures ==
memcheck/tests/malloc_free_fill (stderr)
memcheck/tests/pointer-trace (stderr)
memcheck/tests/stack_switch (stderr)
memcheck/tests/x86/scalar (stderr)
memcheck/tests/x86/scalar_supp (stderr)
none/tests/fdleak_fcntl (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
helgrind/tests/hg01_all_ok (stderr)
helgrind/tests/hg02_deadlock (stderr)
helgrind/tests/hg03_inherit (stderr)
helgrind/tests/hg04_race (stderr)
helgrind/tests/hg05_race2 (stderr)
helgrind/tests/tc01_simple_race (stderr)
helgrind/tests/tc05_simple_race (stderr)
helgrind/tests/tc06_two_races (stderr)
helgrind/tests/tc09_bad_unlock (stderr)
helgrind/tests/tc14_laog_dinphils (stderr)
helgrind/tests/tc16_byterace (stderr)
helgrind/tests/tc17_sembar (stderr)
helgrind/tests/tc19_shadowmem (stderr)
helgrind/tests/tc20_verifywrap (stderr)
helgrind/tests/tc21_pthonce (stderr)
helgrind/tests/tc22_exit_w_lock (stderr)
helgrind/tests/tc23_bogus_condwait (stderr)
|
|
From: Nicholas N. <nj...@cs...> - 2007-12-07 03:07:05
|
On Fri, 7 Dec 2007, Julian Seward wrote:
> 1. no entry was ever made for "a"
> (really, for VG_ROUNDDN(a, BYTES_PER_SEC_VBIT_NODE)), or
>
> 2. there was an entry, but it has since been deleted, or
>
> 3. some other snafu.
>
> Hmm, on rereading previous messages, all of (2) is irrelevant if
> you disabled gcSecVBitTable and the problem still exists. So
> it's either 1. or 3. Can you at least try 1. ?
Hopefully it's 1. If it's 3, then the secVbits set/get calls match up,
which (I think) means the secVBits table is getting corrupted somehow. Or
maybe the set/get calls themselves are buggy?
Nick
|
|
From: Julian S. <js...@ac...> - 2007-12-07 02:04:58
|
On Wednesday 05 December 2007 18:10, Christoph Bartoschek wrote:
> On Wednesday, 5 December 2007, Julian Seward wrote:
> > On Wednesday 05 December 2007 16:32, Christoph Bartoschek wrote:
> > > The read of COND in the parent thread happens in a segment that is
> > > after the accesses that established the shared-modified state in the
> > > happens-before graph.
> > >
> > > Given that COND should not trigger an error.
> >
> > But why should we assume that ownership of COND should change from
> > shared-modified to exclusively-owned-by-parent at the point of the
> > signal/wait pair?
Ok. Try the attached patch; it implements this change. Once patched,
you need the flag --happens-before=cvhack, else Helgrind behaves
exactly as before.
The transition it does, at the event
_VG_USERREQ__HG_PTHREAD_COND_WAIT_POST, is: signalling thread gives up
ownership of shared locations, giving them to the waiting thread instead.
If a location was previously accessed only by the waiting thread and the
signalling thread, then the waiting thread acquires exclusive ownership.
It makes H shut up with Konstantin's original test case cv.cc. I
did not test anything else.
Note this is an insanely inefficient implementation -- this patch is
just to find out if the idea is useful.
J
|
|
From: <js...@ac...> - 2007-12-07 01:23:27
|
Nightly build on g5 ( SuSE 10.1, ppc970 ) started at 2007-12-07 02:00:01 CET
Results differ from 24 hours ago
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 288 tests, 27 stderr failures, 2 stdout failures, 0 post failures ==
memcheck/tests/deep_templates (stdout)
memcheck/tests/leak-cycle (stderr)
memcheck/tests/leak-tree (stderr)
memcheck/tests/malloc_free_fill (stderr)
memcheck/tests/pointer-trace (stderr)
none/tests/faultstatus (stderr)
none/tests/fdleak_cmsg (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
helgrind/tests/hg02_deadlock (stderr)
helgrind/tests/hg03_inherit (stderr)
helgrind/tests/hg04_race (stderr)
helgrind/tests/hg05_race2 (stderr)
helgrind/tests/tc01_simple_race (stderr)
helgrind/tests/tc05_simple_race (stderr)
helgrind/tests/tc06_two_races (stderr)
helgrind/tests/tc07_hbl1 (stderr)
helgrind/tests/tc08_hbl2 (stderr)
helgrind/tests/tc09_bad_unlock (stderr)
helgrind/tests/tc11_XCHG (stderr)
helgrind/tests/tc14_laog_dinphils (stderr)
helgrind/tests/tc16_byterace (stderr)
helgrind/tests/tc17_sembar (stderr)
helgrind/tests/tc19_shadowmem (stderr)
helgrind/tests/tc20_verifywrap (stderr)
helgrind/tests/tc21_pthonce (stderr)
helgrind/tests/tc22_exit_w_lock (stderr)
helgrind/tests/tc23_bogus_condwait (stderr)
helgrind/tests/tc24_nonzero_sem (stderr)
=================================================
== Results from 24 hours ago ==
=================================================
Checking out valgrind source tree ... done
Configuring valgrind ... done
Building valgrind ... done
Running regression tests ... failed
Regression test results follow
== 288 tests, 27 stderr failures, 4 stdout failures, 0 post failures ==
memcheck/tests/deep_templates (stdout)
memcheck/tests/leak-cycle (stderr)
memcheck/tests/leak-tree (stderr)
memcheck/tests/malloc_free_fill (stderr)
memcheck/tests/pointer-trace (stderr)
none/tests/faultstatus (stderr)
none/tests/fdleak_cmsg (stderr)
none/tests/mremap (stderr)
none/tests/mremap2 (stdout)
none/tests/res_search (stdout)
helgrind/tests/hg02_deadlock (stderr)
helgrind/tests/hg03_inherit (stderr)
helgrind/tests/hg04_race (stderr)
helgrind/tests/hg05_race2 (stderr)
helgrind/tests/tc01_simple_race (stderr)
helgrind/tests/tc05_simple_race (stderr)
helgrind/tests/tc06_two_races (stderr)
helgrind/tests/tc07_hbl1 (stderr)
helgrind/tests/tc08_hbl2 (stdout)
helgrind/tests/tc08_hbl2 (stderr)
helgrind/tests/tc09_bad_unlock (stderr)
helgrind/tests/tc11_XCHG (stderr)
helgrind/tests/tc14_laog_dinphils (stderr)
helgrind/tests/tc16_byterace (stderr)
helgrind/tests/tc17_sembar (stderr)
helgrind/tests/tc19_shadowmem (stderr)
helgrind/tests/tc20_verifywrap (stderr)
helgrind/tests/tc21_pthonce (stderr)
helgrind/tests/tc22_exit_w_lock (stderr)
helgrind/tests/tc23_bogus_condwait (stderr)
helgrind/tests/tc24_nonzero_sem (stderr)
=================================================
== Difference between 24 hours ago and now ==
=================================================
*** old.short Fri Dec 7 02:12:26 2007
--- new.short Fri Dec 7 02:23:25 2007
***************
*** 8,10 ****
! == 288 tests, 27 stderr failures, 4 stdout failures, 0 post failures ==
  memcheck/tests/deep_templates (stdout)
--- 8,10 ----
! == 288 tests, 27 stderr failures, 2 stdout failures, 0 post failures ==
  memcheck/tests/deep_templates (stdout)
***************
*** 18,20 ****
  none/tests/mremap2 (stdout)
- none/tests/res_search (stdout)
  helgrind/tests/hg02_deadlock (stderr)
--- 18,19 ----
***************
*** 27,29 ****
  helgrind/tests/tc07_hbl1 (stderr)
- helgrind/tests/tc08_hbl2 (stdout)
  helgrind/tests/tc08_hbl2 (stderr)
--- 26,27 ----
|
|
From: Julian S. <js...@ac...> - 2007-12-07 00:54:39
|
Here's some background. Apologies if you know this already.
The "secondary V bits table" holds V (definedness) bits for selected
few parts of the process' address space. Just the parts of the
address space where bytes are partially defined, that is, neither
completely undefined nor completely defined. There are relatively
few of these.
The table (secVBitTable) is actually an OSet, essentially an AVL
tree which maps guest addresses to the V bits for that address.
Because it would be rather wasteful of space to have one tree node
for each partially defined byte in the address space, instead
each node contains the definedness data for BYTES_PER_SEC_VBIT_NODE
(16) bytes at a time. Accordingly the associated OSet key is
rounded down to the nearest 16 byte boundary.
Memcheck is bombing in "get_sec_vbits8(Addr a)" because, following
consultation of other data structures, it has determined that the
byte at "a" is partially defined, so it needs to look up in
said table, its exact definedness info. Problem is there is no
entry in the table.
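The key rounding described above can be sketched as follows. This is a minimal stand-alone illustration, not the Memcheck source: `sec_vbit_key` and `sec_vbit_offset` are names invented here, and the mask stands in for what VG_ROUNDDN does with a power-of-two alignment. The node size of 16 is the value given in this message.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_SEC_VBIT_NODE 16   /* value stated in the message above */

/* Round an address down to the start of its 16-byte chunk; this is the
   key under which the chunk's node would be stored. */
uintptr_t sec_vbit_key(uintptr_t a)
{
    return a & ~(uintptr_t)(BYTES_PER_SEC_VBIT_NODE - 1);
}

/* Position of the byte inside its chunk (0..15). */
unsigned sec_vbit_offset(uintptr_t a)
{
    return (unsigned)(a % BYTES_PER_SEC_VBIT_NODE);
}
```

For the failing address reported in this thread, 0x75D0EAC, this gives key 0x75D0EA0 and offset 12, matching the "no node for address 0x75D0EA0 (0x75D0EAC)" message.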
That means, either:
1. no entry was ever made for "a"
(really, for VG_ROUNDDN(a, BYTES_PER_SEC_VBIT_NODE)), or
2. there was an entry, but it has since been deleted, or
3. some other snafu.
Let's chase (1) first: in set_sec_vbits8 I'd add
VG_(printf)("setting line %p\n", aAligned)
let it run, presumably accumulating a large log file. When it borks,
have a look in the log file, to see if the aAligned causing the assertion in
get_sec_vbits8 was actually entered in the first place. Yell if that
don't make sense.
If that looks OK (iow, there is at least one corresponding log file
entry), let's consider 2. Periodically gcSecVBitTable() walks over
said AVL tree. If all 16 bytes in a given chunk are completely
defined or completely undefined, then the chunk is redundant, and
can be deleted from the tree. There are two complications, though:
(a) we don't want to be chucking lines out of the tree too
enthusiastically, for performance reasons. So a line has to
have no part-defined bytes for MAX_STALE_AGE consecutive
checks before it gets dumped.
(b) we can't delete nodes from the tree at the same time we're
iterating over it (using VG_(OSet_Next)). So instead, the
survivor lines are copied into a new tree (OSet) and the old one
is nuked afterwards.
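A rough sketch of the collection pass described in (a) and (b), using plain arrays in place of the OSet. Everything here is an assumption made for illustration: the `Node` layout, the `stale_checks` counter, the value chosen for MAX_STALE_AGE, and the encoding in which a byte's V bits are 0x00 when fully defined and 0xFF when fully undefined. The real gcSecVBitTable may track staleness differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BYTES_PER_NODE 16
#define MAX_STALE_AGE  2      /* assumed value, for the sketch only */

typedef struct {
    uintptr_t a;                       /* chunk start address (the key) */
    uint8_t   vbits8[BYTES_PER_NODE];  /* V bits for each byte in the chunk */
    unsigned  stale_checks;            /* consecutive passes with no part-defined byte */
} Node;

/* A byte is part-defined if its V bits are neither all 0 nor all 1. */
static int has_partial_byte(const Node *n)
{
    for (size_t i = 0; i < BYTES_PER_NODE; i++)
        if (n->vbits8[i] != 0x00 && n->vbits8[i] != 0xFF)
            return 1;
    return 0;
}

/* One GC pass: update staleness in old[], copy the survivors into
   fresh[] (which must have room for n_old nodes), return how many
   were kept. The caller then discards old[]. */
size_t gc_copy_survivors(Node *old, size_t n_old, Node *fresh)
{
    size_t kept = 0;
    for (size_t i = 0; i < n_old; i++) {
        Node *n = &old[i];
        n->stale_checks = has_partial_byte(n) ? 0 : n->stale_checks + 1;
        if (n->stale_checks < MAX_STALE_AGE)
            fresh[kept++] = *n;        /* keep: copy into the new table */
    }
    return kept;
}
```

Copying into a fresh array mirrors point (b): nothing is removed from the structure being walked.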
So anyway, you see "if ( keep )" at line 918. In the case (!keep),
add a printf to show "n->a" of the line being dumped and see if
any dumped line matches the missing one causing the assertion failure.
Hmm, on rereading previous messages, all of (2) is irrelevant if
you disabled gcSecVBitTable and the problem still exists. So
it's either 1. or 3. Can you at least try 1. ?
J
On Thursday 06 December 2007 22:43, Nicholas Nethercote wrote:
> On Thu, 6 Dec 2007, Tom Hughes wrote:
> >>> Memcheck: mc_main.c:957 (get_sec_vbits8): Assertion 'n' failed.
> >>> Memcheck: get_sec_vbits8: no node for address 0x6FA9EA0 (0x6FA9EAC)
> >>
> >> It's a problem with the secondary V bits table in Memcheck. That table
> >> holds the full V bits for all memory bytes that are partially defined.
> >> It's happened a couple of times, but always in situations that are
> >> impossible for me to reproduce. If you are able to reduce it to a small
> >> test, or are able to do any debugging yourself, that would be very
> >> helpful.
> >
> > It is 100% repeatable for me but, interestingly, only on my machine at
> > home. My machine at work doesn't have the same problem. Both are
> > x86_64 machines with two cores and 4Gb of memory and both are running
> > Fedora 8!
> > [...]
> > Any suggestions for the best way to debug it?
>
> The relevant code starts with this line, around line 838:
>
> /* --------------- Secondary V bit table ------------ */
>
> It's a fairly basic data structure, the only notable thing is that we
> periodically garbage collect (GC) it, ie. remove stale nodes. The easy
> first thing to try is to turn off the GC, ie. make gcSecVBitTable() do
> nothing. If that makes the problem go away, then we know that the GC is
> removing nodes it shouldn't.
>
> It might also be useful if you can run with -v. The "memcheck GC" lines
> indicate when GCs are happening.
>
> Nick
>
|
|
From: Julian S. <js...@ac...> - 2007-12-07 00:05:15
|
On Friday 07 December 2007 00:05, Rich Coe wrote:
> I was running the regtest for RC2 when it stopped running at nanotest2.
You mean nanoleak2 ?
What distro and architecture is this with?
J
> Looking at the stderr output, I see valgrind prompting for input.
> Here is the first prompt:
>
> ==17779== Conditional jump or move depends on uninitialised value(s)
> ==17779== at 0x4016701: strlen (in /lib/ld-2.5.so)
> ==17779== by 0x4007DDC: _dl_init_paths (in /lib/ld-2.5.so)
> ==17779== by 0x40033CE: dl_main (in /lib/ld-2.5.so)
> ==17779== by 0x4014A05: _dl_sysdep_start (in /lib/ld-2.5.so)
> ==17779== by 0x4000C2F: _dl_start (in /lib/ld-2.5.so)
> ==17779== by 0x4000816: (within /lib/ld-2.5.so)
> ==17779==
> ==17779== ---- Print suppression ? --- [Return/N/n/Y/y/C/c] ----
>
>
> On Thu, 6 Dec 2007 16:40:29 +0100
> > Julian Seward <js...@ac...> wrote:
> > > A release candidate for Valgrind 3.3.0 (3.3.0.RC1) is available for
> > > testing from [...]
> >
> > Various people downloaded and tried RC1; thanks for the feedback.
> > I have put a second RC at
> >
> > http://www.valgrind.org/downloads/valgrind-3.3.0.RC2.tar.bz2
> > (MD5 = 735819e3e8d774861326a51f991e13e5).
> >
> > Unless any critical breakage shows up, I plan to ship this as
> > 3.3.0 final at the weekend.
> >
> > J
|