From: Julian S. <js...@ac...> - 2002-10-22 13:24:14
|
Ok, it was just a sanity check. I'm assuming you've got this all
figured out and will clean up and/or nuke get_current_tid_1_if_root
as you see fit.

J

On Tuesday 22 October 2002 8:35 am, Jeremy Fitzhardinge wrote:
> On Mon, 2002-10-21 at 21:45, Julian Seward wrote:
> Nick: I spotted the infamous (?) VG_(get_current_tid_1_if_root)
> and specifically this:
>
> if (0 == vg_tid_currently_in_baseBlock)
> return 1; /* root thread */
>
> What's the meaning of the 0 here? Where is it set? AFAICS the only
> valid values of vg_tid_currently_in_baseBlock are either 1 ..
> VG_N_THREADS or VG_INVALID_THREADID, so I guess I'm missing something here?
>
> No, I don't think so. I don't think that function means anything at
> all. I was very close to removing it altogether.
>
> I don't feel like I'm seeing a consistent story I'm happy with re
> vg_tid_currently_in_baseBlock -- can you two clarify?
>
> I put the stronger assert into save_thread_state() because I couldn't
> see any good reason why the tid argument should ever mismatch the value
> of vg_tid_currently_in_baseBlock. The assertion failed on the first
> thread create, but I think my fix makes sense. Thread creation copies
> the parent thread's context into the child via the baseBlock. In a
> sense the ownership of the baseBlock changes during the copy, which is
> what the assignment to vg_tid_currently_in_baseBlock indicates:
>
> VG_(load_thread_state)(parent_tid); /* load parent thread state into baseBlock */
> + vg_tid_currently_in_baseBlock = tid; /* give ownership of baseBlock to child */
> VG_(save_thread_state)(tid); /* save new state into new thread */
>
> It seems like, with that change, it will be impossible to
> know from the value of vg_tid_currently_in_baseBlock whether or
> not baseBlock is "full".
>
> No, because save_thread_state() assigns vg_tid_currently_in_baseBlock
> with VG_INVALID_THREADID, so you know the baseBlock contains nothing
> afterwards. It gets reassigned with a valid tid on load_thread_state().
>
> J
|
|
From: Nicholas N. <nj...@ca...> - 2002-10-22 08:44:43
|
On 22 Oct 2002, Jeremy Fitzhardinge wrote:
> And it's not like it was completely broken. Only with multithreaded
> programs which use syscalls or allocate memory. Apart from that it was
> fine :-)

Well, 'date' is my standard test program... ;)

N
|
|
From: Jeremy F. <je...@go...> - 2002-10-22 08:38:55
|
On Tue, 2002-10-22 at 01:15, Nicholas Nethercote wrote:
> I was probably missing something when I wrote it. It always felt nasty to
> me but I thought it was necessary. IIRC VG_(get_current_tid) returns 0 in
> some circumstances, I can't remember and/or never understood why. Because
> thread #0 is some special reserved thread or something, and thread #1 is
> meant to be the root thread, which I thought corresponded to the only
> thread for non-threaded programs. But that may be totally bogus. Maybe
> the #0 being returned was related to the system call wrong-tid bug, I
> don't know.

I think so. I'll work out its proper fate tomorrow. The whole "who's
running now?" question is quite subtle in Valgrind (like any OS).

> Jeremy, by now you understand Helgrind much better than I, if you think it
> should go then please kill it! You must be getting a fairly good idea of
> how little time I spent testing Helgrind. I just hope that it was quicker
> for you to fix the existing code rather than to rewrite it from scratch.

Well, I'm not saying I won't have touched every line by the time I'm
done, but it has certainly been a lot easier fixing bugs than starting
from scratch. And it's not like it was completely broken. Only with
multithreaded programs which use syscalls or allocate memory. Apart
from that it was fine :-)

J
|
|
From: Nicholas N. <nj...@ca...> - 2002-10-22 08:15:48
|
On Tue, 22 Oct 2002, Julian Seward wrote:
> Merging more stuff in. I just did 08-skin-clientreq.

I am a heinous slacker. Thanks.

> Nick: I spotted the infamous (?) VG_(get_current_tid_1_if_root)
> and specifically this:
>
> if (0 == vg_tid_currently_in_baseBlock)
> return 1; /* root thread */
>
> What's the meaning of the 0 here? Where is it set? AFAICS the only
> valid values of vg_tid_currently_in_baseBlock are either 1 .. VG_N_THREADS
> or VG_INVALID_THREADID, so I guess I'm missing something here?

I was probably missing something when I wrote it. It always felt nasty to
me but I thought it was necessary. IIRC VG_(get_current_tid) returns 0 in
some circumstances, I can't remember and/or never understood why. Because
thread #0 is some special reserved thread or something, and thread #1 is
meant to be the root thread, which I thought corresponded to the only
thread for non-threaded programs. But that may be totally bogus. Maybe
the #0 being returned was related to the system call wrong-tid bug, I
don't know.

Jeremy, by now you understand Helgrind much better than I, if you think it
should go then please kill it! You must be getting a fairly good idea of
how little time I spent testing Helgrind. I just hope that it was quicker
for you to fix the existing code rather than to rewrite it from scratch.

N
|
|
From: Jeremy F. <je...@go...> - 2002-10-22 07:44:09
|
On Mon, 2002-10-21 at 22:26, Julian Seward wrote:
> Ok, I've merged in the following:
>
> 08-skin-clientreq
> 13-track-condvar-mutex (partially; see previous msg)
> 14-sprintf
> 17-hg-generic-mutex
> 18-hg-err-reporting
> 19-hg-context
> 20-hg-secmap
> 21-hg-dupwrite

Great!

> Seems to run a lot more quietly now on pth_threadpool, and on mozilla
> too.

I still get errors out of stdio with pth_threadpool; I haven't tried mozilla.

> All the complaints about dl_num_relocations have disappeared;
> any idea how/why? (am not complaining; just curious).

18-hg-err-reporting adds a change which puts memory locations which have
had an error reported into an error state (Vg_Excl, with the magic tid
of TID_INDICATING_ALL; the intended meaning is "exclusively owned by
everyone"). It should report everything at least once.

J
|
|
From: Jeremy F. <je...@go...> - 2002-10-22 07:35:37
|
On Mon, 2002-10-21 at 21:45, Julian Seward wrote:
> Nick: I spotted the infamous (?) VG_(get_current_tid_1_if_root)
> and specifically this:
>
> if (0 == vg_tid_currently_in_baseBlock)
> return 1; /* root thread */
>
> What's the meaning of the 0 here? Where is it set? AFAICS the only
> valid values of vg_tid_currently_in_baseBlock are either 1 .. VG_N_THREADS
> or VG_INVALID_THREADID, so I guess I'm missing something here?

No, I don't think so. I don't think that function means anything at
all. I was very close to removing it altogether.

> I don't feel like I'm seeing a consistent story I'm happy with re
> vg_tid_currently_in_baseBlock -- can you two clarify?

I put the stronger assert into save_thread_state() because I couldn't
see any good reason why the tid argument should ever mismatch the value
of vg_tid_currently_in_baseBlock. The assertion failed on the first
thread create, but I think my fix makes sense. Thread creation copies
the parent thread's context into the child via the baseBlock. In a
sense the ownership of the baseBlock changes during the copy, which is
what the assignment to vg_tid_currently_in_baseBlock indicates:

VG_(load_thread_state)(parent_tid); /* load parent thread state into baseBlock */
+ vg_tid_currently_in_baseBlock = tid; /* give ownership of baseBlock to child */
VG_(save_thread_state)(tid); /* save new state into new thread */

> It seems like, with that change, it will be impossible to
> know from the value of vg_tid_currently_in_baseBlock whether or
> not baseBlock is "full".

No, because save_thread_state() assigns vg_tid_currently_in_baseBlock
with VG_INVALID_THREADID, so you know the baseBlock contains nothing
afterwards. It gets reassigned with a valid tid on load_thread_state().

J
|
|
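The ownership rule described in the message above (load marks baseBlock as
owned by a tid, save hands it back and marks it empty, and thread creation
transfers ownership in between) can be sketched as a toy model. This is only
an illustration, not the Valgrind source: the ThreadState layout, the array
sizes, the value chosen for VG_INVALID_THREADID, the assertion inside
load_thread_state() and the helper name copy_state_to_child() are all
assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical stand-ins; the real definitions live in Valgrind's core. */
#define VG_N_THREADS        8
#define VG_INVALID_THREADID 0u   /* placeholder value, an assumption */

typedef unsigned int ThreadId;
typedef struct { int regs[4]; } ThreadState;   /* toy CPU state */

static ThreadState threads[VG_N_THREADS + 1];  /* tids 1 .. VG_N_THREADS */
static ThreadState baseBlock;                  /* the single live copy */
static ThreadId tid_in_baseBlock = VG_INVALID_THREADID;

/* Copy a thread's saved state into baseBlock and record the owner.
   Assumed precondition: baseBlock is currently empty. */
void load_thread_state(ThreadId tid)
{
    assert(tid >= 1 && tid <= VG_N_THREADS);
    assert(tid_in_baseBlock == VG_INVALID_THREADID);
    baseBlock = threads[tid];
    tid_in_baseBlock = tid;
}

/* Copy baseBlock back to its owner and mark baseBlock empty.
   The assertion mirrors the stronger check discussed in the thread. */
void save_thread_state(ThreadId tid)
{
    assert(tid_in_baseBlock == tid);
    threads[tid] = baseBlock;
    tid_in_baseBlock = VG_INVALID_THREADID;
}

/* Clone the parent's state into the child by way of baseBlock. */
void copy_state_to_child(ThreadId parent, ThreadId child)
{
    load_thread_state(parent);
    tid_in_baseBlock = child;   /* hand ownership of baseBlock to the child */
    save_thread_state(child);
}
```

Without the ownership hand-over in copy_state_to_child(), the assertion in
save_thread_state() would fire on the first thread creation, which matches
the failure described above. After the call, tid_in_baseBlock is back to the
invalid value, so "is baseBlock full?" can still be answered from it.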
From: Julian S. <js...@ac...> - 2002-10-22 05:20:26
|
Ok, I've merged in the following:

08-skin-clientreq
13-track-condvar-mutex (partially; see previous msg)
14-sprintf
17-hg-generic-mutex
18-hg-err-reporting
19-hg-context
20-hg-secmap
21-hg-dupwrite

Seems to run a lot more quietly now on pth_threadpool, and on mozilla
too.

All the complaints about dl_num_relocations have disappeared;
any idea how/why? (am not complaining; just curious).

J
|
|
From: Julian S. <js...@ac...> - 2002-10-22 04:39:26
|
[Nick -- q for you below]
Merging more stuff in. I just did 08-skin-clientreq.
Also partially did 13-track-condvar-mutex. I agree, the tracking
was wrong. However, I couldn't see the purpose of the part of
the patch dealing with the value of vg_tid_currently_in_baseBlock
when baseBlock is empty and so didn't merge that bit (shown
below). It seems like, with that change, it will be impossible to
know from the value of vg_tid_currently_in_baseBlock whether or
not baseBlock is "full".
Nick: I spotted the infamous (?) VG_(get_current_tid_1_if_root)
and specifically this:
if (0 == vg_tid_currently_in_baseBlock)
return 1; /* root thread */
What's the meaning of the 0 here? Where is it set? AFAICS the only
valid values of vg_tid_currently_in_baseBlock are either 1 .. VG_N_THREADS
or VG_INVALID_THREADID, so I guess I'm missing something here?
I don't feel like I'm seeing a consistent story I'm happy with re
vg_tid_currently_in_baseBlock -- can you two clarify?
J
@@ -468,7 +468,7 @@ void VG_(save_thread_state) ( ThreadId t
Int i;
const UInt junk = 0xDEADBEEF;
- vg_assert(vg_tid_currently_in_baseBlock != VG_INVALID_THREADID);
+ vg_assert(vg_tid_currently_in_baseBlock == tid);
/* We don't copy out the LDT entry, because it can never be changed
@@ -2168,6 +2168,7 @@ void do__apply_in_new_thread ( ThreadId
/* Copy the parent's CPU state into the child's, in a roundabout
way (via baseBlock). */
VG_(load_thread_state)(parent_tid);
+ vg_tid_currently_in_baseBlock = tid;
VG_(save_thread_state)(tid);
/* Consider allocating the child a stack, if the one it already has
|
|
From: Jeremy F. <je...@go...> - 2002-10-21 21:05:07
|
On Mon, 2002-10-21 at 12:57, Nicholas Nethercote wrote:
> As I remember it, the root thread is #1, and if there's no tid in the base
> block then it must be the root thread. This could be wrong.

Yes, I don't think that assumption is good. If there's no thread in the
baseBlock, then there's no current (virtual machine) thread running at
all.

J
|
|
From: Nicholas N. <nj...@ca...> - 2002-10-21 19:57:14
|
On 21 Oct 2002, Jeremy Fitzhardinge wrote:
> It was pretty subtle, but easy to fix once found. I'm still wondering
> what get_current_tid_1_if_root() is for. It seems that
> get_current_tid() can either return a tid, or "no tid" in the case where
> there's nothing loaded into the base block. But I don't really
> understand the rationale for returning what seems like a random tid if
> there's no correct answer. I was thinking of getting rid of it
> altogether, but it is used in a couple of other places.

As I remember it, the root thread is #1, and if there's no tid in the base
block then it must be the root thread. This could be wrong.

N
|
|
From: Julian S. <js...@ac...> - 2002-10-21 18:03:40
|
On Monday 21 October 2002 4:35 pm, Josef Weidendorfer wrote:
> [This is about a problem in the DWARF2 line info reader (this is CCed to
> the valgrind developers list)]

You sure? I don't see it in the Cc: field of this msg.

Josef,

You are much more expert about DWARF2 than I am (you debugged it recently,
I have never messed with it and I didn't write it; Daniel Berlin (IIRC)
provided it originally). Therefore, you are much better able to make a
good decision about this than I am -- so please do so!

I would only say that if we have to make a decision between good DWARF2
support for the GNU compilers and good support for the Intel compiler, we
should of course choose in favour of the GNU compilers. Ideally we can
give good support for both, but maybe that is not possible?

Understand that valgrind is now pretty much too large for any one person
to fully understand (well, it's too big for me really), so if you become
the "center of expertise" for DWARF2 stuff and whatever else, that's fine
by me! I just understand the core JIT, the threading library and generally
keeping it working.

J

> Hi Arnaud,
>
> I created a similar C file, and put the ".debug_line" section from your
> assembler into the assembler of my small C prog to get a runnable program
> for valgrind, enabling debug output for the DWARF2 line info reader...
>
> It seems to me that the Intel Fortran compiler generates bad DWARF2 line
> info code: It sets "is_stmt" of the state machine in the line info
> reader to be always false. Thus, there is never a statement boundary
> generated and therefore never an instruction range/line number mapping
> generated for valgrind.
>
> Please have a look at the DWARF2 specification, Ch. 6.2
> (x86.ddj.com/ftp/manuals/tools/dwarf.pdf).
> Perhaps I understand this wrong, but I don't think so.
>
> I just had a look at the GDB DWARF2 reader...
> They completely ignore "is_stmt" when recording line info ;-)
> That's the reason "objdump -S" works on files from the Intel Fortran
> compiler.
>
> Nevertheless, I think you should write a bug report to Intel regarding
> this. I would be very interested in the response.
>
> (this is CCed to the valgrind developers list:)
> Julian, should we ignore the setting of "is_stmt", too?
>
> This would mean to always override the "default_is_stmt" with "true" in the
> line info debug header: In vg_symtab2.c, search the line
>
>     info.li_default_is_stmt = * ((UChar *)(external->li_default_is_stmt));
>
> and replace it with
>
>     info.li_default_is_stmt = true;
>
> Josef
>
> On Monday 21 October 2002 15:15, you wrote:
> > Hi Josef,
> >
> > I noticed your patch to vg_symtab2.c which is very useful as valgrind
> > was fairly unusable with recent versions of gcc. Thanks for that !
> > A friend of mine is using the Intel Fortran compiler ifc 6.0 and mailed
> > me a while ago that valgrind couldn't get the line number in some
> > situations. I tried valgrind 1.0.4 this morning and it turns out that,
> > while a vast improvement with gcc, the line number is not read anymore
> > for executables generated by ifc.
> >
> > To summarize the situation related to line number:
> >
> >                   g77 3.2 and later   ifc 6.0
> > valgrind 1.0.3    No                  Sometimes
> > valgrind 1.0.4    Yes !               No
> >
> > I realised that ifc is not very easily available (although it can be
> > used freely for non commercial purpose
> > http://www.intel.com/software/products/compilers/f60l/noncom.htm).
> >
> > However, you may know why it goes wrong:
> >
> > >cat tt1.f
> >
> > program tt1
> > integer i, j
> > if (i .gt. 1) j = 1 ! i not set
> > end
> >
> > >ifc -g -O0 tt1.f
> >
> > program TT1
> >
> > 4 Lines Compiled
> >
> > >valgrind ./a.out
> >
> > ==5376== valgrind-1.0.4, a memory error detector for x86 GNU/Linux.
> > ==5376== Copyright (C) 2000-2002, and GNU GPL'd, by Julian Seward.
> > ==5376== Estimated CPU clock rate is 801 MHz
> > ==5376== For more details, rerun with: -v
> > ==5376==
> > ==5376== Conditional jump or move depends on uninitialised value(s)
> > ==5376== at 0x804A0FA: main (in /miami/academic/arnaud/tmp/tmp/a.out)
> > ==5376== by 0x402F96F7: __libc_start_main (../sysdeps/generic/libc-start.c:129)
> > ==5376== by 0x8049FA1: __fxstat64@@GLIBC_2.2 (in /miami/academic/arnaud/tmp/tmp/a.out)
> > ==5376==
> > ==5376== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
> > ==5376== malloc/free: in use at exit: 0 bytes in 0 blocks.
> > ==5376== malloc/free: 1 allocs, 1 frees, 32 bytes allocated.
> > ==5376== For a detailed leak analysis, rerun with: --leak-check=yes
> > ==5376== For counts of detected errors, rerun with: -v
> >
> > >~/valgrind-1.0.3/bin/valgrind ./a.out
> >
> > ==5402== valgrind-1.0.3, a memory error detector for x86 GNU/Linux.
> > ==5402== Copyright (C) 2000-2002, and GNU GPL'd, by Julian Seward.
> > ==5402== Estimated CPU clock rate is 802 MHz
> > ==5402== For more details, rerun with: -v
> > ==5402==
> > ==5402== Conditional jump or move depends on uninitialised value(s)
> > ==5402== at 0x804A0FA: main (tt1.f:3)
> > ==5402== by 0x402FC6F7: __libc_start_main (../sysdeps/generic/libc-start.c:129)
> > ==5402== by 0x8049FA1: __fxstat64@@GLIBC_2.2 (in /miami/academic/arnaud/tmp/tmp/a.out)
> > ==5402==
> > ==5402== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
> > ==5402== malloc/free: in use at exit: 0 bytes in 0 blocks.
> > ==5402== malloc/free: 1 allocs, 1 frees, 32 bytes allocated.
> > ==5402== For a detailed leak analysis, rerun with: --leak-check=yes
> > ==5402== For counts of detected errors, rerun with: -v
> >
> > I attach the assembler generated by "ifc -g -O0 -S".
> >
> > Best regards,
> > Arnaud
|
|
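The effect Josef describes (no address/line mappings when every row has
is_stmt false, versus a reader that ignores the flag as GDB is said to do)
can be illustrated with a toy model. This is a sketch only: LineRow,
sample_rows and count_line_mappings() are hypothetical names, the rows are
made-up data, and nothing here is taken from vg_symtab2.c or from a real
DWARF2 decoder.

```c
#include <assert.h>
#include <stddef.h>

/* One row as a line-number state machine might emit it (toy version). */
typedef struct {
    unsigned long addr;   /* instruction address */
    unsigned      line;   /* source line number */
    int           is_stmt; /* nonzero if the row starts a statement */
} LineRow;

/* Made-up rows in which is_stmt is never set, as reported for ifc. */
const LineRow sample_rows[3] = {
    { 0x1000UL, 1, 0 },
    { 0x1004UL, 2, 0 },
    { 0x1008UL, 3, 0 },
};

/* Count the rows that would yield an address-to-line mapping.
   With honour_is_stmt set, rows whose is_stmt is false are dropped;
   with it clear, every row is kept regardless of the flag. */
size_t count_line_mappings(const LineRow *rows, size_t n, int honour_is_stmt)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!honour_is_stmt || rows[i].is_stmt)
            kept++;
    }
    return kept;
}
```

With the sample rows, a reader that honours is_stmt keeps nothing, while one
that ignores it keeps all three rows. Note that this models "ignore the flag
entirely"; forcing only the default to true, as proposed above, is a
narrower change.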
From: Jeremy F. <je...@go...> - 2002-10-21 15:53:10
|
On Mon, 2002-10-21 at 02:02, Nicholas Nethercote wrote:
> On 18 Oct 2002, Jeremy Fitzhardinge wrote:
> > 14-hg-tid
> > HELGRIND: This fixes a bug in Helgrind in which all memory access by
> > syscalls was being treated as if it were happening in thread 1. This is
> > because the eraser_mem_read/write functions were using
> > get_current_tid_1_if_root() to get the current tid. Unfortunately,
> > during syscalls there is no current thread, so it was getting 1_if_root.
> > This patch fixes this by using what thread ID information we're given,
> > and only using get_current_tid() if we're recording a memory access
> > performed by code (rather than by a syscall).
>
> Well done for spotting this. I suspected/hoped that there would be a
> single bug that accounted for a lot of the garbage output. Accept my
> apologies for my wretched coding in the first place :)

It was pretty subtle, but easy to fix once found. I'm still wondering
what get_current_tid_1_if_root() is for. It seems that
get_current_tid() can either return a tid, or "no tid" in the case where
there's nothing loaded into the base block. But I don't really
understand the rationale for returning what seems like a random tid if
there's no correct answer. I was thinking of getting rid of it
altogether, but it is used in a couple of other places.

J
|
|
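The bug behind 14-hg-tid, as described above, comes down to which thread id
an access is attributed to when no thread is "current". A small sketch of
the two behaviours follows. The function bodies, INVALID_TID, record_access()
and the two access_from_*() helpers are hypothetical and only model the
idea; they are not the Helgrind code.

```c
#include <assert.h>

typedef unsigned int ThreadId;
#define INVALID_TID 0u   /* placeholder for "no thread loaded" */

static ThreadId current_tid = INVALID_TID;  /* nothing current during a syscall */
static ThreadId last_recorded_tid = INVALID_TID;

ThreadId get_current_tid(void)
{
    return current_tid;
}

/* The problematic fallback: pretend to be thread 1 when nothing is loaded. */
ThreadId get_current_tid_1_if_root(void)
{
    return current_tid == INVALID_TID ? 1u : current_tid;
}

/* Record an access on behalf of an explicitly named thread. */
void record_access(ThreadId tid, unsigned long addr)
{
    (void)addr;               /* the address is irrelevant to this sketch */
    last_recorded_tid = tid;
}

/* Access made by client code: the running thread is the current one. */
void access_from_code(unsigned long addr)
{
    record_access(get_current_tid(), addr);
}

/* Access made by a syscall: use the tid the caller supplies, since no
   thread is current while the syscall is being handled. */
void access_from_syscall(ThreadId caller, unsigned long addr)
{
    record_access(caller, addr);
}
```

In this model a syscall issued by thread 3 is recorded against thread 3,
whereas the fallback function would have reported thread 1 for it.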
From: Nicholas N. <nj...@ca...> - 2002-10-21 09:02:15
|
On 18 Oct 2002, Jeremy Fitzhardinge wrote:
> 14-hg-tid
> HELGRIND: This fixes a bug in Helgrind in which all memory access by
> syscalls was being treated as if it were happening in thread 1. This is
> because the eraser_mem_read/write functions were using
> get_current_tid_1_if_root() to get the current tid. Unfortunately,
> during syscalls there is no current thread, so it was getting 1_if_root.
> This patch fixes this by using what thread ID information we're given,
> and only using get_current_tid() if we're recording a memory access
> performed by code (rather than by a syscall).

Well done for spotting this. I suspected/hoped that there would be a
single bug that accounted for a lot of the garbage output. Accept my
apologies for my wretched coding in the first place :)

N
|
|
From: Nicholas N. <nj...@ca...> - 2002-10-21 08:59:51
|
On Sun, 20 Oct 2002, Julian Seward wrote:
> > OK, I've updated http://www.goop.org/~jeremy/valgrind - it doesn't look
> > like 08-skin-clientreq has been merged yet.
>
> Nick, do you see any potential problems if I merge in 08-skin-clientreq?
> [just a sanity check before I do so ...]

Nope, I haven't done it only because I am slack. I will do it today.

N
|
|
From: Jeremy F. <je...@go...> - 2002-10-21 07:57:57
|
I have a few more updates:
13-track-condvar-mutex
This fixes mutex lock/unlock tracking. In particular, it gets
tracking of mutex ownership over condition variables correct.
14-sprintf
Update to core VG_(printf)/sprintf/vprintf. They've been modified to
return the number of characters they generated (either printed, put into
the buffer, or sent). Also adds a new %y format, which takes an Addr
argument and looks up a symbol. It takes a '(' flag (ie: "%(y") which
surrounds the symbol in parens if it could be found.
18-hg-err-reporting
HELGRIND: show more information about the address we're reporting a
possible data race for; in particular, try to describe where the address
came from (static variable, or heap allocated and if so where?) (Mostly
stolen from memcheck). Also puts memory locations involved with an error
into an error state, so that duplicate errors are suppressed. Also
displays the last good set of locks for a memory location.
13-track-condvar-mutex is a straight bugfix. 14-sprintf is just a
utility update for 18 (though it is generally useful); 18 probably also
depends on 17-hg-generic-mutex.
The usual place: http://www.goop.org/~jeremy/valgrind
J
|
|
From: Jeremy F. <je...@go...> - 2002-10-20 22:50:05
|
On Sun, 2002-10-20 at 15:21, Julian Seward wrote:
> Jeremy, I see 17-hg-generic-mutex. I presume this is infrastructure for
> the thread-segments stuff, or something like that.

I've got a number of plans for this:

- keep a graph of mutex dependencies so that incorrect locking order
  can be detected
- keep track of mutex lifetimes to handle mutexes in allocated memory
  which gets freed
- support non-pthreads threading using client requests to indicate
  lock/unlock/context switch
- record execution context when a mutex state changes, so it can be
  reported
- and some other stuff (lots of uses for maintaining per-mutex
  information)

I think the thread segments stuff only needs a redefinition of tid (which
fits in well with the non-pthreads changes as well).

J
|
|
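The first plan in the list above, a graph of mutex dependencies for
detecting inconsistent lock order, could look roughly like the following.
This is a generic sketch of the technique, not what 17-hg-generic-mutex
does: locks are reduced to small integer ids, and MAX_LOCKS, edge[] and
note_lock_order() are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define MAX_LOCKS 16

/* edge[a][b] != 0 means "b was acquired while a was held". */
static unsigned char edge[MAX_LOCKS][MAX_LOCKS];

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static int reachable(int from, int to, unsigned char *seen)
{
    if (from == to)
        return 1;
    seen[from] = 1;
    for (int next = 0; next < MAX_LOCKS; next++) {
        if (edge[from][next] && !seen[next] && reachable(next, to, seen))
            return 1;
    }
    return 0;
}

/* Note that `acquired` was taken while `held` was held.
   Returns 1 if this is consistent with the order seen so far, or 0 if
   some earlier acquisition already ordered `acquired` before `held`
   (a potential deadlock). Inconsistent edges are not recorded. */
int note_lock_order(int held, int acquired)
{
    unsigned char seen[MAX_LOCKS];
    memset(seen, 0, sizeof seen);
    if (reachable(acquired, held, seen))
        return 0;
    edge[held][acquired] = 1;
    return 1;
}
```

Because reachability is checked transitively, taking lock 0 while holding
lock 2 is flagged once the orders 0-then-1 and 1-then-2 have been seen. In
this sketch, re-acquiring the lock already held is also reported as
inconsistent.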
From: Julian S. <js...@ac...> - 2002-10-20 22:15:38
|
On Sunday 20 October 2002 10:40 pm, Jeremy Fitzhardinge wrote:
> On Sun, 2002-10-20 at 13:54, Jeremy Fitzhardinge wrote:
> On Sun, 2002-10-20 at 13:15, Julian Seward wrote:
> I think Nick has already merged 08-skin-clientreq.
>
> Excellent. I'll merge and regenerate my patchset.
>
> OK, I've updated http://www.goop.org/~jeremy/valgrind - it doesn't look
> like 08-skin-clientreq has been merged yet.

Hmm. You're right; my memory is wrong.

Nick, do you see any potential problems if I merge in 08-skin-clientreq?
[just a sanity check before I do so ...]

Jeremy, I see 17-hg-generic-mutex. I presume this is infrastructure for
the thread-segments stuff, or something like that.

J
|
|
From: Jeremy F. <je...@go...> - 2002-10-20 21:40:09
|
On Sun, 2002-10-20 at 13:54, Jeremy Fitzhardinge wrote:
> On Sun, 2002-10-20 at 13:15, Julian Seward wrote:
> > I think Nick has already merged 08-skin-clientreq.
>
> Excellent. I'll merge and regenerate my patchset.

OK, I've updated http://www.goop.org/~jeremy/valgrind - it doesn't look
like 08-skin-clientreq has been merged yet.

J
|
|
From: Jeremy F. <je...@go...> - 2002-10-20 20:54:06
|
On Sun, 2002-10-20 at 13:15, Julian Seward wrote:
> Hi. Thx for the helgrind fixes. I merged the following patches into
> the head this afternoon:
>
> 03-poll-select (is also in 1.0.4)
> 04-lax-ioctls (is also in 1.0.4)
> 07-seginfo
> 13-data-syms
> 14-hg-tid
> 06-memops
> 15-hg-datasym
> 16-ld-nodelete
>
> I think Nick has already merged 08-skin-clientreq.

Excellent. I'll merge and regenerate my patchset.

> I just tried a run of mozilla-1.0.1 on the fully patched helgrind.
> There are bazillions of errors reported for the _dl_num_relocations
> problem which we already know about (even with LD_BIND_NOW=y). If
> those were to be got rid of somehow, the remaining ones are sufficiently
> few that they might actually tell us something useful.

Yes, LD_BIND_NOW will only work for simple dynamically linked programs
which do all their linking at startup (ie, while single threaded).
Presumably Mozilla is still dynamically linking things well into its
life, with multiple threads.

> Which is a step in the right direction.
>
> Any ideas how to get rid of this blasted _dl_num_relocations thing?
> Perhaps helgrind can check to see if the variable in question is
> indeed _dl_num_relocations, and suppress the error if so?

Well, obviously Helgrind needs to be taught how to understand
suppressions, and we need to start building a standard suppression
file. I also think the handling of duplicate errors needs to be a bit
different from Memcheck and so on. Memcheck considers the error context
to be the same if there's matching portions of the stack backtrace. I
think for helgrind, the memory location itself is all that's
interesting: it doesn't matter what the call frames are (for the
purposes of suppression; obviously you need them for reporting).

Also, the increment of _dl_num_relocations is probably an atomic inc
anyway (probably a simple addl $1, _dl_num_relocations), so it doesn't
really count as an error at all. I'm not planning on teaching helgrind
about atomic operations just yet, but it is an obvious path for future
work.

> Another interesting test is tests/unused/pth_threadpool; this gives
> a large number of errors, many of which pertain to _IO_2_1_stdout_.

Yes, I noticed there were a number of reports coming out of stdio, but I
haven't looked into it yet. Quite possibly they're atomic ops as well.
Or just bugs.

J
|
|
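The idea in the message above, that for Helgrind the memory location alone
should identify a duplicate error regardless of the call stack, can be
sketched as a small "already reported" set keyed by address. This is an
illustration only. The thread says the actual patch records the state in
Helgrind's per-location shadow state (Vg_Excl with TID_INDICATING_ALL); the
flat array, MAX_REPORTED and should_report_race() used here are hypothetical
substitutes chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long Addr;

#define MAX_REPORTED 64

static Addr   reported[MAX_REPORTED];  /* addresses already reported */
static size_t n_reported = 0;

/* Returns 1 if a possible race at `addr` should be reported, or 0 if an
   error for the same address was reported before. The call stack is
   deliberately not part of the key. If the table is full, new addresses
   are still reported but can no longer be remembered. */
int should_report_race(Addr addr)
{
    for (size_t i = 0; i < n_reported; i++) {
        if (reported[i] == addr)
            return 0;
    }
    if (n_reported < MAX_REPORTED)
        reported[n_reported++] = addr;
    return 1;
}
```

Each address is reported once and then stays quiet, which matches the
"report everything at least once" behaviour described for
18-hg-err-reporting.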
From: Jeremy F. <je...@go...> - 2002-10-20 20:53:44
|
On Sun, 2002-10-20 at 13:33, Julian Seward wrote:
> How about just link in our libpthread.so under all circumstances? Surely
> the magic hacks it does for some functions (select, poll, etc) are harmless
> for single-threaded programs, just a little inefficient? Is there any
> advantage to the added complication of switching behaviours depending on
> whether or not a second thread has been created? All else being equal
> I prefer to avoid complexity like that.

It leaks the pthread names into programs which didn't ask for
libpthread. But that's pretty likely anyway, given the number of
libraries which pull in libpthread for you.

Also, some of libpthread's substitutions may not be functionally
identical to the real functions, so it would be nice to avoid
introducing that complexity unless it is necessary (I don't think that's
the case at the moment, but it is true for the non-blocking open, which
is impossible to emulate exactly).

So I don't have a strong objection to always bringing in libpthread, but
I do think we should just use the standard libc implementations unless
we actually need the threaded versions.

J
|
|
From: Julian S. <js...@ac...> - 2002-10-20 20:27:26
|
> Not entirely sure what the outcome of this is -- can you summarise? I
> get the impression that it _should_ be OK -- libpthread.so should override
> libc -- but perhaps I'm wrong?
>
> Yes, but... The case where a program binds to some symbols, then
> (implicitly) loads libpthread and becomes multithreaded is difficult,
> because any previously bound references will remain bound. glibc w/
> pthreads has this problem as well (libpthread wants to intercept fork(),
> but may not get to it in time).
>
> I think the safe and sure way of making this always work is to make
> valgrind.so define all the symbols we want to intercept, and then have
> it dispatch them out to our libpthread or into the real libc, depending
> on whether a second thread has been created. I think this is certain to
> catch all the references we want to catch under all circumstances.

How about just link in our libpthread.so under all circumstances? Surely
the magic hacks it does for some functions (select, poll, etc) are harmless
for single-threaded programs, just a little inefficient? Is there any
advantage to the added complication of switching behaviours depending on
whether or not a second thread has been created? All else being equal
I prefer to avoid complexity like that.

> A couple of other points.
>
> 1. Your __select and __poll renamings suffer from the same problem.
>
> As do all the libc intercepts in libpthread.so (select and __select are
> strong aliases and are therefore indistinguishable

Blargh.

J
|
|
From: Julian S. <js...@ac...> - 2002-10-20 20:09:20
|
Hi. Thx for the helgrind fixes. I merged the following patches into
the head this afternoon:

03-poll-select (is also in 1.0.4)
04-lax-ioctls (is also in 1.0.4)
07-seginfo
13-data-syms
14-hg-tid
06-memops
15-hg-datasym
16-ld-nodelete

I think Nick has already merged 08-skin-clientreq.

I just tried a run of mozilla-1.0.1 on the fully patched helgrind.
There are bazillions of errors reported for the _dl_num_relocations
problem which we already know about (even with LD_BIND_NOW=y). If
those were to be got rid of somehow, the remaining ones are sufficiently
few that they might actually tell us something useful.

Which is a step in the right direction.

Any ideas how to get rid of this blasted _dl_num_relocations thing?
Perhaps helgrind can check to see if the variable in question is
indeed _dl_num_relocations, and suppress the error if so?

Another interesting test is tests/unused/pth_threadpool; this gives
a large number of errors, many of which pertain to _IO_2_1_stdout_.

J
|
|
From: Jeremy F. <je...@go...> - 2002-10-20 18:35:52
|
On Sun, 2002-10-20 at 04:00, Julian Seward wrote:
> Jeremy
>
> Not entirely sure what the outcome of this is -- can you summarise? I get
> the impression that it _should_ be OK -- libpthread.so should override libc
> -- but perhaps I'm wrong?

Yes, but... The case where a program binds to some symbols, then
(implicitly) loads libpthread and becomes multithreaded is difficult,
because any previously bound references will remain bound. glibc w/
pthreads has this problem as well (libpthread wants to intercept fork(),
but may not get to it in time).

I think the safe and sure way of making this always work is to make
valgrind.so define all the symbols we want to intercept, and then have
it dispatch them out to our libpthread or into the real libc, depending
on whether a second thread has been created. I think this is certain to
catch all the references we want to catch under all circumstances.

> A couple of other points.
>
> 1. Your __select and __poll renamings suffer from the same problem.

As do all the libc intercepts in libpthread.so (select and __select are
strong aliases and are therefore indistinguishable).

> 2. I didn't mention this before, but ... you might want to play with the
> coregrind/dosyms script. This compares the exported symbols from
> V's libpthread.so vs those from the standard one; and it is how I
> navigate this swamp -- to a first approximation I try and make
> V's libpthread.so export the same syms as the standard one.

OK. It looks to me like the version script mechanism can also be used
to make sure there's no symbol namespace leakage (ie, it can enforce the
rule that only vgPlain_* symbols are visible outside the .so).

J
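The namespace rule mentioned at the end of this message can also be checked after the fact. The sketch below is an illustration only: it assumes the exported names have already been collected into an array (for example from a symbol-listing tool), and `count_leaked` is a hypothetical helper, not part of the dosyms script.

```c
#include <stddef.h>
#include <string.h>

/* Count the names in 'names' that do not start with 'prefix'.
   A non-zero result means something outside the agreed namespace
   is visible. */
size_t count_leaked(const char *const *names, size_t n, const char *prefix)
{
    size_t plen = strlen(prefix);
    size_t leaked = 0;

    for (size_t i = 0; i < n; i++) {
        if (strncmp(names[i], prefix, plen) != 0)
            leaked++;
    }
    return leaked;
}
```

Called with the prefix "vgPlain_", it returns the number of exported names that break the rule. A version script would enforce the rule at link time; a check like this only detects violations.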
|
|
From: Julian S. <js...@ac...> - 2002-10-20 10:54:12
|
Jeremy
Not entirely sure what the outcome of this is -- can you summarise? I get
the impression that it _should_ be OK -- libpthread.so should override libc
-- but perhaps I'm wrong?
A couple of other points.
1. Your __select and __poll renamings suffer from the same problem.
2. I didn't mention this before, but ... you might want to play with the
coregrind/dosyms script. This compares the exported symbols from
V's libpthread.so vs those from the standard one; and it is how I
navigate this swamp -- to a first approximation I try and make
V's libpthread.so export the same syms as the standard one.
J
On Saturday 19 October 2002 2:50 am, Jeremy Fitzhardinge wrote:
> On Fri, 2002-10-18 at 17:52, H. J. Lu wrote:
> > > OK, but valgrind.so is already being linked with "-z initfirst"; what
> > > happens if there are two .so files with initfirst? (It does seem to
> > > work).
> >
> > Which ever comes first wins
>
> So if the order of events is:
>
> 1. run executable A, with LD_PRELOAD=valgrind.so (which has initfirst
> set)
> 2. A uses function foo() which is defined in libc.so
> 3. A doesn't use libpthread, but after a while it dlopens libB.so, which
> does
> 4. libB.so pulls in Valgrind's libpthread.so.
> 5. Valgrind's libpthread.so wants to override foo(), now that this has
> become a multithreaded program. If A uses foo() again, which definition
> will it get? libc's original one (which it was using before), or the
> new definition in libpthread.so?
>
> I wonder if the solution is to always pull in Valgrind's libpthread.so,
> but only make it do special stuff once there's more than one thread...
>
> Thanks,
> J
>
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
|
|
From: H. J. L. <hj...@lu...> - 2002-10-19 06:02:04
|
On Fri, Oct 18, 2002 at 06:50:02PM -0700, Jeremy Fitzhardinge wrote:
> On Fri, 2002-10-18 at 17:52, H. J. Lu wrote:
> > > OK, but valgrind.so is already being linked with "-z initfirst"; what
> > > happens if there are two .so files with initfirst? (It does seem to
> > > work).
> >
> > Which ever comes first wins
>
> So if the order of events is:
>
> 1. run executable A, with LD_PRELOAD=valgrind.so (which has initfirst
> set)
> 2. A uses function foo() which is defined in libc.so
> 3. A doesn't use libpthread, but after a while it dlopens libB.so, which
> does
> 4. libB.so pulls in Valgrind's libpthread.so.
> 5. Valgrind's libpthread.so wants to override foo(), now that this has
> become a multithreaded program. If A uses foo() again, which definition
> will it get? libc's original one (which it was using before), or the
> new definition in libpthread.so?
>

I think A will always use the original foo. BTW, if libB.so is dlopened,
it becomes very tricky. See

http://sources.redhat.com/ml/libc-alpha/2002-05/msg00214.html

H.J. |
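The answer above can be pictured with a toy model of a lazily bound reference. This is not how ld.so is implemented, and every name in it is hypothetical; it also simply assumes the override would win a fresh lookup, which the message above suggests is not guaranteed in the dlopen case. The only point it illustrates is that a reference, once bound, is not looked up again.

```c
#include <stddef.h>

typedef int (*fn_t)(void);

static int libc_foo(void)    { return 1; }  /* stand-in for libc's foo() */
static int pthread_foo(void) { return 2; }  /* stand-in for the override */

/* Toy lookup result: what a brand-new lookup of foo would find now. */
static fn_t current_definition = libc_foo;

/* Toy binding slot for A's reference to foo(): empty until first call. */
static fn_t a_foo_slot = NULL;

/* A calling foo(): resolve on first use, then reuse the cached binding. */
int a_calls_foo(void)
{
    if (a_foo_slot == NULL)
        a_foo_slot = current_definition;
    return a_foo_slot();
}

/* Toy model of step 4: the override becomes available later. */
void load_libpthread(void)
{
    current_definition = pthread_foo;
}

/* A reference that is resolved only after the load. */
int fresh_lookup_foo(void)
{
    return current_definition();
}
```

In this model, A keeps getting `libc_foo` after `load_libpthread()` because its slot was filled at step 2, which matches "A will always use the original foo".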