From: Jeremy F. <je...@go...> - 2002-10-21 07:57:57
|
I have a few more updates: 13-track-condvar-mutex This fixes mutex lock/unlock tracking. In particular, it gets tracking of mutex ownership over condition variables correct. 14-sprintf Update to core VG_(printf)/sprintf/vprintf. They've been modified to return the number of characters they generated (either printed, put into the buffer, or sent). Also adds a new %y format, which takes an Addr argument and looks up a symbol. It takes a '(' flag (ie: "%(y") which surrounds the symbol in parens if it could be found. 18-hg-err-reporting HELGRIND: show more information about the address we're reporting a possible data race for; in particular, try to describe where the address came from (static variable, or heap allocated and if so where?) (Mostly stolen from memcheck). Also puts memory locations involved with an error into an error state, so that duplicate errors are suppressed. Also displays the last good set of locks for a memory location. 13-track-condvar-mutex is a straight bugfix. 14-sprintf is just a utility update for 18 (though it is generally useful); 18 probably also depends on 17-hg-generic-mutex. The usual place: http://www.goop.org/~jeremy/valgrind J |
From: Julian S. <js...@ac...> - 2002-10-22 04:39:26
|
[Nick -- q for you below] Merging more stuff in. I just did 08-skin-clientreq. Also partially did 13-track-condvar-mutex. I agree, the tracking was wrong. However, I couldn't see the purpose of the part of the patch dealing with the value of vg_tid_currently_in_baseBlock when baseBlock is empty and so didn't merge that bit (shown below). It seems like, with that change, it will be impossible to know from the value of vg_tid_currently_in_baseBlock whether or not baseBlock is "full". Nick: I spotted the infamous (?) VG_(get_current_tid_1_if_root) and specifically this: if (0 == vg_tid_currently_in_baseBlock) return 1; /* root thread */ What's the meaning of the 0 here? Where is it set? AFAICS the only valid values of vg_tid_currently_in_baseBlock are either 1 .. VG_N_THREADS or VG_INVALID_THREADID, so I guess I'm missing something here? I don't feel like I'm seeing a consistent story I'm happy with re vg_tid_currently_in_baseBlock -- can you two clarify? J @@ -468,7 +468,7 @@ void VG_(save_thread_state) ( ThreadId t Int i; const UInt junk = 0xDEADBEEF; - vg_assert(vg_tid_currently_in_baseBlock != VG_INVALID_THREADID); + vg_assert(vg_tid_currently_in_baseBlock == tid); /* We don't copy out the LDT entry, because it can never be changed @@ -2168,6 +2168,7 @@ void do__apply_in_new_thread ( ThreadId /* Copy the parent's CPU state into the child's, in a roundabout way (via baseBlock). */ VG_(load_thread_state)(parent_tid); + vg_tid_currently_in_baseBlock = tid; VG_(save_thread_state)(tid); /* Consider allocating the child a stack, if the one it already has |
From: Jeremy F. <je...@go...> - 2002-10-22 07:35:37
|
On Mon, 2002-10-21 at 21:45, Julian Seward wrote: Nick: I spotted the infamous (?) VG_(get_current_tid_1_if_root) and specifically this: if (0 == vg_tid_currently_in_baseBlock) return 1; /* root thread */ What's the meaning of the 0 here? Where is it set? AFAICS the only valid values of vg_tid_currently_in_baseBlock are either 1 .. VG_N_THREADS or VG_INVALID_THREADID, so I guess I'm missing something here? No, I don't think so. I don't think that function means anything at all. I was very close to removing it altogether. I don't feel like I'm seeing a consistent story I'm happy with re vg_tid_currently_in_baseBlock -- can you two clarify? I put the stronger assert into save_thread_state() because I couldn't see any good reason why the tid argument should ever mismatch the value of vg_tid_currently_in_baseBlock. The assertion failed on the first thread create, but I think my fix makes sense. Thread creation copies the parent thread's context into the child via the baseBlock. In a sense the ownership of the baseBlock changes during the copy, which is what the assignment to vg_tid_currently_in_baseBlock indicates: VG_(load_thread_state)(parent_tid); /* load parent thread state into baseBlock */ + vg_tid_currently_in_baseBlock = tid; /* give ownership of baseBlock to child */ VG_(save_thread_state)(tid); /* save new state into new thread */ It seems like, with that change, it will be impossible to know from the value of vg_tid_currently_in_baseBlock whether or not baseBlock is "full". No, because save_thread_state() assigns vg_tid_currently_in_baseBlock withVG_INVALID_THREADID, so you know the baseBlock contains nothing afterwards. It gets reassigned with a valid tid on load_thread_state(). J |
From: Julian S. <js...@ac...> - 2002-10-22 13:24:14
|
Ok, it was just a sanity check. I'm assuming you've got this all figured out and will clean up and/or nuke get_current_tid_1_if_root as you see fit. J On Tuesday 22 October 2002 8:35 am, Jeremy Fitzhardinge wrote: > On Mon, 2002-10-21 at 21:45, Julian Seward wrote: > Nick: I spotted the infamous (?) VG_(get_current_tid_1_if_root) > and specifically this: > > if (0 == vg_tid_currently_in_baseBlock) > return 1; /* root thread */ > > What's the meaning of the 0 here? Where is it set? AFAICS the only > valid values of vg_tid_currently_in_baseBlock are either 1 .. > VG_N_THREADS or VG_INVALID_THREADID, so I guess I'm missing something here? > > No, I don't think so. I don't think that function means anything at > all. I was very close to removing it altogether. > > I don't feel like I'm seeing a consistent story I'm happy with re > vg_tid_currently_in_baseBlock -- can you two clarify? > > I put the stronger assert into save_thread_state() because I couldn't > see any good reason why the tid argument should ever mismatch the value > of vg_tid_currently_in_baseBlock. The assertion failed on the first > thread create, but I think my fix makes sense. Thread creation copies > the parent thread's context into the child via the baseBlock. In a > sense the ownership of the baseBlock changes during the copy, which is > what the assignment to vg_tid_currently_in_baseBlock indicates: > > VG_(load_thread_state)(parent_tid); /* load parent thread state into > baseBlock */ + vg_tid_currently_in_baseBlock = tid; /* give ownership of > baseBlock to child */ VG_(save_thread_state)(tid); /* save new state into > new thread */ > > > It seems like, with that change, it will be impossible to > know from the value of vg_tid_currently_in_baseBlock whether or > not baseBlock is "full". > > No, because save_thread_state() assigns vg_tid_currently_in_baseBlock > withVG_INVALID_THREADID, so you know the baseBlock contains nothing > afterwards. It gets reassigned with a valid tid on load_thread_state(). > > J |
From: Nicholas N. <nj...@ca...> - 2002-10-22 08:15:48
|
On Tue, 22 Oct 2002, Julian Seward wrote: > Merging more stuff in. I just did 08-skin-clientreq. I am a heinous slacker. Thanks. > Nick: I spotted the infamous (?) VG_(get_current_tid_1_if_root) > and specifically this: > > if (0 == vg_tid_currently_in_baseBlock) > return 1; /* root thread */ > > What's the meaning of the 0 here? Where is it set? AFAICS the only > valid values of vg_tid_currently_in_baseBlock are either 1 .. VG_N_THREADS > or VG_INVALID_THREADID, so I guess I'm missing something here? I was probably missing something when I wrote it. It always felt nasty to me but I thought it was necessary. IIRC VG_(get_current_tid) returns 0 in some circumstances, I can't remember and/or never understood why. Because thread #0 is some special reserved thread or something, and thread #1 is meant to be the root thread, which I thought corresponded to the only thread for non-threaded programs. But that may be totally bogus. Maybe the #0 being returned was related to the system call wrong-tid bug, I don't know. Jeremy, by now you understand Helgrind much better than I, if you think it should go then please kill it! You must be getting a fairly good idea of how little time I spent testing Helgrind. I just hope that it was quicker for you to fix the existing code rather than to rewrite it from scratch. N |
From: Jeremy F. <je...@go...> - 2002-10-22 08:38:55
|
On Tue, 2002-10-22 at 01:15, Nicholas Nethercote wrote: I was probably missing something when I wrote it. It always felt nasty to me but I thought it was necessary. IIRC VG_(get_current_tid) returns 0 in some circumstances, I can't remember and/or never understood why. Because thread #0 is some special reserved thread or something, and thread #1 is meant to be the root thread, which I thought corresponded to the only thread for non-threaded programs. But that may be totally bogus. Maybe the #0 being returned was related to the system call wrong-tid bug, I don't know. I think so. I'll work out its proper fate tomorrow. The whole "who's running now?" question is quite subtle in Valgrind (like any OS). Jeremy, by now you understand Helgrind much better than I, if you think it should go then please kill it! You must be getting a fairly good idea of how little time I spent testing Helgrind. I just hope that it was quicker for you to fix the existing code rather than to rewrite it from scratch. Well, I'm not saying I won't have touched every line by the time I'm done, but it has certainly been a lot easier fixing bugs than starting from scratch. And it's not like it was completely broken. Only with multithreaded programs which use syscalls or allocate memory. Apart from that it was fine :-) J |
From: Nicholas N. <nj...@ca...> - 2002-10-22 08:44:43
|
On 22 Oct 2002, Jeremy Fitzhardinge wrote: > And it's not like it was completely broken. Only with multithreaded > programs which use syscalls or allocate memory. Apart from that it was > fine :-) Well, 'date' is my standard test program... ;) N |
From: Julian S. <js...@ac...> - 2002-10-22 05:20:26
|
Ok, I've merged in the following: 08-skin-clientreq 13-track-condvar-mutex (partially; see previous msg) 14-sprintf 17-hg-generic-mutex 18-hg-err-reporting 19-hg-context 20-hg-secmap 21-hg-dupwrite Seems to run a lot more quietly now on pth_threadpool, and on mozilla too. All the complaints about dl_num_relocations have disappeared; any idea how/why? (am not complaining; just curious). J |
From: Jeremy F. <je...@go...> - 2002-10-22 07:44:09
|
On Mon, 2002-10-21 at 22:26, Julian Seward wrote: Ok, I've merged in the following: 08-skin-clientreq 13-track-condvar-mutex (partially; see previous msg) 14-sprintf 17-hg-generic-mutex 18-hg-err-reporting 19-hg-context 20-hg-secmap 21-hg-dupwrite Great! Seems to run a lot more quietly now on pth_threadpool, and on mozilla too. I still get errors out of stdio with pth_threadpool; I haven't tried mozilla. All the complaints about dl_num_relocations have disappeared; any idea how/why? (am not complaining; just curious). 18-hg-err-reporting adds a change which puts memory locations which have had an error reported into an error state (Vg_Excl, with the magic tid of TID_INDICATING_ALL; the intended meaning is "exclusively owned by everyone"). It should report everything at least once. J |
From: Julian S. <js...@ac...> - 2002-10-22 13:25:10
|
> All the complaints about dl_num_relocations have disappeared; > any idea how/why? (am not complaining; just curious). > > 18-hg-err-reporting adds a change which puts memory locations which have > had an error reported into an error state (Vg_Excl, with the magic tid > of TID_INDICATING_ALL; the intended meaning is "exclusively owned by > everyone"). It should report everything at least once. Ha; I understand. Good trick. J |