From: Bart V. A. <bar...@gm...> - 2006-12-13 18:10:18

A new patch is available:
http://home.euphonynet.be/bvassche/valgrind/valgrind-6397-drd-2006-12-13.patch.gz

Changes compared to the previous version (2006-12-09):
- Eliminated the changes to the thread states in Valgrind's core.
- Renamed track_detached_thread_finished() to track_thread_finished(). This tracking function is now called independently of whether the thread is a detached or a joinable POSIX thread.
- Added a tracking function that is called every time a thread name changes (track_thread_name()).
- Changed the thread name size from 16 to 32 characters.
- Decoupled Valgrind's thread IDs (ThreadId) from DRD's thread IDs (DrdThreadId).
- Eliminated all thread state information from drd's client-side function wrappers.
- When option -v is passed to Valgrind, drd prints some statistics of its own.
- Bug fix (thanks to Julian): fixed vector clock updating at pthread_mutex_lock() calls -- drd now reliably reports the same set of races upon every run. Previously the set of races reported depended on how Valgrind scheduled the client's threads.
- Various small changes to drd.

Next issues I will work on: segment merging and more accurate error reporting. This should solve the OOM errors triggered by drd and make the tool more usable.

Any comments on the proposed core changes and/or the drd tool are welcome.

Bart.

From: Nicholas N. <nj...@cs...> - 2006-12-14 03:08:29

On Wed, 13 Dec 2006, Bart Van Assche wrote:

> A new patch is available:
> http://home.euphonynet.be/bvassche/valgrind/valgrind-6397-drd-2006-12-13.patch.gz
>
> Changes compared to the previous version (2006-12-09):
> - Eliminated the changes to the thread states in Valgrind's core.

Great!

> - Renamed track_detached_thread_finished() into track_thread_finished(). This tracking function is now called independent of whether the thread is a detached or joinable POSIX thread.

Great!

> - Added a tracking function that is called every time a thread name changes (track_thread_name()).

Why is this needed?

> - Changed thread name size from 16 to 32 characters.
> - Decoupled Valgrind's thread ID's (ThreadId) and DRD's thread ID's (DrdThreadId).

Good.

> - Eliminated all thread state information from drd's client-side function wrappers.
> - When option -v is passed to Valgrind, drd prints some statistics of its own.
> - Bug fix (thanks to Julian): fixed vector clock updating at pthread_mutex_lock() calls -- the drd now reliably reports the same set of races upon every run. Previously the set of races reported was dependent on how Valgrind scheduled the client's threads.
> - Various small changes to drd.

All good!

> Next issues I will work on: segment merging, and more accurate error reporting. This should solve the OOM errors triggered by drd and make the tool more usable.
>
> Any comments on the proposed core changes and/or the drd tool are welcome.

These changes look great. I appreciate that you have addressed the concerns Julian and I have raised previously. The minimal set of coregrind/ changes is looking good.

Nick

From: Bart V. A. <bar...@gm...> - 2006-12-14 09:08:46

On 12/14/06, Nicholas Nethercote <nj...@cs...> wrote:

> On Wed, 13 Dec 2006, Bart Van Assche wrote:
>
> > - Added a tracking function that is called every time a thread name changes (track_thread_name()).
>
> Why is this needed?

Adding such a tracking function was the easiest way for me to keep the thread names in the core and in the drd tool in sync.

There is also one modification in the patch I forgot to mention in the summary: I moved the statement

    VG_TRACK( post_thread_create, tst->os_state.parent, tid );

in file coregrind/m_syswrap/syswrap-linux.c to just before the statement

    VG_(set_running)(tid, "thread_wrapper(starting new thread)");

A similar change still has to be made in the AIX system call wrappers.

Bart.

From: Nicholas N. <nj...@cs...> - 2006-12-14 10:11:46

On Thu, 14 Dec 2006, Bart Van Assche wrote:

> > > - Added a tracking function that is called every time a thread name changes (track_thread_name()).
> >
> > Why is this needed?
>
> Adding such a tracking function was the easiest way for me to keep thread names in the core and in the drd tool in sync.

I couldn't work out how the different thread_name components fit together. Can you briefly describe them? As usual, I'm wondering if we can avoid adding new stuff to the core :)

N

From: Bart V. A. <bar...@gm...> - 2006-12-14 10:39:38

On 12/14/06, Nicholas Nethercote <nj...@cs...> wrote:

> I couldn't work out how the different thread_name components fit together. Can you briefly describe them? As usual, I'm wondering if we can avoid adding new stuff to the core :)

This is how it currently works:
- The client tells Valgrind's core which name it wants to associate with each thread via a client request (VG_USERREQ__SET_THREAD_NAME).
- When Valgrind's core encounters this client request, it stores the thread name and remembers it until the thread finishes.
- Each time a thread name changes, the core calls VG_TRACK(thread_name)(tid, name).
- The drd tool registers a tracking function such that it is informed about thread name changes. The drd tool includes these thread names in its error reports. The drd tool also remembers thread names longer than the core does: e.g. after a thread has finished and before it is joined via pthread_join(), the core has already discarded all information related to that thread, but drd still knows about it (name, POSIX thread ID, etc.).
- Currently Valgrind's core does nothing with thread names other than storing them and making them available to tools.

However, I recommend including thread names in the error reports of all tools. When using e.g. memcheck on multithreaded programs, it is very helpful to know which thread an error report relates to. The only option available now is to switch back from NPTL to linuxthreads, such that the process ID included in each line of output relates uniquely to a thread. But even with linuxthreads I have to make the application keep a file in /tmp up to date with the mapping between pid_t and thread name, so that it becomes possible to translate pid_t's into thread names. Including thread names directly in memcheck's error reports would make memcheck easier to use with multithreaded software.

Bart.

From: Julian S. <js...@ac...> - 2006-12-17 17:51:51

Well, I can see what you're saying. But like Nick I'm reluctant to add non-essential functionality to the core.

I'm still a bit unclear about this. As I understand it, programs that want to see thread names in error reports have to use the client requests specially. So it won't be of benefit to programs that don't use these client requests, correct? Or do drd's pthread wrappers use them in a way which benefits any program run by drd?

I think it would help to clarify the way in which you intend this to be used, and to show some error messages from drd with/without thread names.

J

On Thursday 14 December 2006 10:39, Bart Van Assche wrote:

> On 12/14/06, Nicholas Nethercote <nj...@cs...> wrote:
>
> > I couldn't work out how the different thread_name components fit together. Can you briefly describe them? As usual, I'm wondering if we can avoid adding new stuff to the core :)
>
> This is how it currently works:
> - The client tells to Valgrind's core which name it wants to associate with each thread via a client request (VG_USERREQ__SET_THREAD_NAME).
> - When Valgrind's core encounters this client request, it stores the thread name and remembers that thread name until the thread finishes.
> - Each time a thread name changes, the core calls VG_TRACK(thread_name)(tid, name).
> - The drd tool registers a tracking function such that it is informed about thread name changes. The drd tool includes these thread names in its error reports. The drd tool also remembers thread names longer than the core. E.g. after a thread finished and before it is joined via pthread_join(), the core already has discarded all information related to that thread but drd still knows about that thread (name, POSIX thread ID, etc.).
> - Currently Valgrind's core does nothing else with thread names than storing them and making these names available to tools.
>
> I recommend however to include thread names in the error reports of all tools. When using e.g. memcheck on multithreaded programs, it is very helpful to know which thread an error report relates to. The only option available now is to switch back from NPTL to linuxthreads, such that the process ID included in each line of output relates uniquely to a thread. But even with linuxthreads I have to make the application keep a file in /tmp up to date with the relationship between pid_t and thread name, such that it becomes possible to translate pid_t's into thread names. Including thread names directly in memcheck's error reports would make memcheck easier to use with multithreaded software.
>
> Bart.

From: Duncan S. <bal...@fr...> - 2006-12-14 13:26:09

Hi Bart, thanks for this tool!

If I understand right, it works like this: if two threads access the same memory in an unserialized way (i.e. there is nothing to force the memory accesses to occur in a particular order), then drd reports which code sections were responsible (1st segment start..1st segment end, 2nd segment start..2nd segment end) and shows what the accessed memory was:

==12909== Data addresses accessed by both segments:
==12909== 0x0479107C sz 4 W W (stack of thread 9)
==12909== 0x0479108C sz 32 W W (stack of thread 9)

Presumably W W means that both the 1st segment and the 2nd segment wrote to this memory location (I guess W also covers: both read and wrote). I guess sz 32 means it happened to all of the 32 bytes starting at the address.

It's not clear to me how to make use of the address information. I guess at some point drd will offer the possibility of attaching the debugger when a race is detected, which would help...

Best wishes,

Duncan.

From: Bart V. A. <bar...@gm...> - 2006-12-14 14:54:55

On 12/14/06, Duncan Sands <bal...@fr...> wrote:

> Hi Bart, thanks for this tool! If I understand right, it works like this: if two thread access the same memory in an unserialized way (i.e. there is nothing to force the memory accesses to occur in a particular order) then drd reports which code sections were responsible (1st segment start..1st segment end, 2nd segment start..2nd segment end) and shows what the accessed memory was:
>
> ==12909== Data addresses accessed by both segments:
> ==12909== 0x0479107C sz 4 W W (stack of thread 9)
> ==12909== 0x0479108C sz 32 W W (stack of thread 9)
>
> Presumably W W means that both the 1st segment and the 2nd segment wrote to this memory location (I guess W also covers: both read and wrote). I guess sz 32 means it happened to all of the 32 bytes starting at the address.
>
> It's not clear to me how to make use of the address information. I guess at some point drd will offer the possibility of attaching the debugger when a race is detected, which would help...

There is indeed still some work to do before drd's error reports will be easy to comprehend. I'm not sure, however, that I will be able to print detailed information about the meaning of data addresses from within the drd tool. The plan is to make the call stacks more precise, such that the call stack information is sufficient to find the cause of the race. This approach is similar to memcheck's: memcheck prints a data address plus an exact call stack.

Being able to attach a debugger just after a race is detected would indeed be a nice feature.

Bart.

From: Nicholas N. <nj...@cs...> - 2006-12-14 22:13:22

On Thu, 14 Dec 2006, Bart Van Assche wrote:

> Being able to attach a debugger just after a race is detected would indeed be a nice feature.

The core's error-handling mechanism allows this by default. Just use --db-attach (or whatever it's called).

Nick

From: Duncan S. <bal...@fr...> - 2006-12-15 07:43:22

> > Being able to attach a debugger just after a race is detected would indeed be a nice feature.
>
> The core's error-handling mechanism allows this by default. Just use --db-attach (or whatever it's called).

I tried it, but it doesn't ask if it should attach the debugger when drd prints a race, only if you hit ctrl-c.

Ciao,

Duncan.

From: Julian S. <js...@ac...> - 2006-12-15 14:04:58

On Friday 15 December 2006 07:43, Duncan Sands wrote:

> > > Being able to attach a debugger just after a race is detected would indeed be a nice feature.
> >
> > The core's error-handling mechanism allows this by default. Just use --db-attach (or whatever it's called).
>
> I tried it, but it doesn't ask if it should attach the debugger when drd prints a race, only if you hit ctrl-c.

Does it make any sense to attach a debugger in drd? A race does not have a single unique source location; it only "happens" when a thread T2 accesses a memory location that some other thread T1 has also accessed, and there is no synchronising event to sequence these accesses. Currently, AIUI, drd will tell you this happened arbitrarily later than the actual accesses, when it compares the memory-access sets for the two thread segments.

I made a suggestion a few months back called "danger-set" (incorporated as comments in the drd patch) that would allow drd to say precisely when the second access to a contended location happened. Then it would make sense to attach a debugger. But as it stands I don't see that attaching a debugger would give a meaningful result.

J

From: Duncan S. <bal...@fr...> - 2006-12-15 14:18:40

> Does it make any sense to attach a debugger in drd? A race does not have a single unique source location; it only "happens" when a thread T2 accesses a memory location that some other thread T1 has also accessed, and there is no synchronising event to sequence these accesses. Currently AIUI drd will tell you this happened arbitrarily later than the actual accesses, when it compares the memory-access sets for the two thread (segments).
>
> I made a suggestion a few months back called "danger-set" (incorporated as comments in the drd patch) and that would allow drd to say precisely when the second access to a contended location happened. Then it would make sense to attach a debugger. But as it stands I don't see that attaching a debugger would give a meaningful result.

Hi Julian, it's only to try to work out which variable lives at the memory location. Perhaps you know of a better way?

Ciao,

Duncan.

From: Julian S. <js...@ac...> - 2006-12-15 15:03:59

(oops, forgot to cc -dev)

On Friday 15 December 2006 15:12, Julian Seward wrote:

> > > I made a suggestion a few months back called "danger-set" (incorporated as comments in the drd patch) and that would allow drd to say precisely when the second access to a contended location happened. Then it would make sense to attach a debugger. But as it stands I don't see that attaching a debugger would give a meaningful result.
> >
> > Hi Julian, it's only to try to work out which variable lives at the memory location. Perhaps you know of a better way?
>
> Ah, ok. I guess that would work OK for globals and heap-resident data, but not for stack-allocated data, since the debugger is started arbitrarily later than the race point.
>
> One possibility is to improve V's debug-info reader so it can read data addresses and make sense of DWARF2/3 data type info. That would help; we had a similar thing for stabs in the past. It's a significant amount of work, though.
>
> Another possible improvement is to implement this danger-set algorithm. That would allow drd to pinpoint the exact instruction causing the second and all subsequent memory accesses to a contended memory location, and hence give you an exact call stack to where it happened. That should help a lot in making sense of drd output.
>
> J

From: Julian S. <js...@ac...> - 2006-12-17 18:06:38

On Thursday 14 December 2006 03:08, Nicholas Nethercote wrote:

> > Any comments on the proposed core changes and/or the drd tool are welcome.
>
> These changes look great. I appreciate that you have addressed the concerns Julian and I have raised previously. The minimal set of coregrind/ changes is looking good.

Yes, I also appreciate that.

I presume you are building up a set of test programs as you go along? Those will eventually be useful as regression tests.

> > Next issues I will work on: segment merging, and more accurate error reporting. This should solve the OOM errors triggered by drd and make the tool more usable.

Do you plan / have any interest in trying the danger-set idea? I had thought a bit about how to implement the lazy set merging involved. It also avoids having two different pieces of code for 32-bit and 64-bit machines.

The idea is to remove the fixed 3-level bitmap and instead use an OSet of pages (of some size). Naively, each memory access then requires a lookup in the OSet, which could be slow. So, associate with the OSet a small, fixed-size cache which holds the most popular pages for very quick access. This is an idea I used recently in mc_main.c (auxmap_L1/auxmap_L2) and it works pretty well.

Anyway, I volunteer to implement this (+ the lazy set union stuff) if you are interested in trying the danger-set scheme.

J

From: Julian S. <js...@ac...> - 2006-12-17 18:56:07

On Wednesday 13 December 2006 18:10, Bart Van Assche wrote:

> A new patch is available:
> http://home.euphonynet.be/bvassche/valgrind/valgrind-6397-drd-2006-12-13.patch.gz

Just looking through the patch again. Minor observation: does drd_vc.h need <stdint.h>? pub_tool_basics gives you UInt, which is guaranteed to be a 32-bit unsigned int on all platforms.

J

From: Bart V. A. <bar...@gm...> - 2006-12-17 20:41:01

On 12/17/06, Julian Seward <js...@ac...> wrote:

> On Thursday 14 December 2006 03:08, Nicholas Nethercote wrote:
>
> > > Any comments on the proposed core changes and/or the drd tool are welcome.
> >
> > These changes look great. I appreciate that you have addressed the concerns Julian and I have raised previously. The minimal set of coregrind/ changes is looking good.
>
> Yes, I also appreciate that.
>
> I presume you are building up a set of test programs as you go along? Those will eventually be useful as regression tests.

That's correct. But I have already noticed that I will have to make changes to drd such that the outcome of regression tests is reproducible. E.g. drd currently prints absolute pathnames every time it prints the name of a source file -- this should be a relative pathname. And even if drd detects the same set of data races during every run, these are not always printed in the same order. I'm still thinking about a solution for this.

> > > Next issues I will work on: segment merging, and more accurate error reporting. This should solve the OOM errors triggered by drd and make the tool more usable.
>
> Do you plan / have any interest in trying the danger-set idea? I had thought a bit about how to implement the lazy set merging involved. It also avoids having two different pieces of code for 32-bit and 64-bit machines.
>
> The idea is to remove the fixed 3-level bitmap and instead use an OSet of pages (of some size). Naively, each memory access then requires a lookup in the OSet, which could be slow. So, associate with the OSet a small, fixed size cache which holds the most popular pages for very quick access. This is an idea I used recently in mc_main.c (auxmap_L1/auxmap_L2) and it works pretty well.
>
> Anyway I volunteer to implement this (+ the lazy set union stuff) if you are interested in trying the danger-set scheme.

The next item I'll work on is implementing the so-called danger set algorithm (see also the file drd/TODO.txt in the patch). I don't think it will be necessary to save a call stack with each memory access: with the current bitmap implementation it's probably faster to check for data races upon each memory access instead of storing a call stack per access and only reporting data races when a new segment is started.

I'll have a look at the OSet data structure -- it certainly would be welcome if I didn't have to maintain separate implementations for 32-bit and 64-bit architectures. Do you have an idea of how lookups in an OSet compare with lookups in the three-level bitmap data structure I defined? And how about memory consumption? Is there any documentation available about OSets apart from Valgrind's source files?

Bart.

From: Nicholas N. <nj...@cs...> - 2006-12-17 22:02:49

On Sun, 17 Dec 2006, Bart Van Assche wrote:

> That's correct. But I already noticed that I will have to make changes to drd such that the outcome of regression tests is reproducible. E.g. drd currently prints absolute pathnames every time it prints the name of a sourcefile -- this should be a relative pathname. And even if drd detects the same set of data races during every run, these are not always printed in the same order. I'm still thinking about a solution for this.

Note that each test has a "filter" that transforms the actual output into the expected .exp file. Most tests use the standard ones, but you can use test-specific ones. You might be able to utilise that.

> I'll have a look at the OSet data structure -- it certainly would be welcome if I wouldn't have to manage separate implementations for 32 and 64-bit architectures. Do you have an idea of how lookups in an OSet compare with lookups in the three-level bitmap data structure I defined? And how about memory consumption? Is there any documentation available about OSet's apart of Valgrind's source files?

No. But the comment at the top of coregrind/m_oset.c is reasonably detailed (overhead is 12 bytes on 32-bit machines, 24 bytes on 64-bit machines). Then if you use VG_(malloc) to allocate them you have some more overhead: 16 bytes on 32-bit machines, 32 bytes on 64-bit machines -- see the comment near the top of coregrind/m_mallocfree.c.

Nick

From: Julian S. <js...@ac...> - 2006-12-18 01:32:50

> and 64-bit architectures. Do you have an idea of how lookups in an OSet compare with lookups in the three-level bitmap data structure I defined?

They will certainly be a lot slower. That's why I suggest placing a small cache of some sort "in front" of the main OSet -- to catch, and handle quickly, 90+% of accesses.

> And how about memory consumption?

Probably pretty economical. I would say not noticeably different from the 3-level structure.

J