|
From: Bart V. A. <bar...@gm...> - 2006-12-30 15:06:46
|
A new drd version is available at http://home.euphonynet.be/bvassche/valgrind/valgrind-6458-drd-2006-12-30.patch.gz. Core changes since last version (2006-12-13): * oset comparison and lookup functions now accept const arguments without compiler warning. * added function VG_(memmove)(). * added function VG_(thread_get_stack_max)(const ThreadId tid). * pub_tool_vki.h has no longer to be included before include/pub_tool_libcfile.h is included. drd changes since last version (2006-12-13): * changed the bitmap representation from a three-level bitmap to oset + bitmap. * added support for spinlocks. * state information is now kept for POSIX condition variables. This allows a.o. to print an error message when pthread_cond_signal() or pthread_cond_broadcast() is called without locking the correct mutex from the signalling thread. * drd now keeps track of stack bounds per thread -- this information is updated each time the stack pointer changes. * memory accesses by system calls are now also taken into account for data race detection. * re-enabled tracking of dynamic memory allocations. When a conflict is detected on dynamically allocated memory, the call stack of the allocation is printed as well. * conflicting accesses to the top of the stack (where NPTL keeps its 'struct pthread' datastructure) are suppressed. * implemented "danger set" algorithm. The danger set is defined as the set containing all accesses of all segments of all threads other than the running thread that are unordered to the active segment of the running thread. This danger set is updated at each context switch, each time vector clocks are combined and each time memory is freed (via free(), delete or by increasing the stack pointer). * conflicting memory accesses are now reported immediately. The advantage is that an exact call stack can be printed for each conflicting access. * error suppressions now work in drd. * attempted to write a suppression file that suppresses some data races present in ld.so / libc / libstd++ / libpthread (drd/default.supp). I learned that suppressing data races by callstack probably won't be too coarse for the drd tool: several call stacks have to be included for accesses to the same location, and the danger exists that some of these call stacks suppress data races that should be reported to the user. * added command-line option for printing statistics about internal drd activity. * added command-line option for tracing all loads and stores to a certain address. This can be a big help for finding out the meaning of printed data addresses. Summarized: although one important algorithm change is still needed (segment merging), the tool is already useful for finding out which data races happened and why. But it will be a challenge to suppress all intended data races (in ld.so, glibc, libstdc++ and libpthread.so). Bart. |
|
From: Nicholas N. <nj...@cs...> - 2006-12-30 22:37:48
|
On Sat, 30 Dec 2006, Bart Van Assche wrote: > A new drd version is available at > http://home.euphonynet.be/bvassche/valgrind/valgrind-6458-drd-2006-12-30.patch.gz. > > * attempted to write a suppression file that suppresses some data races > present in ld.so / libc / libstd++ / libpthread (drd/default.supp). I > learned that suppressing data races by callstack probably won't be too > coarse for the drd tool: several call stacks have to be included for > accesses to the same location, and the danger exists that some of these call > stacks suppress data races that should be reported to the user. Did you mean "probably will be too coarse"? > Summarized: although one important algorithm change is still needed (segment > merging), the tool is already useful for finding out which data races > happened and why. But it will be a challenge to suppress all intended data > races (in ld.so, glibc, libstdc++ and libpthread.so). Good! I think data race detectors generally have a lower signal-to-noise ratio than eg. Memcheck, but it seems people will still be interested. Nick |
|
From: Bart V. A. <bar...@gm...> - 2006-12-31 09:02:50
|
On 12/30/06, Nicholas Nethercote <nj...@cs...> wrote: > > On Sat, 30 Dec 2006, Bart Van Assche wrote: > > > A new drd version is available at > > > http://home.euphonynet.be/bvassche/valgrind/valgrind-6458-drd-2006-12-30.patch.gz > . > > > > * attempted to write a suppression file that suppresses some data races > > present in ld.so / libc / libstd++ / libpthread (drd/default.supp). I > > learned that suppressing data races by callstack probably won't be too > > coarse for the drd tool: several call stacks have to be included for > > accesses to the same location, and the danger exists that some of these > call > > stacks suppress data races that should be reported to the user. > > Did you mean "probably will be too coarse"? Yes. > Summarized: although one important algorithm change is still needed > (segment > > merging), the tool is already useful for finding out which data races > > happened and why. But it will be a challenge to suppress all intended > data > > races (in ld.so, glibc, libstdc++ and libpthread.so). > > Good! I think data race detectors generally have a lower signal-to-noise > ratio than eg. Memcheck, but it seems people will still be interested. > > Nick > My own goal is still to make drd meet or exceed Intel's VTune. The Thread Checker included in VTune doesn't report any false positives as far as I know. Bart. |
|
From: Nicholas N. <nj...@cs...> - 2007-01-02 04:43:46
|
On Sun, 31 Dec 2006, Bart Van Assche wrote: > My own goal is still to make drd meet or exceed Intel's VTune. The Thread > Checker included in VTune doesn't report any false positives as far as I > know. Good to hear it! :) As you now know, there's a lot of work between "getting a tool working" and "getting a tool working well". And that work is the difference between something that is not bad vs. something that is really useful. Nick |
|
From: Bart V. A. <bar...@gm...> - 2007-01-06 12:23:25
|
On 1/2/07, Nicholas Nethercote <nj...@cs...> wrote: > > On Sun, 31 Dec 2006, Bart Van Assche wrote: > > > My own goal is still to make drd meet or exceed Intel's VTune. The > Thread > > Checker included in VTune doesn't report any false positives as far as I > > know. > > Good to hear it! :) As you now know, there's a lot of work between > "getting > a tool working" and "getting a tool working well". And that work is the > difference between something that is not bad vs. something that is really > useful. > I know there is still a lot of work to do before the drd tool will be working as well as the other Valgrind tools. But I might have been too negative about error suppression: the latest version (2007-01-02) error suppression works fairly well. Bart. |
|
From: Duncan S. <bal...@fr...> - 2007-01-02 10:52:18
|
Hi Bart, > A new drd version is available at > http://home.euphonynet.be/bvassche/valgrind/valgrind-6458-drd-2006-12-30.patch.gz. thanks for doing this. I gave it a whirl and it seems to work fine. However it reported a false positive: two threads, thread 1 and thread 2, write to the same memory location in an unsynchronized way, but they write *the same value* to it. Thread 1 then reads the value, which may have last been written by thread 1 or thread 2. This gets reported as a race, even though there is no race since the value read is independent of the order in which the threads wrote it: ==5219== Conflicting load by thread 1 at 0x04413398 size 4 ==5219== at 0x8075A40: system__task_primitives__operations__set_priority (s-taprop.adb:627) <= reads the value ==5219== by 0x8075C8B: system__task_primitives__operations__create_task (s-taprop.adb:787) <= wrote the value ==5219== by 0x80EA0BB: system__tasking__stages__activate_tasks (s-tassta.adb:344) ==5219== by 0x82287DF: notification_support__dispatch___elabb (notification_support-dispatch.adb:5) ==5219== by 0x804C256: adainit (b~smkr.adb:825) ==5219== by 0x804D4A2: main (b~smkr.adb:1619) ==5219== Allocation context: heap, offset 64 in block at 0x4413358 of size 1888 ==5219== at 0x4020516: malloc (vg_replace_malloc.c:207) ==5219== by 0x8073D5F: __gnat_malloc (s-memory.adb:92) ==5219== by 0x8075AF3: system__task_primitives__operations__new_atcb (s-taprop.adb:672) ==5219== by 0x80EA46E: system__tasking__stages__create_task (s-tassta.adb:563) ==5219== by 0x82275EB: notification_support__dispatch__message_taskTKVIP (notification_support-dispatch.adb:55) ==5219== by 0x82287CC: notification_support__dispatch___elabb (notification_support-dispatch.adb:73) ==5219== by 0x804C256: adainit (b~smkr.adb:825) ==5219== by 0x804D4A2: main (b~smkr.adb:1619) ==5219== Other segment start (thread 2) ==5219== at 0x433D3D8: clone (in /lib/tls/i686/cmov/libc-2.5.so) <= writes the value ==5219== Other segment end (thread 2) ==5219== at 0x40007F2: (within /lib/ld-2.5.so) ==5219== by 0x4022720: pthread_mutex_lock (drd_preloaded.c:284) ==5219== by 0x8075511: system__task_primitives__operations__write_lock__2 (s-taprop.adb:333) ==5219== by 0x80753DE: system__task_primitives__operations__lock_rts (s-taprop.adb:209) ==5219== by 0x8075A7C: system__task_primitives__operations__enter_task (s-taprop.adb:653) ==5219== by 0x80EAD08: system__tasking__stages__task_wrapper (s-tassta.adb:1006) ==5219== by 0x402214F: vg_thread_wrapper (drd_preloaded.c:133) ==5219== by 0x425131A: start_thread (in /lib/tls/i686/cmov/libpthread-2.5.so) ==5219== by 0x433D3ED: clone (in /lib/tls/i686/cmov/libc-2.5.so) The context is this: the value in question is the pthread_t thread id of the newly created thread. Both the parent and child need it. The parent gets the value from pthread_create, the child gets it from pthread_self. Once obtained, each of them writes it to a data structure shared between them, memory location 0x04413398 in the above example. After storing the thread id, parent and child call various functions that use it, reading the thread id from the shared data structure; for example system__task_primitives__operations__set_priority reads it (the read reported above). Any thoughts on this? Thanks again and best wishes, Duncan. |
|
From: Nicholas N. <nj...@cs...> - 2007-01-02 21:40:57
|
On Mon, 1 Jan 2007, Duncan Sands wrote: > thanks for doing this. I gave it a whirl and it seems to work fine. > However it reported a false positive: two threads, thread 1 and thread 2, > write to the same memory location in an unsynchronized way, but they write > *the same value* to it. Thread 1 then reads the value, which may have last > been written by thread 1 or thread 2. This gets reported as a race, even > though there is no race since the value read is independent of the order > in which the threads wrote it: That's a tricky case -- drd doesn't know if those two values will always be the same (in which case it shouldn't issue an error message) or if it just got lucky this time (in which case it should). Nick |
|
From: Bart V. A. <bar...@gm...> - 2007-01-02 13:00:01
|
On 1/1/07, Duncan Sands <bal...@fr...> wrote: > Hi Bart, > > > A new drd version is available at > > http://home.euphonynet.be/bvassche/valgrind/valgrind-6458-drd-2006-12-30.patch.gz. > > thanks for doing this. I gave it a whirl and it seems to work fine. However it > reported a false positive: two threads, thread 1 and thread 2, write to the same > memory location in an unsynchronized way, but they write *the same value* to it. > Thread 1 then reads the value, which may have last been written by thread 1 or > thread 2. This gets reported as a race, even though there is no race since the > value read is independent of the order in which the threads wrote it: It wouldn't be easy to recognize this pattern in the drd tool. Is it possible to generate a suppression pattern with Valgrind's option --gen-suppressions=all ? If it is the conflicting accesses are triggered by the Ada implementation, I can add them to the default drd suppression file. Bart. |
|
From: Duncan S. <bal...@fr...> - 2007-01-02 13:50:19
|
On Tuesday 2 January 2007 13:59, Bart Van Assche wrote: > On 1/1/07, Duncan Sands <bal...@fr...> wrote: > > Hi Bart, > > > > > A new drd version is available at > > > http://home.euphonynet.be/bvassche/valgrind/valgrind-6458-drd-2006-12-30.patch.gz. > > > > thanks for doing this. I gave it a whirl and it seems to work fine. However it > > reported a false positive: two threads, thread 1 and thread 2, write to the same > > memory location in an unsynchronized way, but they write *the same value* to it. > > Thread 1 then reads the value, which may have last been written by thread 1 or > > thread 2. This gets reported as a race, even though there is no race since the > > value read is independent of the order in which the threads wrote it: > > It wouldn't be easy to recognize this pattern in the drd tool. Drat. > Is it > possible to generate a suppression pattern with Valgrind's option > --gen-suppressions=all ? If it is the conflicting accesses are > triggered by the Ada implementation, I can add them to the default drd > suppression file. It is indeed in the Ada implementation. The following seems to work: --- default.supp 2007-01-02 14:18:33.000000000 +0100 +++ /usr/local/lib/valgrind/drd/default.supp 2007-01-02 14:42:49.000000000 +0100 @@ -188,3 +188,10 @@ drd:ConflictingAccess fun:_pthread_cleanup_push_defer } +{ + gnat@create_task + drd:ConflictingAccess + fun:system__task_primitives__operations__set_priority + fun:system__task_primitives__operations__create_task + fun:system__tasking__stages__activate_tasks +} Best wishes, Duncan. |
|
From: Duncan S. <bal...@fr...> - 2007-01-02 14:17:11
|
> ... Is it possible to generate a suppression pattern with Valgrind's option > --gen-suppressions=all ? If it is the conflicting accesses are > triggered by the Ada implementation, I can add them to the default drd > suppression file. I've just discovered that the suppression I sent is almost useless, since whether it works or not depends strongly on the compiler version, i.e. while it works for me it most likely will not work for anyone else on the planet... I can try to generate multiple suppressions that cover all known compiler versions, but I'd rather not do that right now. Ciao, Duncan. |
|
From: Julian S. <js...@ac...> - 2007-01-02 14:40:56
|
On Tuesday 02 January 2007 14:16, Duncan Sands wrote: > > ... Is it possible to generate a suppression pattern with Valgrind's > > option --gen-suppressions=all ? If it is the conflicting accesses are > > triggered by the Ada implementation, I can add them to the default drd > > suppression file. > > I've just discovered that the suppression I sent is almost useless, since > whether it works or not depends strongly on the compiler version, i.e. > while it works for me it most likely will not work for anyone else on Yes - suppressions are like that. You can use wildcards (* and ?) in function names, and/or match on object names rather than function names, in an attempt to make the suppressions more robust to small variants in compiler versions. This often works pretty well, at least it has for writing suppressions for memcheck. Have a look in the top level glibc-2.4.supp file for various examples. J |