|
From: Bart V. A. <bar...@gm...> - 2007-02-20 12:03:53
|
Hello,
I started again working on the drd tool where I stopped last December:
analyzing the cause of false positives caused by signal delivery. These
false positives might be caused by the way the Valgrind core delivers
signals to clients. I have written a small client program to analyze this
issue (see also attachment). Basically it does the following:
- main thread:
* install a signal handler for SIGALRM.
* create thread 2
* wait until thread 2 is running
* send SIGALRM to thread 2 (using pthread_kill).
* join thread 2
- thread 2:
* call clock_nanosleep() with an interval length of 10s. This call gets
interrupted by the pthread_kill() invoked by the main thread, and it is on
the context of this thread that the signal handler gets called.
When I run this client program through drd then a data race is reported on
the arguments passed to the signal handler, which I do not understand. I
started analyzing this and made the following observations:
1. The first time drd_trace_store() reports an access to location 0x4aa9dc0
Valgrind's core has set the thread ID to 1. This can't be correct -- if you
look at the call stack, you can see that this is a call stack of thread 2.
Further analysis has shown that tracing this store action is triggered by
vgPlain_sigframe_create() (I have inserted a tl_assert(0) in the store trace
code).
2. At the time the signal handler is called, apparently the thread ID is
(correctly) set to 2.
3. drd reports a false positive because Valgrind's core had told it that a
store and load have been performed on the same location but with different
thread ID's.
My questions to the other Valgrind developers are:
- Do you agree with my analysis ?
- If so, would it be hard to make sure that the thread ID is set correctly
before vgPlain_sigframe_create() is called ?
Full drd output:
./vg-in-place --tool=drd --trace-address=$(echo $((0x4aa9dc0)))
drd/tests/sigalrm
==20274== drd, a data race detector.
==20274== Copyright (C) 2006, and GNU GPL'd, by Bart Van Assche.
THIS SOFTWARE IS A PROTOTYPE, AND IS NOT YET RELEASED
==20274== Using LibVEX rev 1680, a library for dynamic binary translation.
==20274== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==20274== Using valgrind-3.3.0.SVN, a dynamic binary instrumentation
framework.
==20274== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==20274== For more details, rerun with: -v
==20274==
main: kernel thread ID 20274 / Valgrind thread ID 1
thread: kernel thread ID 20275 / Valgrind thread ID 2
==20274== store 0x4aa9dc0 size 720 thread 1
==20274== at 0x405F536: clock_nanosleep (in /lib/librt-2.5.so)
==20274== by 0x8048BE2: thread_func(void*) (sigalrm.cpp:50)
==20274== by 0x4024E8C: vg_thread_wrapper (drd_preloaded.c:133)
==20274== by 0x4048111: start_thread (in /lib/libpthread-2.5.so)
==20274== by 0x423A2ED: clone (in /lib/libc-2.5.so)
==20274== load 0x4aa9dc0 size 4 thread 2
==20274== at 0x8048974: SignalHandler(int) (sigalrm.cpp:42)
==20274== by 0x8048BE2: thread_func(void*) (sigalrm.cpp:50)
==20274== by 0x4024E8C: vg_thread_wrapper (drd_preloaded.c:133)
==20274== by 0x4048111: start_thread (in /lib/libpthread-2.5.so)
==20274== by 0x423A2ED: clone (in /lib/libc-2.5.so)
==20274== Thread 2:
==20274== Conflicting load by thread 2 at 0x04aa9dc0 size 4
==20274== at 0x8048974: SignalHandler(int) (sigalrm.cpp:42)
==20274== by 0x8048BE2: thread_func(void*) (sigalrm.cpp:50)
==20274== by 0x4024E8C: vg_thread_wrapper (drd_preloaded.c:133)
==20274== by 0x4048111: start_thread (in /lib/libpthread-2.5.so)
==20274== by 0x423A2ED: clone (in /lib/libc-2.5.so)
==20274== Allocation context: stack of thread 2, offset -4672
==20274== Other segment start (thread 1)
==20274== at 0x423A2D8: clone (in /lib/libc-2.5.so)
==20274== by 0x4048818: pthread_create@@GLIBC_2.1 (in /lib/libpthread-
2.5.so)
==20274== by 0x4024C5A: pthread_create@* (drd_preloaded.c:191)
==20274== by 0x8048AE8: main (sigalrm.cpp:69)
==20274== Other segment end (thread 1)
==20274== at 0x4049517: pthread_join (in /lib/libpthread-2.5.so)
==20274== by 0x4023FDE: pthread_join (drd_preloaded.c:220)
==20274== by 0x8048B3F: main (sigalrm.cpp:75)
==20274== end 0x4aa9dc0 size 4 thread 2
==20274== at 0x8048974: SignalHandler(int) (sigalrm.cpp:42)
==20274== by 0x8048BE2: thread_func(void*) (sigalrm.cpp:50)
==20274== by 0x4024E8C: vg_thread_wrapper (drd_preloaded.c:133)
==20274== by 0x4048111: start_thread (in /lib/libpthread-2.5.so)
==20274== by 0x423A2ED: clone (in /lib/libc-2.5.so)
==20274==
==20274== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 15 from 5)
This is the call stack at the time of the first "trace store" action in the
above output:
==19613== by 0x3800248F: drd_trace_store (drd_main.c:182)
==19613== by 0x38002576: drd_post_mem_write (drd_main.c:265)
==19613== by 0x3805F9F0: vgPlain_sigframe_create (sigframe-x86-linux.c
:494)
==19613== by 0x380188B4: deliver_signal (m_signals.c:1053)
==19613== by 0x38018A79: async_signalhandler (m_signals.c:1605)
==19613== by 0x380176EF: (within
/home/bart/software/valgrind-svn/drd/drd-x86-linux)
Regards,
Bart Van Assche.
|