From: Philippe W. <phi...@sk...> - 2012-03-08 20:13:23
On Thu, 2012-03-08 at 19:43 +0000, Bart Van Assche wrote:
> On 02/22/12 22:05, Julian Seward wrote:
> >> I have annotated the pipe lock and the futex lock with RWLOCK
> >> annotations. With this, helgrind detects some (but not many) possible
> >> data races (on the current trunk i.e. the "single threaded" version).
> >
> > Quick question -- can you post some of the races? I am interested
> > to see what it found.
>
> The number of races reported should have been reduced significantly for
> r12437. I haven't analyzed the remaining reports yet. Here is an example
> of what is still reported with drd as outer and as inner tool and for
> client program drd/tests/tsan_unittest 3:

Nice work. In some cases, the outer Valgrind is detecting "bugs" in the
program executed by the inner Valgrind. At least, that is the conclusion I
reached after analysing in depth a race condition reported when running the
parallel sleeper test: the address given in the race report from the outer
Valgrind was the address of a variable of the program executed by the inner
Valgrind, and that variable was indeed not properly protected. As the code
JIT-ted by the inner Valgrind contains no debug info, the outer Valgrind
cannot produce a proper stack trace for it. No idea whether the report below
is the same case.

A side note on the outer/inner work I am doing: I am currently having
headaches running the regression tests in an outer/inner setup. All the
32-bit tests fail when run on a 64-bit bi-arch platform: something nasty in
the aspacemgr (the same tests work fine on 32-bit Fedora x86). Apart from
this, the 64-bit tests work reasonably well. If I cannot solve the
32-bit-on-64-bit problem this weekend, I will commit in the current state.

Philippe

> ==29763== Conflicting store by thread 1 at 0x00638664 size 4
> ==29763==    at 0x280C8945: ??? (syscall-amd64-linux.S:147)
> ==29763==    by 0x7: ???
> ==29763==    by 0x3F66D0E1F: ???
> ==29763==    by 0x3F66D0E2F: ???
> ==29763==    by 0x28C6E97F: ???
> ==29763==    by 0xC9: ???
> ==29763==    by 0xC9: ???
> ==29763==    by 0x2902D17F: ???
> ==29763==    by 0xAF: ???
> ==29763== Allocation context: BSS section of
>    /home/bart/software/valgrind.git/drd/tests/tsan_unittest
> ==29763== Other segment start (thread 2)
> ==29763==    at 0x280C87E7: vgModuleLocal_sema_up (sema.c:144)
> ==29763==    by 0x2807A328: vgPlain_release_BigLock (scheduler.c:302)
> ==29763==    by 0x2807CDEB: vgPlain_client_syscall (syswrap-main.c:1470)
> ==29763==    by 0x28079CCF: handle_syscall (scheduler.c:957)
> ==29763==    by 0x2807AEC9: vgPlain_scheduler (scheduler.c:1179)
> ==29763==    by 0x2808AD0E: run_a_thread_NORETURN (syswrap-linux.c:102)
> ==29763==    by 0x2808B05A: vgModuleLocal_start_thread_NORETURN (syswrap-linux.c:290)
> ==29763==    by 0x280A8A7D: ??? (in /home/bart/software/valgrind/drd/drd-amd64-linux)
>
> And this is what is reported for the same program with helgrind as outer
> and none as inner tool:
>
> ==30505== ---Thread-Announcement------------------------------------------
> ==30505==
> ==30505== Thread #2 was created
> ==30505==    at 0x280878C2: ??? (in /home/bart/software/valgrind/none/none-amd64-linux)
> ==30505==    by 0x2808AD08: vgSysWrap_amd64_linux_sys_clone_before (syswrap-amd64-linux.c:306)
> ==30505==    by 0x2805BB57: vgPlain_client_syscall (syswrap-main.c:1382)
> ==30505==    by 0x28058B1F: handle_syscall (scheduler.c:957)
> ==30505==    by 0x28059D19: vgPlain_scheduler (scheduler.c:1179)
> ==30505==    by 0x28069B5E: run_a_thread_NORETURN (syswrap-linux.c:102)
> ==30505==
> ==30505== ---Thread-Announcement------------------------------------------
> ==30505==
> ==30505== Thread #1 is the program's root thread
> ==30505==
> ==30505== ----------------------------------------------------------------
> ==30505==
> ==30505== Lock at 0x3F18011B0 was first observed
> ==30505==    at 0x280B150E: vgModuleLocal_sema_init (sema.c:79)
> ==30505==    by 0x2805ACFF: create_sched_lock (sched-lock-generic.c:55)
> ==30505==    by 0x28058DA2: init_BigLock (scheduler.c:308)
> ==30505==    by 0x28059568: vgPlain_scheduler_init_phase1 (scheduler.c:566)
> ==30505==    by 0x2801DEBA: valgrind_main (m_main.c:2013)
> ==30505==    by 0x28021755: _start_in_C_linux (m_main.c:2799)
> ==30505==    by 0x2801C510: ??? (in /home/bart/software/valgrind/none/none-amd64-linux)
> ==30505==
> ==30505== Possible data race during write of size 4 at 0x638664 by thread #2
> ==30505== Locks held: 1, at address 0x3F18011B0
> ==30505==    at 0x3F65C8730: ???
> ==30505==    by 0x183CA: ???
> ==30505==    by 0x28C3C72F: ???
> ==30505==    by 0x3F800FF4F: ???
> ==30505==    by 0x3F800FEBF: ???
> ==30505==    by 0x28C3C71F: ???
> ==30505==    by 0xD968621: pthread_cond_signal@@GLIBC_2.3.2 (pthread_cond_signal.S:52)
> ==30505==
> ==30505== This conflicts with a previous read of size 4 by thread #1
> ==30505== Locks held: none
> ==30505==    at 0x280B1AF5: ??? (syscall-amd64-linux.S:147)
> ==30505==    by 0x7: ???
> ==30505==    by 0x3F653EE1F: ???
> ==30505==    by 0x3F653EE2F: ???
> ==30505==    by 0x28C3AE9F: ???
> ==30505==    by 0xC9: ???
> ==30505==    by 0xC9: ???
> ==30505==    by 0x28FF969F: ???
>
> Bart.