From: Arndt M. <amu...@is...> - 2007-03-01 11:41:59
Arndt Muehlenfeld schrieb:
> Nicholas Nethercote schrieb:
>> On Tue, 27 Feb 2007, Bart Van Assche wrote:
>>
>>>> There was a paper at the ASPLOS conference last year about a race
>>>> detector; they compared it to Helgrind and found Helgrind was about
>>>> that slow.
>>>>
>>>> Memcheck is 20--30x slower, but it is a lot more tuned than Helgrind.
>>>
>>> Are you referring to this paper:
>>> http://valgrind.org/docs/muehlenfeld2006.pdf ?
>>
>> No. The paper is called "AVIO: Detecting atomicity violations via
>> access interleaving invariants", by Lu, Tucek, Qin and Zhou. They
>> measured Helgrind on four programs and got an average slowdown of 694x.
>
> Unfortunately, the authors said neither what parameters they used for
> the benchmarks nor how many cores (4?) they used.
> I tried fft, one of the SPLASH-2 kernel benchmarks, and got the
> following results (3 GHz P4, 1 GB memory):
>
>   fft -m22                              2.7s
>   valgrind --tool=none fft -m22        28.2s  (10.4x)
>   valgrind --tool=helgrind fft -m22   221.5s  (82.0x)
>
> I tried every value for -p (number of threads) less than 100 with no
> change.

Here I was wrong yesterday, due to a bug in my test script. In fact, the
runtime for Helgrind does depend on the number of threads, but even with
32 threads the time is just 8m 1.5s (188x), which is still far from the
1217x claimed in the paper. (Although VG_N_THREADS is set to 100,
Helgrind does not work with 64 threads.)

Another possible reason for higher overhead values (besides exhausting
physical memory) could be setting --error-limit=no: Helgrind finds
thousands of false positives for this benchmark, and reporting them all
slows down the program significantly.

Arndt
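For reference, the slowdown factors quoted above are just the ratio of each tool's wall-clock time to the native run. A minimal shell sketch of that arithmetic (the `slowdown` helper is hypothetical, not part of any benchmark script; `awk` does the floating-point division):

```shell
#!/bin/sh
# Hypothetical helper: compute a slowdown factor from two wall-clock
# times, base run first, instrumented run second.
slowdown() {
    # Divide tool time by native time and round to one decimal place.
    awk -v base="$1" -v tool="$2" 'BEGIN { printf "%.1fx\n", tool / base }'
}

slowdown 2.7 28.2     # --tool=none run, prints 10.4x
slowdown 2.7 221.5    # --tool=helgrind run, prints 82.0x
```

The same division applied to the 32-thread Helgrind time would use 481.5 seconds (8m 1.5s) as the second argument.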