|
From: Arndt M. <amu...@is...> - 2007-02-27 10:02:03
Attachments:
amuehlen.vcf
|
Yes, it's me! The one who is still using Helgrind (valgrind 2.2.0).

Yesterday I ran some tests to evaluate the performance of Helgrind, and now I am a bit puzzled. The application I used for the test is a UDP server that receives requests, processes them, and sends back replies. Without valgrind, the application handles approx. 145 cps (calls per second) on my system. When run with valgrind --tool=none, the rate drops to 17 cps, which fits the 8-10x slowdown I read about somewhere. But running valgrind with --tool=helgrind I get only 0.5 cps! That is a slowdown of ~300x! I always thought the slowdown was in the range of 20-30x.

Now I would like to know whether anybody has results from performance tests that confirm or contradict this result.

Thanks,
Arndt |
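As a quick check of the figures above, the slowdown factors can be recomputed from the reported throughput. This is only a sketch: the function name `slowdown` and the constant names are made up for illustration, and the three rates are simply the ones quoted in the message.

```python
def slowdown(native_rate, instrumented_rate):
    """Return how many times slower the instrumented run is."""
    if instrumented_rate <= 0:
        raise ValueError("instrumented rate must be positive")
    return native_rate / instrumented_rate


# Throughput in calls per second, as reported in the message
NATIVE_CPS = 145.0
NONE_TOOL_CPS = 17.0
HELGRIND_CPS = 0.5

for label, cps in [("--tool=none", NONE_TOOL_CPS),
                   ("--tool=helgrind", HELGRIND_CPS)]:
    print(f"{label}: {slowdown(NATIVE_CPS, cps):.1f}x slower")
```

With these inputs the none tool comes out at roughly 8.5x and Helgrind at 290x, which is the "~300" mentioned above.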
|
From: Nicholas N. <nj...@cs...> - 2007-02-27 10:32:21
|
On Tue, 27 Feb 2007, Arndt Muehlenfeld wrote:

> Yes it's me!
> The one who is still using Helgrind (valgrind 2.2.0).
> Yesterday I did some tests to evaluate the performance of Helgrind
> and now I am a bit puzzled.
> The application I used for the test is a UDP server that receives requests,
> processes it and sends back a reply.
> Without valgrind the application processes approx. 145 cps (calls-per-second)
> on my system.
> When run with valgrind --tool=none the number is reduced to 17 cps,
> which fits the 8-10x slowdown I read somewhere.
> But running valgrind with tool helgrind I get only 0.5 cps!
> This is a slowdown of ~300!
> I always thought the slowdown is in the range of 20-30.
>
> Now I would like to know if anybody has results from performance tests
> that affirm or disprove this result.

There was a paper at the ASPLOS conference last year about a race detector; the authors compared it to Helgrind and found Helgrind was about that slow.

Memcheck is 20-30x slower, but it is a lot more tuned than Helgrind.

Nick |
|
From: Bart V. A. <bar...@gm...> - 2007-02-27 10:45:08
|
On 2/27/07, Nicholas Nethercote <nj...@cs...> wrote:
> On Tue, 27 Feb 2007, Arndt Muehlenfeld wrote:
>
> There was a paper at the ASPLOS conference last year about a race detector,
> they compared it to Helgrind and found Helgrind was about that slow.
>
> Memcheck is 20--30x slower, but it is a lot more tuned than Helgrind.

Are you referring to this paper: http://valgrind.org/docs/muehlenfeld2006.pdf ?

Bart. |
|
From: Nicholas N. <nj...@cs...> - 2007-02-27 12:23:03
|
On Tue, 27 Feb 2007, Bart Van Assche wrote:

>> There was a paper at the ASPLOS conference last year about a race detector,
>> they compared it to Helgrind and found Helgrind was about that slow.
>>
>> Memcheck is 20--30x slower, but it is a lot more tuned than Helgrind.
>
> Are you referring to this paper: http://valgrind.org/docs/muehlenfeld2006.pdf
> ?

No. The paper's called "AVIO: Detecting atomicity violations via access interleaving invariants", by Lu, Tucek, Qin and Zhou. They measured Helgrind on four programs and got an average slow-down of 694x.

Nick |
|
From: Bart V. A. <bar...@gm...> - 2007-02-27 12:33:06
|
On 2/27/07, Nicholas Nethercote <nj...@cs...> wrote:
>
> No. The paper's called "AVIO: Detecting atomicity violations via access
> interleaving invariants", by Lu, Tucek, Qin and Zhou. They measured
> Helgrind on four programs and got an average slow-down of 694x.

Thanks. In the meantime I found a copy at the following location:
http://www.cse.ohio-state.edu/~qin/pub-papers/2006andbefore/asplos062-lu.pdf

Bart. |
|
From: Arndt M. <amu...@is...> - 2007-02-27 12:46:56
|
Bart Van Assche wrote:
> On 2/27/07, Nicholas Nethercote <nj...@cs...> wrote:
>
>> No. The paper's called "AVIO: Detecting atomicity violations via access
>> interleaving invariants", by Lu, Tucek, Qin and Zhou. They measured
>> Helgrind on four programs and got an average slow-down of 694x.
>>
>
> Thanks. By this time I found a copy at the following location:
> http://www.cse.ohio-state.edu/~qin/pub-papers/2006andbefore/asplos062-lu.pdf
>

scholar.google.com found it, too. Thanks for the title, Nicholas.

It is a pity that they didn't provide timing results for Apache and MySQL. Still, 694x is a lot. Looks like there is room for optimization...

Arndt |
|
From: Arndt M. <amu...@is...> - 2007-02-28 14:28:19
|
Nicholas Nethercote wrote:
> On Tue, 27 Feb 2007, Bart Van Assche wrote:
>
>>> There was a paper at the ASPLOS conference last year about a race
>>> detector,
>>> they compared it to Helgrind and found Helgrind was about that slow.
>>>
>>> Memcheck is 20--30x slower, but it is a lot more tuned than Helgrind.
>>
>> Are you referring to this paper:
>> http://valgrind.org/docs/muehlenfeld2006.pdf ?
>
> No. The paper's called "AVIO: Detecting atomicity violations via
> access interleaving invariants", by Lu, Tucek, Qin and Zhou. They
> measured Helgrind on four programs and got an average slow-down of 694x.

Unfortunately, the authors state neither which parameters they used for the benchmarks nor how many cores (4?) they used.

I tried fft, one of the SPLASH-2 kernel benchmarks, and got the following results (3GHz P4, 1G memory):

fft -m22                               2.7s
valgrind --tool=none fft -m22         28.2s (10.4x)
valgrind --tool=helgrind fft -m22    221.5s (82.0x)

I tried every value for -p (number of threads) less than 100, with no change. In the paper they claim that the overhead for fft is 1217x!

Memory usage doubles under Helgrind: 392M instead of 197M. So when I tried -m24, Helgrind used 1.5G instead of 768M, which resulted in horrible swapping, probably giving a factor of 1000+. I don't know exactly, because I interrupted the test after a couple of minutes.

Other tests I did showed an overhead between 30x and 50x. I think the reported overhead of 694x is too high for applications that fit in memory.

I will investigate tomorrow whether my results with the network server have the same cause: swapping due to increased memory usage.

Arndt |
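The swapping argument above can be sketched as a simple footprint check. This is an illustration under stated assumptions, not a measurement: "1G" is taken as 1024 MB, "1.5G" as 1536 MB, memory needed by the OS and other processes is ignored, and the names `RAM_MB`, `FOOTPRINTS_MB` and `fits_in_ram` are hypothetical.

```python
RAM_MB = 1024  # "1G memory" from the message, taken here as 1024 MB

# (problem size, footprint without Helgrind, footprint under Helgrind), in MB
FOOTPRINTS_MB = [
    ("-m22", 197, 392),
    ("-m24", 768, 1536),  # "1.5G" taken as 1536 MB
]


def fits_in_ram(footprint_mb, ram_mb=RAM_MB):
    """True if the footprint fits into physical memory (OS overhead ignored)."""
    return footprint_mb <= ram_mb


for size, native_mb, helgrind_mb in FOOTPRINTS_MB:
    growth = helgrind_mb / native_mb
    verdict = "fits in RAM" if fits_in_ram(helgrind_mb) else "exceeds RAM, will swap"
    print(f"fft {size}: {native_mb}M -> {helgrind_mb}M ({growth:.1f}x), {verdict}")
```

Under these assumptions the -m22 run stays within physical memory in both cases, while the -m24 run fits natively but not under Helgrind, which matches the swapping described above.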
|
From: Arndt M. <amu...@is...> - 2007-03-01 11:41:59
|
Arndt Muehlenfeld wrote:
> Nicholas Nethercote wrote:
>> On Tue, 27 Feb 2007, Bart Van Assche wrote:
>>
>>>> There was a paper at the ASPLOS conference last year about a race
>>>> detector,
>>>> they compared it to Helgrind and found Helgrind was about that slow.
>>>>
>>>> Memcheck is 20--30x slower, but it is a lot more tuned than Helgrind.
>>>
>>> Are you referring to this paper:
>>> http://valgrind.org/docs/muehlenfeld2006.pdf ?
>>
>> No. The paper's called "AVIO: Detecting atomicity violations via
>> access interleaving invariants", by Lu, Tucek, Qin and Zhou. They
>> measured Helgrind on four programs and got an average slow-down of 694x.
> Unfortunately, the authors did neither say what parameters they used
> for the benchmarks nor how many cores (4?) they used.
> I tried fft, one of the SPLASH-2 kernel benchmarks and got the
> following results (3GHz P4, 1G memory):
>
> fft -m22                               2.7s
> valgrind --tool=none fft -m22         28.2s (10.4x)
> valgrind --tool=helgrind fft -m22    221.5s (82.0x)
>
> I tried every value for -p (number of threads) less than 100 with no
> change.
>

Here I was wrong yesterday, due to a bug in my test script. In fact, the runtime under Helgrind does depend on the number of threads, but even with 32 threads it is just 8m 1.5s (188x), which is still far from the 1217x claimed in the paper. (Although VG_N_THREADS is set to 100, Helgrind doesn't work with 64 threads.)

Another possible reason for higher overhead values (besides exhausting physical memory) could be setting --error-limit=no: Helgrind finds thousands of false positives for the benchmark, and reporting them all slows the program down significantly.

Arndt |
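For comparing the new figure with the earlier ones, the timing can be converted to seconds and related to a baseline. The sketch below is only arithmetic on the numbers in the thread; `parse_duration` is a hypothetical helper, and the message does not say which native runtime the 188x refers to.

```python
import re


def parse_duration(text):
    """Convert a string such as '8m 1.5s' or '221.5s' to seconds."""
    match = re.fullmatch(r"\s*(?:(\d+)m)?\s*(?:(\d+(?:\.\d+)?)s)?\s*", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = float(match.group(2) or 0.0)
    return minutes * 60 + seconds


helgrind_32_threads = parse_duration("8m 1.5s")  # 481.5 seconds
earlier_native = parse_duration("2.7s")          # native fft -m22 from the earlier mail

# Slowdown relative to the earlier 2.7s native run
print(f"vs. 2.7s baseline: {helgrind_32_threads / earlier_native:.1f}x")

# Native runtime that the reported 188x would imply
print(f"baseline implied by 188x: {helgrind_32_threads / 188:.2f}s")
```

Measured against the 2.7s run, 8m 1.5s is about 178x; the reported 188x corresponds to a native runtime of about 2.56s, possibly the native time with 32 threads, though that is a guess. Either way the result is far below 1217x.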