Re: [Valgrind-developers] run "perf" tests as part of the nightly job

On 09/08/2013 09:09 PM, Philippe Waroquiers wrote:
> Every night, Valgrind is compiled and regression tested on a bunch of
> platforms. However, there is no continuous tracking of the performance.
> => it has been suggested to add in the nightly script a run of
> the performance tests.
> This mail discusses how we could set that up (in particular to decide
> how much additional cpu these perf tests can take). 
> 
> On a slow computer (pentium 4):
>   time perl perf/vg_perf --tools=none,memcheck,cachegrind,callgrind,helgrind,drd,exp-sgcheck,exp-dhat perf
> takes 32 minutes.
> On faster computers such as gcc20 and gcc110 (but without exp-sgcheck, which crashes),
> it takes about 10 minutes.
> See details below.
> 
> If we want more reproducible numbers, we have to give a --reps=... arg.
> Then, to have a comparison with the previous day, it further doubles the number of runs.
> So if we use e.g. --reps=3, this would mean about 3 hours of cpu
> on a slow computer, and 1 hour on faster computers.

The problem with this approach is that you don't notice performance
degradation that creeps in on you. Say 1% a day for several days in a
row. 1% degradation would not get any attention but when it accumulates
over time, it should.
What if we run perf every day and send the results to e.g.
pe...@va...? Whenever mail is received at that address, a little
script runs that reads the perf results and collects them in some
lightweight "database" that we could look at on valgrind.org/perf. It could
be as simple as an HTML table (per platform) with one row per day that
shows, for each tool, the difference to the previous run and the
difference to some baseline run.
Yes, this is more work to set up than just sending the results to the
developers list. But the results might get more attention that way.
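
To make the table idea concrete, here is a rough Python sketch of the per-tool rows: each day's time, the change against the previous run, and the change against a baseline run. Everything in it is an assumption for illustration: the data layout, the function names, the choice of the first run as baseline, and the sample numbers (a made-up 1% per day slowdown). It is not based on the actual vg_perf output format.

```python
import html
from typing import Dict, List, Tuple

# Hypothetical layout: one (date, {tool: seconds}) entry per nightly run,
# oldest first. This is NOT the real vg_perf output format.
Run = Tuple[str, Dict[str, float]]
Row = Tuple[str, float, float, float]


def pct_change(new: float, old: float) -> float:
    """Percentage change of new relative to old (positive means slower)."""
    return (new - old) / old * 100.0


def perf_rows(runs: List[Run], tool: str) -> List[Row]:
    """Return (date, seconds, % vs previous run, % vs baseline) per day.

    Assumption: the first run in the list serves as the baseline.
    """
    baseline = runs[0][1][tool]
    prev = baseline
    rows: List[Row] = []
    for date, times in runs:
        cur = times[tool]
        rows.append((date, cur, pct_change(cur, prev), pct_change(cur, baseline)))
        prev = cur
    return rows


def html_table(tool: str, rows: List[Row]) -> str:
    """Render the rows for one tool as a plain HTML table."""
    lines = [
        "<table>",
        f"<caption>{html.escape(tool)}</caption>",
        "<tr><th>date</th><th>seconds</th>"
        "<th>vs previous</th><th>vs baseline</th></tr>",
    ]
    for date, secs, d_prev, d_base in rows:
        lines.append(
            f"<tr><td>{html.escape(date)}</td><td>{secs:.2f}</td>"
            f"<td>{d_prev:+.1f}%</td><td>{d_base:+.1f}%</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


# Made-up sample data: memcheck gets 1% slower every day for five days.
sample_runs: List[Run] = [
    (f"day{i}", {"memcheck": 100.0 * 1.01 ** i}) for i in range(6)
]
sample_rows = perf_rows(sample_runs, "memcheck")
print(html_table("memcheck", sample_rows))
```

With the sample data, the "vs previous" column never moves beyond about 1%, while the "vs baseline" column climbs past 5% by the last day, which is the kind of creeping degradation a day-to-day comparison alone would not flag.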

> So, a few questions:
> 1. how many --reps? Is 3 ok?

I typically use --reps=5, but if I had a better handle on variations or a
better tool to measure runtime than "time", fewer repetitions might be
possible.
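
One way to get a handle on the variation would be to time the same command several times and look at the spread. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, it measures wall-clock time with time.perf_counter (not CPU time), and it is independent of whatever vg_perf does internally with --reps.

```python
import statistics
import subprocess
import time
from typing import Dict, List, Sequence


def time_command(cmd: Sequence[str], reps: int) -> List[float]:
    """Run cmd reps times and return the wall-clock seconds of each run."""
    samples: List[float] = []
    for _ in range(reps):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Return min, mean, sample standard deviation and its ratio to the mean."""
    mean = statistics.mean(samples)
    # The sample standard deviation needs at least two data points.
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return {
        "min": min(samples),
        "mean": mean,
        "stdev": stdev,
        "cv_pct": stdev / mean * 100.0,
    }
```

For example, summarize(time_command(["true"], 5)) times a trivial command five times. If cv_pct stays small on a given machine, fewer repetitions may be enough there; if it is large, day-to-day differences of a percent or so are probably noise.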

> 2. do we run perf for all tools?
>    or only for non-experimental tools?
>    or for even fewer tools (only none and memcheck)?

Non-experimental tools should be enough.

    Florian