|
From: John C. <joh...@ta...> - 2018-11-05 01:19:27
|
We run our suite of unit tests under valgrind, which is immensely helpful. However, as the suite grows and grows it is starting to take significant time, even when spread across multiple cores.

What techniques can we employ to speed things up (whilst retaining most of the value)? For example, what optimization flags (for compiling the unit tests) have been found helpful? (We want the sum of compilation, link, and unit test run time to be minimized.) Are there any valgrind flags that would speed things up?

Thanks!

--
John Carter
Phone : (64)(3) 358 6639
Tait Electronics
PO Box 1645 Christchurch
New Zealand

-- This Communication is Confidential. We only send and receive email on the basis of the terms set out at www.taitradio.com/email_disclaimer
|
From: John R. <jr...@bi...> - 2018-11-05 04:37:02
|
> What techniques can we employ to speed things up (whilst retaining most of the value)?

In general, memcheck runs faster when it emulates fewer instructions, so compile with -O2 to get smaller code size. The flag -Os means "prefer small code size" but is little used, so there just might be more compiler bugs.

Secondly, do whatever it takes to *avoid* inlining string functions such as strlen etc. You want memcheck to replace the entire logical function call with a memcheck-internal equivalent. Unfortunately the gcc flags -fno-builtin* have documentation that is hard to understand.

The memcheck option --expensive-definedness-checks= already defaults to 'no'.

Specifying --redzone-size=8 might save 16 bytes of memory for each allocation, which helps if there are many small allocations. But the default --alignment=16 is required for common SSE2 instructions [used by other glibc routines on the blocks, etc.], and the redzone is not the only per-block overhead (see --keep-stacktraces= and --num-callers=), so experimentation may be required.

If all you care about is memory leaks, then experiment with --undef-value-errors=no and even non-valgrind tools [such as mtrace (malloc trace)] that are specialized for detecting leaks.
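A sketch of how these suggestions might combine on the command line. The file names (test_suite.c, test_suite) and the exact set of -fno-builtin-* functions are placeholders, not from the thread; the script only prints the commands so it can be inspected without gcc or valgrind installed:

```shell
# Hypothetical build/run recipe combining the advice above.
# test_suite.c / test_suite are placeholder names.
CFLAGS="-O2 -g -fno-builtin-strlen -fno-builtin-strcpy -fno-builtin-memcpy"
VGFLAGS="--redzone-size=8 --num-callers=8 --undef-value-errors=no"

# Printed rather than executed, so the sketch stands alone.
echo "gcc $CFLAGS -o test_suite test_suite.c"
echo "valgrind --tool=memcheck $VGFLAGS ./test_suite"
```

Whether --redzone-size=8 and --undef-value-errors=no are acceptable depends on what errors you still want to catch, so treat the flag sets as a starting point for experimentation.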
|
From: Philippe W. <phi...@sk...> - 2018-11-05 20:41:01
|
On Sun, 2018-11-04 at 20:36 -0800, John Reiser wrote:
> > What techniques can we employ to speed things up (whilst retaining most of the value)?
> The memcheck option --expensive-definedness-checks= already defaults to 'no'.
Note that it defaults to 'auto' in 3.14.
> Specifying --redzone-size=8 might save 16 bytes of memory for each allocation,
> which helps if there are many small allocations. But default --alignment=16
> is required for common SSE2 instructions [used by other glibc routines on the blocks,
> etc.], and the redzone is not the only per-block overhead (see --keep-stacktraces=
> and --num-callers=), so experimentation may be required.
>
> If all you care about is memory leaks, then experiment with --undef-value-errors=no
> and even non-valgrind tools [such as mtrace (malloc trace)] that are specialized
> for detecting leaks.
You might look at the FOSDEM presentation
'Tuning Valgrind for your Workload
Hints, tricks and tips to effectively use Valgrind on small or big
applications'
https://archive.fosdem.org/2015/schedule/event/valgrind_tuning/
for other suggestions.
Philippe
|
|
From: John C. <joh...@ta...> - 2018-11-05 23:18:04
|
Thanks for the replies...

Tweaking the optimization settings did very little for the running time. However, a huge difference (a factor of 2) comes from taking out --show-reachable=yes --track-origins=yes, except they are very, very useful... so I sort of don't want to.

--
John Carter
Phone : (64)(3) 358 6639
Tait Electronics
PO Box 1645 Christchurch
New Zealand
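One way to keep both the speed and the value is to split the suite into a fast everyday pass and a thorough (slower) pass, e.g. for nightly builds. A minimal sketch, assuming the suite is a single binary (./test_suite is a placeholder name; the flag sets are drawn from this thread, and the commands are printed rather than executed):

```shell
# Hypothetical split: a fast everyday pass without the expensive flags,
# and a full pass that keeps --track-origins / --show-reachable.
# ./test_suite is a placeholder for the real test binary.
FAST_FLAGS="--undef-value-errors=no --leak-check=summary"
FULL_FLAGS="--track-origins=yes --show-reachable=yes --leak-check=full"

# Printed rather than executed, so the sketch stands alone.
echo "fast: valgrind $FAST_FLAGS ./test_suite"
echo "full: valgrind $FULL_FLAGS ./test_suite"
```

The fast pass still catches invalid reads/writes and definite leaks; origin tracking and reachable-block reporting only cost you on the runs where you actually need them.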
|
From: Julian S. <js...@ac...> - 2018-11-06 08:13:27
|
On 05/11/18 01:49, John Carter wrote:
> For example, what optimization flags (for compiling the unit tests) have been found helpful?

What flags are you using at the moment for your unit tests? I have tended to use -Og -g as a reasonable tradeoff between debuggability and performance (which is its intended aim anyway). With gcc this works quite well; I've had more mixed performance results with clang using -g -Og.

Probably the most important thing is to avoid building your test cases with -O0 (that is, no optimisation at all). That causes gcc, at least, to produce very poor code, involving many unnecessary memory references, which makes Memcheck run very slowly. Even -Og, which is the lowest level of optimisation one can ask for above "none", drastically reduces memory traffic and thereby makes Memcheck run significantly faster.

J
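The tradeoff above is easy to measure directly. A sketch of timing one test binary at each optimisation level (test_suite.c / test_suite are placeholder names; the commands are printed rather than executed so the sketch stands alone):

```shell
# Hypothetical comparison of -O0 / -Og / -O2 builds under memcheck.
# Time each combination and pick the one that minimises
# compile + link + valgrind run time for your suite.
for OPT in -O0 -Og -O2; do
  echo "gcc $OPT -g -o test_suite test_suite.c"
  echo "time valgrind --tool=memcheck ./test_suite"
done
```

Measuring on your own suite matters because, as noted earlier in the thread, results differ between gcc and clang and between code bases.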