From: Mike B. <mbr...@vi...> - 2003-06-05 22:22:41
|
I am new to valgrind, but I have used Rational Quantify. Excuse me if this is a FAQ, but I could not find anything in the FAQ nor in the mailing list archive.

I am curious why valgrind chooses to instrument on-the-fly every time the application is executed, rather than instrumenting the application ahead of time and re-executing the same instrumented executable each time, a la Quantify. To my perhaps naive mind, instrumenting on-the-fly seems less efficient. Can someone lend some insight?

Mike |
From: Alex I. <ale...@in...> - 2003-06-05 22:41:34
|
Valgrind, unlike the Rational suite tools, does not "instrument" anything. That's the beauty of it - you don't need to recompile (or, in the case of Quantify, re-link) the application. The difference is that the Rational tools embed extra code inside your object files and libraries - for memory or performance profiling. As a result, the executable and shared library sizes are huge and execution suffers greatly (especially with Purify). Valgrind, on the other hand, runs your application on a synthetic CPU; all the machine instructions of your app are executed by Valgrind on this CPU, and memory/performance profiling is done there, not in your app code. As for the Quantify comparison, the cachegrind skin gives you much more information - e.g. Quantify does not report a cache usage profile.

Last but not least - Valgrind is free, has MUCH better developer support, and is updated more frequently. The downside: it only runs on x86 Linux, exactly because of the synthetic CPU - porting it to other platforms is too much work, so the developers chose the most popular platform.

Regards,
Alex

--
Alex G. Ivershen, Inet Technologies, Inc. |
From: Mike B. <mbr...@vi...> - 2003-06-05 23:00:30
|
Perhaps I misunderstand how valgrind works. My understanding is that, using the synthetic CPU, valgrind instruments the application on the fly and stores the instrumented code blocks in a cache. The author describes this process here: http://developer.kde.org/~sewardj/docs-1.9.5/mc_techdocs.html. Do I misunderstand?

Note that Quantify for Windows does not require a relink. You just point it at an executable and off it goes. First it instruments the executable and dependent DLLs, and then it executes it. The UNIX version requires a relink - or at least it did when I last used it in the mid-1990s on HPUX.

Mike Bresnahan

----- Original Message -----
From: "Alex Ivershen" <ale...@in...>
Sent: Thursday, June 05, 2003 5:41 PM
Subject: Re: [Valgrind-users] Pre-instrumented vs JIT instrumentation
[quoted text trimmed] |
From: Nicholas N. <nj...@ca...> - 2003-06-06 09:08:16
|
On Thu, 5 Jun 2003, Mike Bresnahan wrote:
> Perhaps I misunderstand how valgrind works. I have the understanding
> that, using the synthetic CPU, valgrind instruments the application on
> the fly and stores the instrumented code blocks in a cache. The author
> describes this process here:
> http://developer.kde.org/~sewardj/docs-1.9.5/mc_techdocs.html
>
> Do I misunderstand?

You are right, Valgrind instruments the app dynamically. I think when Alex said Valgrind doesn't "instrument" anything, he meant that you don't have to do anything explicit, in a separate step.

> Note that Quantify for Windows does not require a relink. You just point
> it at an executable and off it goes. First it instruments the executable
> and dependent DLLs and then it executes it.

Valgrind's approach -- dynamically compiling and instrumenting all code every time -- is the most convenient for users. No worrying about whether an executable is instrumented or anything.

As for efficiency, it would be possible to have Valgrind save the code it generates, so you could reuse it the next time. But the dynamic compilation + instrumentation phase typically only takes up about 10% of the execution time for Valgrind (the Memcheck skin, at least), so doing this wouldn't help performance much; it would make the implementation more complicated, and possibly make life more difficult for users.

Actually, thinking more, Valgrind's just-in-time approach has another efficiency advantage -- it only instruments the code that gets executed. This is good (especially space-wise) if you only use a small fraction of a great big library.

N |
From: Mike B. <mbr...@vi...> - 2003-06-06 15:26:58
|
> Valgrind's approach -- dynamically compiling and instrumenting all code
> every time -- is the most convenient for users. No worrying about
> whether an executable is instrumented or anything.

This is also true of the other approach. The instrumented code blocks can be kept in a disk cache which is consulted each time the executable is executed with profiling enabled. There's no extra step for the user: the user clicks the same button regardless of whether the application was previously instrumented.

> As for efficiency, it would be possible to have Valgrind save the code
> it generates, so you could reuse it the next time.

This is what Quantify does.

> But the dynamic compilation + instrumentation phase typically only
> takes up about 10% of the execution time for Valgrind (the Memcheck
> skin, at least) so doing this wouldn't help performance much, but it
> would make the implementation more complicated, and possibly make life
> more difficult for users.

I don't understand why it must be more difficult for the user. Do you know what percentage of time that phase takes when using the cache profiling skin?

BTW, why are they called "skins"? It makes it sound like something graphical.

> Actually, thinking more, Valgrind's just-in-time approach has another
> efficiency advantage -- it only instruments the code that gets
> executed. This is good (especially space-wise) if you only use a small
> fraction of a great big library.

I agree this is an efficiency gain if you profile the executable only once or perhaps a handful of times. However, the efficiency gain is wiped away if you execute the same executable enough times. In any case, if it's only a 10% difference in performance, I'm not going to worry about it too much.

Thanks for the responses.

Mike Bresnahan |
From: Nicholas N. <nj...@ca...> - 2003-06-06 16:12:32
|
On Fri, 6 Jun 2003, Mike Bresnahan wrote:
> > But the dynamic compilation + instrumentation phase typically only
> > takes up about 10% of the execution time for Valgrind (the Memcheck
> > skin, at least) so doing this wouldn't help performance much, but it
> > would make the implementation more complicated, and possibly make
> > life more difficult for users.
>
> I don't understand why it must be more difficult for the user.

If the instrumented version replaces the original version, they don't have their uninstrumented version anymore. If the instrumented version is saved in a second file, they have extra files to deal with. It's not necessarily much more difficult for the user.

> Do you know what percentage of time that phase takes when using the
> cache profiling skin?

It's hard to give an answer because it can vary quite a bit. But, as an example, I just tried bunzip2'ing a 600kb file. 40% of the time was spent running the instrumented code. 57.5% of the time was spent in the cache simulation functions. The time spent compiling and instrumenting was less than 0.1%.

If you want to know more, you can enable Valgrind's internal profiling by including the line:

  #include "vg_profile.c"

in a skin (they all have it in there, just commented out). Recompile, use --profile=yes, and you get a breakdown of where the time was spent.

> BTW, why are they called "skins"? It makes it sound like something
> graphical.

Because it's a short name (I didn't want to call them "instrumentation plug-ins" or somesuch) and I couldn't think of anything better, and now it has stuck :)

> > Actually, thinking more, Valgrind's just-in-time approach has another
> > efficiency advantage -- it only instruments the code that gets
> > executed. This is good (especially space-wise) if you only use a
> > small fraction of a great big library.
>
> I agree this is an efficiency gain if you profile the executable only
> once or perhaps a handful of times. However the efficiency gain is
> wiped away if you execute the same executable enough times.

Assuming you aren't changing and recompiling the program frequently...

N |