|
From: Avery P. <ape...@ni...> - 2004-03-22 17:27:41
|
Hi all,

I just upgraded to Debian's valgrind 2.1.1-2 package, which includes massif, and I thought I would try it out while looking for a memory "leak" (actually growth of still-reachable memory usage over time) in one of my programs. Without going into too many details (since I haven't found my leak yet :)), I have these suggestions:

- In the postscript output, don't chop function names so much. It's okay in C, but terrible in C++. Several of my biggest memory users are of the form "x81161ECD:MySillyClassNa", which is too vague for my purposes :)

- C++ programs, particularly when compiled without optimizations, have a tendency to go through several function calls on their way to the innermost function that massif reports, yet these function calls are almost always the same (particularly in test programs designed to verify a particular class). Tracing through this list in the html or text output is rather tedious; a simple optimization that would help a lot would be to automatically skip over any level that has only one caller, merging it with the previous level.

- For the above reason, the default --depth of 3 is pretty useless in C++, and probably also (in my experience) in most C programs. It should probably be the same as the default for --num-callers. Speaking of which, --num-callers is suspiciously similar to --depth; maybe they should be merged.

- Is there any way to output data in a format readable by kcachegrind? The spacetime values are pretty comparable to simple time values, so kcachegrind's super-excellent visualization tools seem like the perfect way to display them. (There are obviously questions of things like "the memory usage by a function at a particular point in time", but these are less critical to me, at least, than simple overall spacetime usage.)

Anyway, off I go to continue looking for leaks :)

Have fun,

Avery
|
From: Josef W. <Jos...@gm...> - 2004-03-22 21:06:08
|
On Monday 22 March 2004 18:27, Avery Pennarun wrote:
> - For the above reason, the default --depth of 3 is pretty useless in C++,
> and probably also (in my experience) most C programs. It should
> probably be the same as the default for --num-callers. Speaking of which,
> --num-callers is suspiciously similar to --depth; maybe they should be
> merged.

--num-callers is for error generation. Merging a core option with a skin/tool option would confuse users.

> - Is there any way to output data in a format readable by kcachegrind?
> The spacetime values are pretty comparable to simple time values, so
> kcachegrind's super-excellent visualization tools seem like the perfect
> way to display them. (There are obviously questions of things like
> "the memory usage by a function at a particular point in time", but
> these are less critical to me, at least, than simple overall spacetime
> usage.)

As you said: KCachegrind can only show one "consensus" of massif at once (or: from the beginning to a given point in time).

Another important issue: KCachegrind still isn't able to show different call contexts of a function. But this is on my TODO list...

Josef
|
From: Avery P. <ape...@ni...> - 2004-03-22 20:04:09
|
On Mon, Mar 22, 2004 at 06:54:58PM +0100, Josef Weidendorfer wrote:
> > - Is there any way to output data in a format readable by kcachegrind?
> > The spacetime values are pretty comparable to simple time values, so
> > kcachegrind's super-excellent visualization tools seem like the perfect
> > way to display them. (There are obviously questions of things like
> > "the memory usage by a function at a particular point in time", but
> > these are less critical to me, at least, than simple overall spacetime
> > usage.)
>
> As you said: KCachegrind can only show one "consensus" of massif at once
> (or: from beginning to a given point in time).
Right - but that doesn't bother me. Just the final numbers are fine for
now.
> Another important issue: KCachegrind still isn't able to show different
> call contexts of a function. But this is on my TODO list...
I'm not sure I understand. If a calls c, and b calls c, and c allocates
memory, then I think the memory should be accounted in exactly the same way
as kcachegrind accounts CPU time. That is, memory allocated by c, when
called from a, should be accounted to both c ("self") *and* a ("incl.").
Then we can draw it in *exactly* the way kcachegrind already draws all its
usage graphs. Right?
Thanks,
Avery
|
|
From: Josef W. <Jos...@gm...> - 2004-03-23 12:00:31
|
On Monday 22 March 2004 21:03, Avery Pennarun wrote:
> On Mon, Mar 22, 2004 at 06:54:58PM +0100, Josef Weidendorfer wrote:
> > Another important issue: KCachegrind still isn't able to show different
> > call contexts of a function. But this is on my TODO list...
>
> I'm not sure I understand. If a calls c, and b calls c, and c allocates
> memory, then I think the memory should be accounted in exactly the same way
> as kcachegrind accounts CPU time. That is, memory allocated by c, when
> called from a, should be accounted to both c ("self") *and* a ("incl.").
> Then we can draw it in *exactly* the way kcachegrind already draws all its
> usage graphs. Right?
Oh, yes.
I think massif provides "self costs" for allocations, attributed to allocation
contexts, which include some call chain information. In contrast, KCachegrind
needs inclusive costs provided by the tool. So the conversion would have to
calculate inclusive costs from the self costs of the contexts.
So if we write a'c for the context "c called from a", massif would provide,
for example:
Type "Bytes Allocated": for a'c: 1000, for b'c: 20000
The conversion should produce data:
Self cost of c: 21000, cost of call from a to c: 1000, from b to c: 20000.
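That conversion step can be sketched in a few lines (a hedged illustration only; the context chains and numbers are invented to match the a'c/b'c example above, not real massif output):

```python
# Derive per-function self costs, inclusive costs, and per-call-arc costs
# from per-context allocation totals, as in the a'c / b'c example above.
contexts = {
    ("main", "a", "c"): 1000,    # bytes allocated in c when called via main -> a
    ("main", "b", "c"): 20000,   # bytes allocated in c when called via main -> b
}

self_cost = {}   # allocations charged to the innermost function only
incl_cost = {}   # allocations charged to every function on the chain
call_cost = {}   # inclusive cost attributed to each caller -> callee arc

for chain, nbytes in contexts.items():
    leaf = chain[-1]
    self_cost[leaf] = self_cost.get(leaf, 0) + nbytes
    for fn in chain:
        incl_cost[fn] = incl_cost.get(fn, 0) + nbytes
    for caller, callee in zip(chain, chain[1:]):
        call_cost[(caller, callee)] = call_cost.get((caller, callee), 0) + nbytes

print(self_cost)                 # {'c': 21000}
print(call_cost[("a", "c")])     # 1000
print(call_cost[("b", "c")])     # 20000
```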
Another thing: doesn't massif provide the difference "Allocated" -
"Deallocated"? I'm not quite sure how to handle this.
Josef
>
> Thanks,
>
> Avery
|
|
From: Avery P. <ape...@ni...> - 2004-03-23 13:54:55
|
On Tue, Mar 23, 2004 at 01:00:18PM +0100, Josef Weidendorfer wrote:
> So the conversion would have to calculate inclusive costs from the self
> cost of contexts.
Sounds easy enough...
> Another thing: doesn't massif provide the difference "Allocated" -
> "Deallocated"? I'm not quite sure how to handle this.
Well, it actually provides the continuous sum of allocated-deallocated over
the runtime of the program. But that doesn't matter: the point is it
provides just one major number per context ("the spacetime"), which we ought
to be able to view in kcachegrind like the other numbers (cache misses, CPU
cycles, etc).
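As an aside, that running sum can be illustrated with a toy calculation (all numbers here are invented, just to show what "continuous sum of allocated-deallocated" means):

```python
# Toy illustration of "spacetime": the running sum of live bytes over the
# program's runtime, sampled at unit time steps.
deltas = {0: +1000, 2: +20000, 5: -1000}   # time -> change in allocated bytes
end_time = 10

live = 0
spacetime = 0
for t in range(end_time):
    live += deltas.get(t, 0)   # allocated minus deallocated so far
    spacetime += live          # accumulate bytes * one time unit

print(spacetime)  # 165000 for these made-up numbers
```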
Have fun,
Avery
|
|
From: Josef W. <Jos...@gm...> - 2004-03-24 12:41:41
|
On Tuesday 23 March 2004 14:54, Avery Pennarun wrote:
> On Tue, Mar 23, 2004 at 01:00:18PM +0100, Josef Weidendorfer wrote:
> > So the conversion would have to calculate inclusive costs from the self
> > cost of contexts.
>
> Sounds easy enough...
>
> > Another thing: doesn't massif provide the difference "Allocated" -
> > "Deallocated"? I'm not quite sure how to handle this.
>
> Well, it actually provides the continuous sum of allocated-deallocated over
> the runtime of the program. But that doesn't matter: the point is it
> provides just one major number per context ("the spacetime"), which we
> ought to be able to view in kcachegrind like the other numbers (cache
> misses, CPU cycles, etc).
You are right. It's only a matter of writing a conversion script from massif
data (raw/ASCII ?) to cachegrind/calltree format.
KCachegrind should be able to cope with multiple cachegrind/calltree files
simply concatenated together (for multiple censuses). The format is described
in the HTML documentation in the calltree package.
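A minimal emitter for such a conversion might look like the following sketch. The event name "SpaceTime", the call counts of 1, and the zero line positions are assumptions on my part (massif contexts carry no line numbers); the HTML documentation in the calltree package has the authoritative details:

```python
import io

def write_callgrind(out, self_cost, call_cost):
    # Sketch of a cachegrind/calltree-format emitter: a header naming the
    # event, then per-function self-cost lines and call records (cfn=/calls=)
    # carrying the inclusive cost of each caller -> callee arc.
    out.write("events: SpaceTime\n\n")
    fns = sorted(set(self_cost) | {f for pair in call_cost for f in pair})
    for fn in fns:
        out.write("fn=%s\n" % fn)
        out.write("0 %d\n" % self_cost.get(fn, 0))
        for (caller, callee), cost in sorted(call_cost.items()):
            if caller == fn:
                out.write("cfn=%s\n" % callee)
                out.write("calls=1 0\n")
                out.write("0 %d\n" % cost)
        out.write("\n")

buf = io.StringIO()
write_callgrind(buf, {"c": 21000}, {("a", "c"): 1000, ("b", "c"): 20000})
print(buf.getvalue())
```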
Any volunteers?
Cheers,
Josef
>
> Have fun,
>
> Avery
|
|
From: Nicholas N. <nj...@ca...> - 2004-03-25 08:43:23
|
On Wed, 24 Mar 2004, Josef Weidendorfer wrote:
> You are right. It's only a matter of writing a conversion script from massif
> data (raw/ASCII ?) to cachegrind/calltree format.
> KCachegrind should be able to cope with multiple cachegrind/calltree simple
> concatenated together (for multiple consensi). The format is described in a
> HTML documentation in the calltree package.
> Any volunteers?
I'm not sure about this. With (K)Cachegrind, you have a figure for every
instruction in the program, and these get mapped onto individual lines.
For Massif, you don't really have a spacetime figure for every
instruction. I think you'd only be annotating lines containing a function
call that allocates memory, or that (eventually) calls a function that
allocates memory. E.g., in this (nonsense) program:
int* f(void)
{
    return malloc(1000);   // 11 * 1000 * t
}

int* g(void)
{
    bar();
    for (i = 0; i < 10; i++)
        f();               // 10 * 1000 * t
    return f();            // 1 * 1000 * t
}

int* h(void)
{
    foo();
    x = y + z;
    return g();            // 11 * 1000 * t
}
(where 't' is the length of time all the blocks are in existence for).
The comments show where the annotations would go; I'm assuming h() is
only called once.
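Those annotations can be sanity-checked with a little arithmetic (taking t as one time unit, and assuming, as above, that h() runs once and every malloc'd block stays live for the same duration):

```python
# Check the per-call-site totals in the example above.
bytes_per_alloc = 1000
t = 1                                   # one time unit

site_loop = 10 * bytes_per_alloc * t    # f() called 10 times from g's loop
site_return = 1 * bytes_per_alloc * t   # f() called once from g's return
site_g = site_loop + site_return        # all of g's allocations, seen from h

print(site_g)  # 11000, i.e. the "11 * 1000 * t" on g's call site in h
```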
Is that right?
N
|
|
From: Josef W. <Jos...@gm...> - 2004-03-25 09:55:55
|
On Thursday 25 March 2004 09:43, Nicholas Nethercote wrote:
> On Wed, 24 Mar 2004, Josef Weidendorfer wrote:
> > You are right. It's only a matter of writing a conversion script from
> > massif data (raw/ASCII ?) to cachegrind/calltree format.
> > KCachegrind should be able to cope with multiple cachegrind/calltree
> > simple concatenated together (for multiple consensi). The format is
> > described in a HTML documentation in the calltree package.
> > Any volunteers?
>
> I'm not sure about this. With (K)Cachegrind, you have a figure for every
> instruction in the program, and these get mapped onto individual lines.
> For Massif, you don't really have a spacetime figure for every
> instruction. I think you'd only be annotating lines containing a function
> call that allocates memory, or that (eventually) calls a function that
> allocated memory? Eg, in this (nonsense) program:
Yes.
Source annotation does not make much sense in this case.
If you have allocation contexts, you can construct some kind of call graph out
of them and calculate inclusive costs during the conversion.
I'm not really sure if the result is worth it, but it looks easy to do.
> int* f(void)
> {
> return malloc(1000); // 11 * 1000 * t
> }
>
> int* g(void)
> {
> bar();
> for (i = 0; i < 10; i++)
> f(); // 10 * 1000 * t
Here we would add: "Call to f" with an inclusive cost of 10 * 1000.
> return f(); // 1 * 1000 * t
Here we would add: "Call to f" with an inclusive cost of 1 * 1000.
> }
>
> int* h(void)
> {
> foo();
> x = y + z;
> return g(); // 11 * 1000 * t
> }
>
>
> (where 't' is the length of time all the blocks are in existence for).
> The comments show where the annotations would go; I'm assuming h() is
> only called once.
>
> Is that right?
Yes. The numbers from the contexts would be present in the inclusive costs.
One thing I'm not sure about: would the locations of the call sites be
available for annotation purposes?
On a side note:
As I managed to run OProfile on my Pentium-M notebook, I'm more interested in
an import/conversion of the sample data from OProfile.
My vision is to take call graph data from Calltree for one run of a program,
and add estimates of inclusive costs to OProfile's sampling results for the
same (deterministic) program, assuming the same calls happen.
This would be a way to get inclusive costs for all kinds of performance
metrics on all architectures where OProfile can run (AMD64, PowerPC, IA-64),
with no instrumentation and almost no measuring overhead. Of course this
assumes that the call relationships are roughly the same for the same program
across different compilers/platforms.
BTW, has somebody used OProfile to profile valgrind?
If somebody is interested, I could provide some measurements.
Josef
|