From: divya a. <div...@ya...> - 2006-01-25 14:48:59
|
Hi,

I am a Ph.D. student and, as part of my research, I augmented callgrind to do data-structure-level profiling, i.e. the tool outputs the data structures accessed by each user function in the running program. It outputs the following information about these accesses:

- whether the data structure is a global, stack, or heap variable
- size
- for a stack variable: the function in whose stack the variable is located + offset from stack start
- for a heap variable: address + function that performed the allocation

I used this tool to evaluate the amount of data transfer between different functions if they were executed on, say, different processors. I was wondering if there has been talk about such a tool and if it would be useful to other people. Of course, the code is not release quality yet, but I can work on it.

Thanx,
Divya
|
From: Josef W. <Jos...@gm...> - 2006-01-25 18:40:17
|
Hi Divya,

On Wednesday 25 January 2006 15:48, divya arora wrote:
> Hi,
> I am a Ph. D. student and as part of my research, I
> augmented callgrind to do data structure level
> profiling i.e. the tool outputs the data structures
> accesses by each user function in the running
> program.

Wow. What is the slowdown of your approach?
Do you read debug info to get the names of global/local vars?

> It outputs the
> following information about these accesses
> - whether the data structure is global/stack
> variable/heap variable
> - size
> - for stack variable: function in whose stack the
> variable is located +
> offset from stack start
> - for heap variable: address + function that performed
> the allocation

I suppose that relating an access to a stack backtrace, instead of only to the function, could be important (similar to the data from massif).

> I used this tool to evaluate the amount of data
> transfer between different functions if they were
> executed on , say, different processors. I was
> wondering if there has been a talk about such a tool
> and if it would be useful to other people.

At least it is very interesting for cache simulation, e.g. to locate the data structures which generate most of the L2 misses.

> Of course,
> the code is not release quality yet but I can work on
> it.

I would be very interested in it. I started something like this in src/data.c, but work stalled because I wanted variable names to be printed, and wanted to group counts for variables of the same type so as not to produce huge amounts of data.

Does your tool actually work with large apps, e.g. Konqueror?

Josef

> Thanx,
> Divya
|
From: Julian S. <js...@ac...> - 2006-01-25 21:40:17
|
> I used this tool to evaluate the amount of data
> transfer between different functions if they were
> executed on , say, different processors. I was
> wondering if there has been a talk about such a tool
> and if it would be useful to other people. Of course,
> the code is not release quality yet but I can work on
This sounds similar to a tool discussed a couple of years
back; it came to be referred to as Bandsaw. The idea is
essentially the same, except that we thought it would be valuable
for knowing the amount of data transferred even within the
same process. I think Jeremy made a partial prototype
implementation at one stage, but it went no further.
So I'd be interested to see some example results from your
tool. Can you send some? I'd be interested to see if a
tool along these lines can be made to work well, since if
it could, it would be useful.
J
----------
Some comments from long ago:
[...]
All these things gave me an idea. I can't easily show you where
you made redundant copies, but I can easily enough show you how
data flows from place to place in your program through memory.
Consider this:
    int a[];
    register int sum;

    line A:  for (i = 0; i < 10*1000*1000; i++)
                a[i] = ... whatever ...

    (in some other place in the code)

    line B:  for (i = 0; i < 10*1000*1000; i++)
                sum += a[i];
Whenever data is written to memory, the tool writes into shadow memory
the program location doing the write. When data is read, the tool
knows the location doing the read, and by inspecting shadow memory it
knows who put that data there. It can therefore increment a counter
tracking the volume of data transferred across this (source, destination)
pair. For the above fragment, assuming it was executed once, it
would say
40 MB transferred from (file.c line A) to (file.c line B)
Imagine now doing this across a whole program, and having a GUI
tool like kcachegrind to show the results. The tool shows each line
of code and graphically illustrates (possibly using width of arrows
or something) the locations to which this line sent data, or received
data, and the quantity of data transferred.
In short, the tool considers memory as a channel for transferring
data between different parts of your program, and shows exactly
how you are using that (large but finite) bandwidth.
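
The shadow-memory bookkeeping described above can be sketched in a few lines of C. This is only an illustration of the idea, not code from Bandsaw, callgrind, or Valgrind: the names `note_write`, `note_read`, `run_example`, the `LOC_*` ids, and the small array size `N` are all hypothetical, and the instrumentation calls are written by hand here, whereas a real tool would insert them automatically at every load and store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ids for program locations; a real tool would use
   (file, line) or instruction addresses instead. */
enum { LOC_NONE = 0, LOC_A = 1, LOC_B = 2, NUM_LOCS = 3 };

/* Kept small for the sketch; the thread's example uses 10*1000*1000. */
enum { N = 1000 };

static int a[N];

/* Shadow memory: for each byte of a[], the location that last wrote it. */
static uint8_t shadow[N * sizeof(int)];

/* transferred[src][dst] = bytes written at src and later read at dst. */
static uint64_t transferred[NUM_LOCS][NUM_LOCS];

/* Record that `loc` wrote `size` bytes starting at `addr` (inside a[]). */
static void note_write(const void *addr, size_t size, uint8_t loc)
{
    size_t off = (size_t)((const char *)addr - (const char *)a);
    assert(off + size <= sizeof shadow);
    for (size_t k = 0; k < size; k++)
        shadow[off + k] = loc;
}

/* Record that `loc` read `size` bytes at `addr`: look up who put each
   byte there and charge it to the (writer, reader) pair. */
static void note_read(const void *addr, size_t size, uint8_t loc)
{
    size_t off = (size_t)((const char *)addr - (const char *)a);
    assert(off + size <= sizeof shadow);
    for (size_t k = 0; k < size; k++)
        transferred[shadow[off + k]][loc]++;
}

/* Runs the two loops from the example with hand-written instrumentation
   and returns the number of bytes that flowed from line A to line B. */
uint64_t run_example(void)
{
    long sum = 0;

    /* Start from a clean state so repeated calls give the same answer. */
    memset(shadow, LOC_NONE, sizeof shadow);
    memset(transferred, 0, sizeof transferred);

    for (int i = 0; i < N; i++) {          /* line A */
        a[i] = i;
        note_write(&a[i], sizeof a[i], LOC_A);
    }

    for (int i = 0; i < N; i++) {          /* line B */
        note_read(&a[i], sizeof a[i], LOC_B);
        sum += a[i];
    }

    (void)sum;
    return transferred[LOC_A][LOC_B];
}
```

Calling `run_example()` returns `N * sizeof(int)` bytes for the pair (line A, line B); with the thread's array size and 4-byte ints that is the 40 MB figure quoted above. Tracking one writer id per byte is the simplest choice for a sketch; a real tool would probably need a more compact shadow encoding to keep the memory overhead tolerable.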