From: Bill R. Jr. <bru...@te...> - 2003-10-30 22:53:52
Hi,
I am trying to optimize cache usage in several large applications
(that I didn't write), and I'd like to measure the "cache temperature"
of various fields in the principal data structures so that I can
1. Group together fields that are "hot".
2. Perhaps move the bulk of "cold" data into separately allocated
structures.
3. Compact some fields into Fortran-style arrays and use indexing.
I am curious as to whether there is any work-in-progress on a skin that
might aid this type of analysis, or whether anyone has any particular
design advice. Both valgrind and oprofile are good at identifying
particular bits of code that cause cache misses (from which one can
identify which field is being accessed, etc.), but the analysis becomes
more difficult when cache misses are spread more uniformly throughout
the code.
Thanks,
Bill Rugolsky
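
A minimal sketch of the "cache temperature" idea above, assuming a trace of cache-miss addresses is already available (for example from a simulator) and that records of one type sit contiguously in memory. The layout table, RECORD_SIZE, and the trace are made up for illustration; nothing here is an existing valgrind or oprofile interface.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical layout of one record type: (field name, byte offset, size).
LAYOUT = [
    ("id", 0, 4),
    ("flags", 4, 4),
    ("next", 8, 8),
    ("name", 16, 48),
    ("stats", 64, 64),
]
RECORD_SIZE = 128  # assumed size of one record in bytes


def field_at(offset, layout=LAYOUT):
    """Return the field covering a byte offset, or None if no field does."""
    starts = [start for _, start, _ in layout]
    i = bisect_right(starts, offset) - 1
    if i < 0:
        return None
    name, start, size = layout[i]
    return name if offset < start + size else None


def field_temperature(base, count, miss_addresses):
    """Count misses per field for `count` records stored contiguously at `base`."""
    heat = Counter()
    end = base + count * RECORD_SIZE
    for addr in miss_addresses:
        if base <= addr < end:
            heat[field_at((addr - base) % RECORD_SIZE)] += 1
    return heat


# Made-up trace: misses cluster on `id` and `next`, and hit `name` once.
base = 0x1000
trace = [base + r * RECORD_SIZE + off for r in range(4) for off in (0, 8)]
trace.append(base + 16)
heat = field_temperature(base, 4, trace)
for name, n in heat.most_common():
    print(name, n)
```

Fields with high counts are candidates for grouping together; fields that never show up are candidates for moving into a separately allocated "cold" structure.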
From: Josef W. <Jos...@gm...> - 2003-10-31 09:18:54
Hi,

I would like to integrate events related to data structures into my
calltree skin. I have had no time for this recently, but it is definitely
on my TODO list.

Perhaps we can work together to make this a reality?

The big win with cache simulation is that we have all the data addresses of
memory references. With oprofile, I don't think this is available (correct
me if I'm wrong). Still, P4 and Itanium would provide this (in a
statistical way) via "tagged events".

Things in mind for the skin:

* Reading type info and static vars from debug info to get from address to
  symbolic information. Jeremy has already written something like this for
  helgrind for the STABS format.
* Have a table for address -> data structure mapping. This has to be
  updated on mallocs/frees, and on stack frame enter/leave.
* Perhaps storing events even per data type offset is too much for arrays.
  I would relate them to address differences of successive references into
  the same array. This even includes some timing information, and the user
  should already be happy with information like e.g. "all L2 misses happen
  when going through the array with stride 100".
* Enhance the cachegrind.out format for this kind of information. I would
  like to have both code and data related info in the same file.
* Write a command line tool to output the information. This should always
  be done even if we have a GUI visualisation tool, as not everybody wants
  GUIs. (I recently got ct_annotate in the calltree package working again.)
* Make some graphical visualization. IMHO, this would fit perfectly into my
  KCachegrind tool (with a list of data structures side by side with the
  list of functions).

Regarding implementation: As we need to enhance the debug info reader
in the skin, we have to extend the skin API for this/put code into VG core.
I would prefer an even more modular approach than now in VG (to be) 2.0: Put
the debug info reader (currently in VG core) into its own shared lib which
can be replaced by skins.
In the end, we would have plugins which provide additional functionality which
can be used by other plugins or by skins (some kind of hierarchical
layering). The startup script would have to check for plugin versions to
decide which libs to put into LD_PRELOAD for a skin to be runnable.

Josef
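
As a rough illustration of two items from the list above (the address -> data structure table updated on mallocs/frees, and relating misses to address differences of successive references into the same array), here is a small sketch. The BlockMap class, its method names, and the labels are hypothetical; this is not the calltree or cachegrind implementation, and stack frames are left out.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


class BlockMap:
    """Maps addresses to live heap blocks and records per-label miss strides."""

    def __init__(self):
        self.starts = []                     # sorted start addresses of live blocks
        self.blocks = {}                     # start -> (size, label)
        self.last_addr = {}                  # start -> last missed address in block
        self.strides = defaultdict(Counter)  # label -> Counter of address deltas

    def on_malloc(self, start, size, label):
        """Register a new live block; `label` stands in for symbolic type info."""
        self.starts.insert(bisect_right(self.starts, start), start)
        self.blocks[start] = (size, label)

    def on_free(self, start):
        """Drop a block; its stride statistics are kept under its label."""
        self.starts.remove(start)
        del self.blocks[start]
        self.last_addr.pop(start, None)

    def lookup(self, addr):
        """Return the start of the live block containing addr, or None."""
        i = bisect_right(self.starts, addr) - 1
        if i < 0:
            return None
        start = self.starts[i]
        size, _ = self.blocks[start]
        return start if addr < start + size else None

    def on_miss(self, addr):
        """Record the delta to the previous miss that fell in the same block."""
        start = self.lookup(addr)
        if start is None:
            return
        label = self.blocks[start][1]
        prev = self.last_addr.get(start)
        if prev is not None:
            self.strides[label][addr - prev] += 1
        self.last_addr[start] = addr


# Made-up trace: walk a 4000-byte block in steps of 100 bytes.
m = BlockMap()
m.on_malloc(0x10000, 4000, "matrix")
for i in range(0, 4000, 100):
    m.on_miss(0x10000 + i)
m.on_miss(0x90000)  # outside any known block, ignored
print(m.strides["matrix"].most_common(1))
```

A report built from such a histogram is where a statement like "all misses happen when going through the array with stride 100" would come from.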
From: Nicholas N. <nj...@ca...> - 2003-10-31 09:32:49
On Fri, 31 Oct 2003, Josef Weidendorfer wrote:
> * Reading type info, static vars from debug info to get from address to
> symbolic information. Jeremy already has written something like this for
> helgrind for STABS format.
I've written some DWARF2 parsing code, although it's not in the
repository.
> Regarding implementation: As we need to enhance the debug info reader
> in the skin, we have to extend the skin API for this/put code into VG core.
> I would prefer an even more modular approach than now in VG (to be) 2.0: Put
> the debug info reader (currently in VG core) into its own shared lib which
> can be replaced by skins.
> In the end, we would have plugins which provide additional functionality which
> can be used by other plugins or by skins (some kind of hierarchical
> layering). The startup script would have to check for plugin versions to
> decide which libs to put into LD_PRELOAD for a skin to be runnable.
The original idea of the core/skin split had these middle layers, for
functions that are used by some but not all skins. In the end, everything
has been thrown into the core, because it's simpler, and saves having to
decide which bits should go where, and avoids all the fiddling with
version numbers.
Perhaps a compelling case can be made for a more modular approach, but
IMHO the everything-in-core approach has worked pretty well so far.
Relevant to this: there are plans afoot for a future Valgrind to not use
the LD_PRELOAD hack, but instead do all the loading itself ("full
virtualization"). One of the many advantages of this is that Valgrind and
skins could use all the standard system libraries, including any existing
debug-info-reading libraries.
N
From: Abhijit Menon-S. <am...@wi...> - 2003-10-31 11:43:24
At 2003-10-31 09:32:47 +0000, nj...@ca... wrote:
>
> I've written some DWARF2 parsing code, although it's not in the
> repository.

On a tangential note, <http://www.alephnull.com/backtrace.c> may prove
useful.

-- ams
From: Bill R. Jr. <bru...@te...> - 2003-11-03 16:19:48
On Fri, Oct 31, 2003 at 10:18:41AM +0100, Josef Weidendorfer wrote:
> I would like to integrate events related to data structures in my
> calltree skin. I recently had no time for this, but it is definitely on
> my TODO.
>
> Perhaps we can work together to make this a reality?

I too am swamped with other things; it will be a few months at least
before I can devote any real time to this, though I may get a chance
to play with modifying the cachegrind output format and running some
data through some stat tools like R.

I have long-term profiling needs that would be well-served by this type
of skin, so I'll try to allocate time in my work schedule. I've seen an
app with soft-realtime requirements spend 80+% of its time waiting for
cache-misses; a little guesswork more than halved that overhead.

Regards,

Bill Rugolsky
From: Nicholas N. <nj...@ca...> - 2003-11-01 11:35:39
On Thu, 30 Oct 2003, Bill Rugolsky Jr. wrote:
> I am trying to optimize cache usage in several large applications
> (that I didn't write), and I'd like to measure the "cache temperature"
> of various fields in the principal data structures so that I can
>
> 1. Group together fields that are "hot".
> 2. Perhaps move the bulk of "cold" data into separately allocated
> structures.
> 3. Compact some fields into Fortran-style arrays and use indexing.
>
> I am curious as to whether there is any work-in-progress on a skin that
> might aid this type of analysis,

Not that I'm aware of.

> or whether anyone has any particular design advice.

It doesn't sound easy :)

N