From: Balaji Rs I. <bri...@un...> - 2003-12-08 19:45:20
hi,
I am trying to segregate the percentage of hits and misses attributable
to a particular address range. To be precise, I am running some benchmarks
on a Java virtual machine, e.g.:
valgrind --skin=cachegrind --I1=8192,8,64 --D1=8192,8,64 --L2=262144,8,64 rvm -verbose
I modified the cachesim_##L##_doref function and added a data structure
to get the miss and hit counts to specific areas in the heap.
The range of addresses I am interested in is the heap space maintained by
the Java VM. A -v command line flag in the VM tells me these address ranges,
and I use them for the conditional checks in the cachesim_##L##_doref
function, followed by the counter increments for misses and hits.
However, from the counter values and the output file generated, it is
apparent that none of the addresses passed to Cachegrind are in this
range. This should not be possible, since these addresses are being accessed
by the Java VM. This leads me to think that it is caused by address
translation being done by valgrind.
My questions are:
1. Could you explain where this address translation is done? I looked into
the detailed tech notes and went through the code in vg_memory.c and I
have some understanding, but I am not clear.
2. Given an address range A-B, is there a way to determine the range to
which this will be translated by valgrind?
Any additional pointers will be greatly appreciated.
thanks
best regards
-balaji.
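
[Editor's sketch] The per-range counting described above could look roughly like the following. This is a minimal illustration, not Cachegrind code: the names RangeCounter, range_init and range_record are hypothetical, and the heap bounds would be whatever the VM's -v flag reports. The idea is that cachesim_##L##_doref would call something like range_record once per reference, passing whether that reference missed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-range counters; none of these names exist in Cachegrind. */
typedef struct {
    uintptr_t start;             /* first address in the range (inclusive) */
    uintptr_t end;               /* one past the last address (exclusive)  */
    unsigned long long hits;
    unsigned long long misses;
} RangeCounter;

/* Set up a counter for the half-open range [start, end). */
void range_init(RangeCounter *r, uintptr_t start, uintptr_t end)
{
    assert(start < end);
    r->start  = start;
    r->end    = end;
    r->hits   = 0;
    r->misses = 0;
}

/* Record one access. Returns 1 if addr fell inside the range, 0 otherwise. */
int range_record(RangeCounter *r, uintptr_t addr, int is_miss)
{
    if (addr < r->start || addr >= r->end)
        return 0;                /* outside the range: leave counters alone */
    if (is_miss)
        r->misses++;
    else
        r->hits++;
    return 1;
}
```

If the return value is always 0 for every reference, the bounds being compared against are the first thing to check, for example by printing a few of the addresses that actually arrive.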
On Mon, 3 Nov 2003, Nicholas Nethercote wrote:
> On Mon, 3 Nov 2003, Balaji Iyengar wrote:
>
> > I am trying to modify Cachegrind so it tells me the percentage of hits and
> > misses attributable to a particular address range. I went through the code
> > in cg_sim_gen.c and cg_main.c and have the following questions.
> >
> > 1. #define CACHESIM(L, MISS_TREATMENT)
> > this macro is defined in cg_sim_gen.c and I am not able to determine
> > where it is used
>
> [~/grind/head5/cachegrind] grep CACHESIM *.c
> cg_sim_D1.c:CACHESIM(D1, { (*m1)++; cachesim_L2_doref(a, size, m1, m2); } );
> cg_sim_gen.c:#define CACHESIM(L, MISS_TREATMENT) \
> cg_sim_I1.c:CACHESIM(I1, { (*m1)++; cachesim_L2_doref(a, size, m1, m2); } );
> cg_sim_L2.c:CACHESIM(L2, (*m2)++ );
>
>
> > 2. the function: void cachesim_##L##_doref(Addr a, UChar size, ULong* m1, ULong *m2)
> > is defined as part of the above macro, how is the ##L## translated to _I1_ and _D1_
>
> As Tom said, it's macro concatenation.
>
> > 3. If I have to make cachegrind accept more command line arguments such as
> > the starting address and range, where would I make these changes.
>
> Look for variables named "clo_*" and "VG_(clo_*)" in the skins (eg.
> Cachegrind, Memcheck) for example usage. They're fairly straightforward.
>
> > 4. Also what would help is a general description of the sequence in which the
> > functions are called, because at this point I am totally at sea and don't
> > quite understand what is happening where in the code.
>
> Which functions?
>
> First of all, have you read the skin-writing guide? (Section 7 of the
> manual).
>
> Once you've done that, I would try modifying Cachegrind in very small
> steps. Start by just inserting printf's in strategic places to work out
> when different things occur.
>
> However, I think for what you want, all you need is to adjust the
> cachesim_##L##_doref function -- because you don't want to change how the
> addresses accessed are passed to the cache simulation, but rather what's
> done with those addresses within the simulation. You don't need to know
> how Cachegrind is finding those addresses. But I can understand that you
> might want to know how it does it.
>
> > 5. Could somebody highlight a brief set of steps that is required to do this
> > sort of thing (segregate cache misses/hits depending on the address space in
> > the memory )
>
> It seems there's two parts to your problem. The first is getting the
> stream of address reads and writes. The second is using them to simulate
> the cache. Segregating the memory falls under the second, and shouldn't
> be hard -- instead of having single global counters for hit and miss
> counts, have a series of them, one per address range.
>
> Hope this helps.
>
> N
>
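
[Editor's sketch] The "macro concatenation" mentioned in the quoted reply is the C preprocessor's ## token-pasting operator. The example below is a simplified stand-in, not the real CACHESIM macro (whose body lives in cg_sim_gen.c): DEFINE_DOREF and the demo_*_doref names are made up for illustration. Each use of the macro pastes its argument into the function name, which is how a single definition can yield separate I1 and D1 functions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for CACHESIM: one macro, stamped out per cache level. */
#define DEFINE_DOREF(L)                                          \
    void demo_##L##_doref(unsigned long long *misses)            \
    {                                                            \
        assert(misses != NULL);                                  \
        (*misses)++;   /* toy behaviour: count every call */     \
    }

DEFINE_DOREF(I1)   /* the preprocessor emits demo_I1_doref() */
DEFINE_DOREF(D1)   /* the preprocessor emits demo_D1_doref() */
```

Running the file through the preprocessor alone (for example with gcc -E) shows the expanded function definitions, which is a quick way to see what a macro like this generates.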
|
From: Jeremy F. <je...@go...> - 2003-12-08 21:49:10
On Mon, 2003-12-08 at 11:45, Balaji Rs Iyengar wrote:
> 1. Could you explain where this address translation is done. I looked into
> the detailed tech notes and went through the code in vg_memory.c and I
> have some understanding but I am not clear.

Valgrind does no data address translation. Memory at location X in your
client code will be at address X in the process itself. There is code
address translation, of course, but I assume you're talking about data.

J
|
From: Jeremy F. <je...@go...> - 2003-12-09 01:06:33
On Mon, 2003-12-08 at 13:55, Balaji Rs Iyengar wrote:
> hello,
>
> On Mon, 8 Dec 2003, Jeremy Fitzhardinge wrote:
>
> > On Mon, 2003-12-08 at 11:45, Balaji Rs Iyengar wrote:
> > > 1. Could you explain where this address translation is done. I looked into
> > > the detailed tech notes and went through the code in vg_memory.c and I
> > > have some understanding but I am not clear.
> >
> > Valgrind does no data address translation. Memory at location X in your
> > client code will be at address X in the process itself. There is code
> > address translation, of course, but I assume you're talking about data.
>
> I am referring to code addresses.

Sorry, I misunderstood. You want to look at the stuff in vg_transtab.c,
which manages the translation table. There's only a true mapping from
code addresses -> translated addresses at the start of each basic block.

That said, cachesim should operate entirely in terms of your original
code addresses rather than the translated addresses.

J
|
From: Balaji I. <bri...@nc...> - 2003-12-10 03:21:57
hi,
I am trying to run some benchmarks on a Java VM, with the Java VM being
run on valgrind (cachegrind). The cache statistics I get are exactly the
same for different benchmarks. This is the command I run:
#> valgrind --skin=cachegrind JVM bench_mark
These benchmarks are significantly different in behaviour. Does this mean
that cachegrind takes into account the execution of the JVM only and
doesn't take into account the execution of the benchmark by the JVM?
thanks
-Balaji.