From: Nicholas N. <nj...@cs...> - 2005-11-01 16:26:48
On Sun, 30 Oct 2005, Julian Seward wrote:
>>>> - To count the number of guest instructions, I count the number of
>>>> Ist_IMark statements executed. Is this the correct approach?
>
> Yes, although you can (optionally) use a different strategy which is
> cheaper. Rather than call add_one_guest_instr each time an IMark
> is passed, make the instrumentation loop increment a counter when
> it passes an IMark. Then, either at the end of the BB or when you
> get to an Ist_Exit, call a (new) fn add_N_guest_instrs and pass it
> the counter. Then reset the counter to zero. In other words,
> call the instruction-counting function once for each piece of
> straight-line code. Cachegrind uses a similar strategy.

Why not just increment the real global counter in-line, as opposed to
incrementing a temporary counter and periodically adding it to the global
counter?
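
For reference, here is a rough sketch of the two helper-call strategies
Julian describes. It is not real tool code: the statement tags, the
run_* functions and the n_helper_calls bookkeeping are hypothetical
stand-ins for the VEX IR and for Lackey's instrumentation loop, and the
"run" simply walks a block as if no side exit were taken. Only the names
add_one_guest_instr and add_N_guest_instrs come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for VEX IR statement tags. */
typedef enum { STMT_IMARK, STMT_EXIT, STMT_OTHER } StmtTag;

static unsigned long n_guest_instrs = 0;  /* the real global counter      */
static unsigned long n_helper_calls = 0;  /* how many C helper calls ran  */

/* Strategy A helper: called once per IMark. */
static void add_one_guest_instr(void)
{
    n_guest_instrs++;
    n_helper_calls++;
}

/* Strategy B helper: called once per piece of straight-line code. */
static void add_N_guest_instrs(unsigned long n)
{
    n_guest_instrs += n;
    n_helper_calls++;
}

/* Strategy A: one helper call for every IMark in the block. */
static void run_per_imark(const StmtTag *stmts, size_t n)
{
    assert(stmts != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (stmts[i] == STMT_IMARK)
            add_one_guest_instr();
}

/* Strategy B: count IMarks in a temporary, flush it at each exit
   and once more at the end of the block, then reset it to zero. */
static void run_batched(const StmtTag *stmts, size_t n)
{
    unsigned long pending = 0;  /* IMarks seen since the last flush */

    assert(stmts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (stmts[i] == STMT_IMARK) {
            pending++;
        } else if (stmts[i] == STMT_EXIT && pending > 0) {
            add_N_guest_instrs(pending);  /* flush before the possible exit */
            pending = 0;
        }
    }
    if (pending > 0)
        add_N_guest_instrs(pending);      /* flush at the end of the block */
}
```

Both functions leave the same total in n_guest_instrs; the batched one
just makes fewer helper calls when a block holds several instructions
between exits.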
> Depending on what Nick thinks, and your hacking enthusiasm, there is
> something that would make Lackey more useful whilst still being a
> nice simple demo of how to make a tool. That is, generate counts
> for all the following events:
>
> guest instructions
> conditional branches (split into: taken, not taken)
> loads (split into: integer, FP, 64-bit SIMD, 128-bit SIMD)
> stores (split into: integer, FP, 64-bit SIMD, 128-bit SIMD)
> alu ops (split into: integer, FP, 64-bit SIMD, 128-bit SIMD)
>
> Someone on the users list asked for something like this just the
> other day (Christian Stimming, "Fast profiling in valgrind?", 25 Oct).
> Personally I think it'd be a valuable addition.
>
> Not hard to do either: for stores, examine Ist_Store, and use
> typeOfIRExpr(bb->tyenv, st->Ist.Store.data) to get the store type.
> For loads and ALU ops, you only need to look at Ist_Tmp cases
> where the Ist.Tmp.data is either Iex_Load or Iex_{Unop,Binop}.
> All statements you will ever encounter will satisfy isFlatIRStmt
> which essentially constrains them to being flat SSA-style.

I'm happy for Lackey to change. People very often ask how to get the
stream of memory accesses made by a program; I wrote "Dullard" (see
http://www.valgrind.org/downloads/variants.html?njn) to do this, but it's
based on Valgrind 2.1.2 and so works with UCode.

It would be great if Lackey could give the memory accesses as well as the
info Julian suggested above, so it would serve as a much better example
tool. Ideally the different bits of functionality (getting instruction
counts, getting memory access traces) would be clearly delineated so that
people could easily chop out the bits they don't need... perhaps having
various options like --trace-mem-accesses, --do-instr-counts, etc., would
make this obvious.
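
As a rough illustration of keeping the features separate behind options,
here is a small self-contained sketch. It deliberately does not use
Valgrind's own command-line helpers; ToolOptions, parse_bool_option and
process_option are hypothetical names, and the "=yes|no" form is an
assumption about how such flags would be spelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-feature switches; each guards one bit of functionality. */
typedef struct {
    bool trace_mem_accesses;
    bool do_instr_counts;
} ToolOptions;

/* Parse a single "name=yes" or "name=no" argument.
   Returns true and sets *out only if arg matches name exactly. */
static bool parse_bool_option(const char *arg, const char *name, bool *out)
{
    assert(arg != NULL && name != NULL && out != NULL);

    size_t len = strlen(name);
    if (strncmp(arg, name, len) != 0 || arg[len] != '=')
        return false;

    const char *value = arg + len + 1;
    if (strcmp(value, "yes") == 0) { *out = true;  return true; }
    if (strcmp(value, "no")  == 0) { *out = false; return true; }
    return false;  /* recognised name, but unusable value */
}

/* Try each known option in turn; returns false for anything unrecognised. */
static bool process_option(ToolOptions *opts, const char *arg)
{
    return parse_bool_option(arg, "--trace-mem-accesses",
                             &opts->trace_mem_accesses)
        || parse_bool_option(arg, "--do-instr-counts",
                             &opts->do_instr_counts);
}
```

The instrumentation code for each feature would then sit behind a test of
the corresponding field, which makes it easy to see what to delete.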

As for efficiency, it might be best to keep things simple (e.g. one C
call per IMark) but have comments that briefly describe how things might
be done more efficiently.
Nick