RE: [Valgrind-users] cpu cycles

On Mon, 28 Jul 2003, Vincent Penquerc'h wrote:

> Yes, and it might be sensible to add it either to cachegrind
> or to a variant of it, since cache misses/hits will be needed
> for this to be accurate.

Don't give too much credit to Cachegrind;  what it says is only an
approximation.  For example, it doesn't even try to take into account
virtual->physical address mappings.  See
developer.kde.org/~sewardj/docs-1.9.5/cg_main.html#cg-top, section 4.12,
for a full list of its known shortcomings (NB: the one about custom
malloc() has not been a problem since v1.9.6).

> Branch prediction algorithms are known (well, they were for the
> Pentium, when I last did asm stuff). Stalls have well defined
> conditions (AGIs, etc, are predictable). So it would be doable.
> After all, Vtune does (did, at least) this.

Hmm.  Doable, maybe;  but very, very difficult.  Much harder than you
might think at first.

To count cycles you need to simulate pretty much everything: the whole
pipeline, caches, TLBs, all that stuff.  The SimpleScalar project
(www.simplescalar.com) does that.
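
To give a feel for how far a simple model is from real cycle counting,
here is a toy sketch (not Cachegrind's or SimpleScalar's code) of a
direct-mapped cache that only counts hits and misses, plus a naive cycle
estimate built on top of it.  The class name, the cache geometry and the
latency numbers are all made up for illustration;  the model ignores
the pipeline, TLBs, branch prediction and virtual->physical mappings,
which is exactly why its "cycles" figure is only a rough guess.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: one tag per line, reads only."""

    def __init__(self, num_lines=64, line_size=32):
        # Hypothetical geometry, not taken from any real CPU.
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # None marks an empty line
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Record one access; return True on a hit, False on a miss."""
        block = addr // self.line_size   # which memory block is touched
        index = block % self.num_lines   # which cache line it maps to
        tag = block // self.num_lines    # identifies the block in that line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag           # evict whatever was there
        self.misses += 1
        return False


def estimate_cycles(cache, base_cycles=1, miss_penalty=10):
    """Naive estimate: every access costs base_cycles, and every miss
    costs miss_penalty extra.  Both latencies are assumed values."""
    total_accesses = cache.hits + cache.misses
    return total_accesses * base_cycles + cache.misses * miss_penalty


if __name__ == "__main__":
    cache = DirectMappedCache(num_lines=4, line_size=16)
    for addr in (0, 4, 16, 64, 0):
        cache.access(addr)
    print(cache.hits, cache.misses, estimate_cycles(cache))
```

Addresses 0 and 64 map to the same line here, so the last access to 0
misses even though it was touched before.  A real machine would also
overlap misses with other work, which this model cannot express.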

N