From: John v. S. <jc...@cs...> - 2007-05-10 12:08:47
WARNING: This is going to be a large bunch of text ;-)
Josef Weidendorfer wrote:
> Interesting. Callgrind does not distinguish jump kinds, and lackey only
> counts conditional jumps.
>
> However, Julian just added a simple branch prediction simulation into cachegrind
> (see SVN), and for distinguishing direct/indirect jumps, he does something
> similar as you, but does not disable VEX optimizations.
I'm really interested in this! See below for why I found it necessary to
disable constant propagation.
>
> What are the scenarios where disabling this is necessary?
>
While developing this I used a simple test case: a hand-written assembler
function, so that I could artificially create indirect calls and jumps.
The ASM file is directly below (NASM syntax):
segment .text
global f
global g
f:
    push rbp
    mov rbp, rsp
    mov r11, g
    call (r11)
    leave
    ret
g:
    push rbp
    mov rbp, rsp
    jmp b
a:
    leave
    ret
b:
    mov r11, 0x500000
    add r11, 0x123456
    mov r11, a
    jmp (r11)
I run valgrind on this:
valgrind --tool=lackey --trace-flags=11000000 --trace-notbelow=0
--vex-guest-chase-thresh=0 ./asm 2>asm.out
The relevant part of the valgrind output:
*Conversion of f to IR (up to the call instruction):*
0x400450: pushq %rbp
------ IMark(0x400450, 1) ------
t0 = GET:I64(40)
t1 = Sub64(GET:I64(32),0x8:I64)
PUT(32) = t1
STle(t1) = t0
0x400451: movq %rsp,%rbp
------ IMark(0x400451, 3) ------
PUT(168) = 0x400451:I64
PUT(40) = GET:I64(32)
0x400454: movq $4195424, %r11
------ IMark(0x400454, 7) ------
PUT(168) = 0x400454:I64
PUT(88) = 0x400460:I64
0x40045B: call* %r11
------ IMark(0x40045B, 3) ------
PUT(168) = 0x40045B:I64
t2 = GET:I32(88)
t3 = GET:I64(88)
t4 = Sub64(GET:I64(32),0x8:I64)
PUT(32) = t4
STle(t4) = 0x40045E:I64
====== AbiHint(Sub64(t4,0x80:I64), 128) ======
goto {Call} t3
*IR of f after the first optimization (up to the call instruction):*
------ IMark(0x400450, 1) ------
t0 = GET:I64(40)
t6 = GET:I64(32)
t5 = Sub64(t6,0x8:I64)
PUT(32) = t5
STle(t5) = t0
------ IMark(0x400451, 3) ------
PUT(40) = t5
------ IMark(0x400454, 7) ------
PUT(88) = 0x400460:I64
------ IMark(0x40045B, 3) ------
PUT(168) = 0x40045B:I64
IR-NoOp
t8 = Sub64(t5,0x8:I64)
PUT(32) = t8
STle(t8) = 0x40045E:I64
t10 = Sub64(t8,0x80:I64)
====== AbiHint(t10, 128) ======
goto {Call} 0x400460:I64
So after the first optimization pass, the expression exprIsConst(
bbOut->next ) is true for this BB, and thus my code thinks it is a
direct call.
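To make the effect concrete, here is a minimal sketch in C. It does not use the real libVEX types: Expr, TmpInfo, classify_next() and propagate() are hypothetical stand-ins invented for illustration, and the toy propagates a constant through a temporary, whereas in the trace above the constant reaches t3 via the guest register at offset 88. It only shows how a "next is constant" test flips from indirect to direct once a constant has been propagated into the jump target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for IR expressions; NOT the real libVEX types. */
typedef enum { EXPR_CONST, EXPR_TMP } ExprTag;

typedef struct {
    ExprTag tag;
    unsigned long value;   /* meaningful when tag == EXPR_CONST */
    int tmp;               /* meaningful when tag == EXPR_TMP   */
} Expr;

typedef enum { JUMP_DIRECT, JUMP_INDIRECT } JumpClass;

/* Same idea as testing exprIsConst( bbOut->next ): a constant
   target is taken to mean "direct", anything else "indirect". */
static JumpClass classify_next(const Expr *next)
{
    assert(next != NULL);
    return next->tag == EXPR_CONST ? JUMP_DIRECT : JUMP_INDIRECT;
}

/* What the toy optimizer knows about each temporary. */
typedef struct {
    int known;             /* nonzero if the temporary holds a constant */
    unsigned long value;
} TmpInfo;

/* Toy constant propagation: replace a temporary by its known value. */
static Expr propagate(Expr e, const TmpInfo *env, size_t n_tmps)
{
    if (e.tag == EXPR_TMP && e.tmp >= 0 &&
        (size_t)e.tmp < n_tmps && env[e.tmp].known) {
        Expr folded = { EXPR_CONST, env[e.tmp].value, -1 };
        return folded;
    }
    return e;
}
```

With t3 modelled as a temporary known to hold 0x400460, classify_next() reports an indirect jump before propagate() and a direct one afterwards, which is the misclassification described above.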
But I have just tried it with a more realistic example (function
pointers in C):
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

int add4660( int x );
int higherOrder( int (*incrFunc) (int), int param );

int main( int argc, char** argv ) {
    printf( "Result higherOrder : %d\n", higherOrder( &add4660, 1) );
    return 0;
}

int higherOrder( int (*incrFunc) (int), int param ) {
    return (*incrFunc)(param) + add4660(param);
}

int add4660( int x ) {
    return x + 0x1234;
}
In that case the target of the call (*incrFunc)(param) is not a constant
expression, even if I enable BB chasing, which produces a BB running from
the start of main to the call through the function pointer.
So basically, it only goes 'wrong' for the artificial case.
> Did you also disable BB chasing? AFAIK, chasing can remove direct jumps in the
> guest instruction stream when converting to IR. So any numbers about direct
> jumps go wrong.
>
Yes, I did disable BB chasing, basically because I wanted every jump to
sit at the end of a BB, so that I only have to check that one place.
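As a rough illustration of why chasing matters for the counts, here is a toy model in C. It is not how VEX builds blocks: InsnKind, JumpCounts and count_block_end_jumps() are hypothetical names, every jump target is assumed to be simply the next instruction in the array, and chasing is assumed to follow every direct jump without any threshold or size limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical guest instruction kinds for the toy model. */
typedef enum { INSN_OTHER, INSN_DIRECT_JUMP, INSN_INDIRECT_JUMP } InsnKind;

typedef struct {
    size_t direct;     /* direct jumps seen as a block end   */
    size_t indirect;   /* indirect jumps seen as a block end */
} JumpCounts;

/* Count the jumps a tool would see if it only inspects block ends.
   With chase != 0 a direct jump is followed and the block continues,
   so that jump never shows up as a block end. Indirect jumps always
   end a block, because their target is not known at translation time. */
static JumpCounts count_block_end_jumps(const InsnKind *stream, size_t n,
                                        int chase)
{
    JumpCounts counts = { 0, 0 };
    assert(stream != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (stream[i] == INSN_INDIRECT_JUMP)
            counts.indirect++;
        else if (stream[i] == INSN_DIRECT_JUMP && !chase)
            counts.direct++;
    }
    return counts;
}
```

In this model the indirect count is the same either way, but the direct count drops to zero once chasing is on, which is why I pass --vex-guest-chase-thresh=0.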
>
> Perhaps it is worth mentioning here that for PPC, the decision whether
> a branch instruction actually is a call or return, is only a heuristic
> in VEX.
I'm not really familiar with platforms other than x86 and AMD64, so I
didn't know that. Is there a foolproof way to determine whether a branch
is a call or a return on PPC?
Either way, it is not really important for this tool. It is built to see
whether a certain compiler optimization would lead to fewer indirect
jumps/calls.
>
> Josef
Thanks for your comments!
-- John