|
From: Josef W. <Jos...@gm...> - 2007-06-06 15:30:55
|
On Wednesday 06 June 2007, Julian Seward wrote:
> > On Wednesday 06 June 2007 12:49, John Reiser wrote:
> > Josef Weidendorfer wrote:
> > > the PPC ISA has no explicit call/return instructions,
> > > so there have to be heuristics about what e.g. branch in a binary was
> > > meant to be a call.
>
> I don't think that's exactly true, but there are ambiguities to do
> with returns. See below.
Yes, this statement probably was quite provocative ;-)
On Wednesday 06 June 2007, John Reiser wrote:
> Julian Seward wrote:
>
> > The usual form of 'return' is
> >
> > branch-to-LR ("blr")
> >
> > Assuming the above is correct, I think the problem is
> >
> > branch-to-LR could also be any old computed goto (eg, a switch)
>
> True, just like the x86 "jmp *%ecx" is a computed goto of any kind:
> C-language 'switch', return from subroutine, hard-coded state machine,
> etc.
>
> > How to distinguish this case from a return?
The main problem for callgrind is that it only sees the intermediate
IR generated by VEX, which has annotations for the jump kind, like
"this jump is a return instruction". This information is reliable for
x86, but not for PPC, as in the example given by Julian.
Another thing is that I almost did not touch any of the heuristics
of callgrind for PPC; it is still "tuned" for x86.
It would be nice to incorporate heuristics into callgrind
which allow to reconstruct the call relations as given in the
original high level language (C/C++/Fortran/ADA/...), independent
of the ISA used for the binary.
> As compiled by gcc, the "branch to link regiser [or count register]"
> for nearly all C-language 'switch' statements is preceded by a
> bounds check of the switch value, and an indexed fetch from a table
> of addresses or offsets. This can be analyzed by a backwards scan
> of dataflow through instructions at preceding descending addresses.
> (Once in a while the scan must follow labels and jumps.)
> It is not particularly hard to do, but a new release of gcc may
> use a different style that an old analyzer does not recognize.
> The style also varies with position-independent code (-fPIC),
> optimization level (-O2), complexity of nearby expressions, etc.
>
> The C-language runtime support library glibc does have some hand code
> that uses computed GOTO which does not have this bounds check, because
> other code checks it implicitly; "AND immediate with 7" guarantees
> 0<=value<=7. In theory sometimes gcc could do a range analysis
> of the index, then subsume the test, but I have not seen it.
Perhaps the best heuristic for deciding whether a computed goto
actually is a return, is to look at debug info (same vs. different
function/ELF object) and equality of target to an expected return
address on callgrinds shadow stack.
> Note that x86 has the same generic difficulty. "jmp *%ecx"
> could be a switch or a return, and a 'ret' _also_ could be a switch
> or a return. Both are "merely" a computed GOTO. Stylistically
> nearly every 'ret' is a subroutine return, and nearly every "jmp *reg"
> is not; but there are legitimate exceptions both ways.
In practice, callgrind currently is working really well with x86
(and x86-64) code.
Using a "ret" only as subroutine return is not only a stylistic
issue: on modern x86 processors, using a return stack for branch prediction
is quite common. Ie. it gives bad performance when you use a "ret" as
a computed goto, as you confuse the branch predictor.
Josef
|