From: John R.
Julian Seward wrote:
> The usual form of 'return' is
>
> branch-to-LR ("blr")
>
> Assuming the above is correct, I think the problem is
>
> branch-to-LR could also be any old computed goto (eg, a switch)
True, just like the x86 "jmp *%ecx" is a computed goto of any kind:
C-language 'switch', return from subroutine, hard-coded state machine,
etc.
> How to distinguish this case from a return?
As compiled by gcc, the "branch to link register [or count register]"
for nearly all C-language 'switch' statements is preceded by a
bounds check of the switch value, and an indexed fetch from a table
of addresses or offsets. This can be analyzed by a backwards dataflow
scan through the preceding instructions, at descending addresses.
(Once in a while the scan must follow labels and jumps.)
It is not particularly hard to do, but a new release of gcc may
use a different style that an old analyzer does not recognize.
The style also varies with position-independent code (-fPIC),
optimization level (-O2), complexity of nearby expressions, etc.
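
As a rough illustration of the backwards scan described above, here is a
minimal sketch in C. It assumes the instructions have already been decoded
into a simplified form; the 'struct insn' layout, the K_* kinds and the
function name looks_like_switch() are all hypothetical, invented for this
example. It recognizes only one style (bounds check, optional scaling,
indexed table fetch, optional add of a base, move to LR/CTR, branch) and
does not follow labels or jumps, so a real analyzer would need more cases.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction kinds. A real tool would decode
 * actual machine instructions into something like this first. */
enum kind {
    K_OTHER,         /* anything not modelled below */
    K_CMPL_IMM,      /* unsigned compare of src1 against an immediate */
    K_SHIFT_IMM,     /* dst = src1 << immediate (scaling the index) */
    K_LOAD_INDEXED,  /* dst = mem[src1 + src2]; src2 holds the index */
    K_ADD,           /* dst = src1 + src2 (e.g. offset + table base) */
    K_MOVE_TO_SPR,   /* LR or CTR = src1 */
    K_BRANCH_SPR     /* branch to LR or CTR */
};

/* Register numbers; -1 means "operand not used". */
struct insn { enum kind kind; int dst, src1, src2; };

/* Scan backwards from the branch at code[branch_idx], looking at no more
 * than 'window' earlier instructions. Returns true if the branch target
 * is fetched from a table whose index was bounds-checked. */
bool looks_like_switch(const struct insn *code, size_t branch_idx, size_t window)
{
    int tracked = -1;            /* register whose definition we seek */
    bool saw_spr_move = false;
    bool saw_table_load = false;

    if (code[branch_idx].kind != K_BRANCH_SPR)
        return false;

    size_t i = branch_idx;
    for (size_t n = 0; n < window && i > 0; n++) {
        const struct insn *p = &code[--i];

        if (!saw_spr_move) {
            /* First find what was moved into LR or CTR. */
            if (p->kind == K_MOVE_TO_SPR) {
                tracked = p->src1;
                saw_spr_move = true;
            }
            continue;
        }
        /* A bounds check on the index, seen after (i.e. located before)
         * the table fetch, completes the pattern. */
        if (p->kind == K_CMPL_IMM && saw_table_load && p->src1 == tracked)
            return true;
        if (p->dst != tracked)
            continue;            /* does not define the tracked register */

        switch (p->kind) {
        case K_LOAD_INDEXED:     /* table fetch: now track the index */
            saw_table_load = true;
            tracked = p->src2;
            break;
        case K_ADD:              /* assumption: src1 carries the value */
        case K_SHIFT_IMM:
            tracked = p->src1;
            break;
        default:                 /* defined some other way: give up */
            return false;
        }
    }
    return false;
}
```

With this sketch, a sequence such as "compare, conditional branch, shift,
indexed load, add, move to CTR, branch to CTR" is classified as a switch,
while an epilogue that reloads LR from the stack frame and then branches
to LR is not, because the tracked register is defined by a plain load.
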
The C-language runtime support library glibc does have some hand-written
code that uses a computed GOTO without this bounds check, because
other code checks the range implicitly; for example, "AND immediate with 7"
guarantees 0<=value<=7. In theory gcc could sometimes do a range analysis
of the index and then drop the explicit test, but I have not seen it.
Note that x86 has the same generic difficulty. "jmp *%ecx"
could be a switch or a return, and a 'ret' _also_ could be a switch
or a return. Both are "merely" a computed GOTO. Stylistically
nearly every 'ret' is a subroutine return, and nearly every "jmp *reg"
is not; but there are legitimate exceptions both ways.
--
John Reiser, jreiser@BitWagon.com