|
From: Julian S. <js...@ac...> - 2009-03-01 11:36:46
|
> Does anyone know if such a function, that returns the block between two > addresses exists? I think what you want to do isn't doable, at least it isn't the way you've been looking at it so far. There are two "times" that are important here: when the block is translated (translation-time) and when it is run (run-time). In order to do a minimal partitioning at translation time, you need to know all the branch targets in the program, and that's just not doable. Not only from a practical perspective but also because it's undecidable. IMO you should forget about trying to avoid the tail duplication at translation time; just let the translator do its thing. So that means you're restricted to trying to deduce the block boundaries from the run-time behaviour. Suppose, at run time, you observe a jump/call/return (any control flow transfer which isn't simply "fall through to the next instruction") from instruction A to instruction B. Then you can deduce that that A is the end of a basic block, and B is the start of one. As the program runs more and more, and you see more A->B transitions (for different A and/or B) you gradually partition the initially unknown code space into smaller and smaller blocks. Of course that only gives you the blocks observed for the specific paths that that run of the program takes. In order to find all the bbs statically (at translation time) you would need to know all the branch targets statically. But that's not something that any static analysis can guarantee to do, because of decidability limitations. So that means you're restricted to finding the block boundaries that are evident at run-time. Does that make any sense? J |