|
From: Michael E L. <me...@co...> - 2006-01-06 23:00:50
|
A tool I'm constructing needs to be able to detect returns from a
function. Detecting entry into a function is easy; I simply extended
the mechanism present in Lackey: during each BB, copy the IR preamble;
then, for the rest of the statements, switch on the tag and use
VG_(get_fnname_if_entry) to see if a statement is the first in a
function, and add instrumentation recording this fact.
The difficulty I'm having is detecting when a particular BB represents
a return from a function. After iterating through all the statements
in the BB, I examine the bb->jumpkind to see if it is Ijk_Ret. If it
is, I look at the first non-preamble statement in the BB and call
VG_(get_fnname)() using the Ist.IMark.addr of that statement as the
first argument. I add instrumentation recording the fact that this
function was returned from. The results are not what I expect (I can
forward details and a sample run if necessary; see below for the most
noticeable problem).
Is there a more straightforward way to notice a return from a function?
I'm under the impression that tools do not have access to the original
instruction stream, and thus I can't simply look for leave/ret
combinations.
Cheers,
Michael
For example, the results are often "off by one": given this program:
int a() { return 5; }
int main()
{
    int x = a();
    return 0;
}
my tool sees the return value of 'main' as 5 (when 5 is really a's
return value).
|
|
From: Julian S. <js...@ac...> - 2006-01-07 12:37:13
|
> Is there a more straightforward way to notice a return from a function?
It sounds like a reasonable first approach. Note that, in general,
getting 100% coverage is difficult due to the presence of tail calls
and the games the dynamic linker plays, but something like this should
pick up most of them.
> For example, the results are often "off by one": given this program:
>
> int a(){return 5;}
>
> int main()
> {
>     int x = a();
>     return 0;
> }
I suspect your tool might be missing calls rather than returns. Vex
will chase across call instructions under the right circumstances -
in other words, you might get a single block containing the start
of main and the start of a().
Try running with --vex-guest-chase-thresh=0 to disable this; does that
change the results?
Make friends with --trace-flags=, --trace-notbelow=, and 'objdump -d
my_executable' so you can figure out how your program is getting
mangled by vex. Useful --trace-flags= values are 10000000 and 10001000.
> I'm under the impression that tools do not have access to the original
> instruction stream, and thus I can't simply look for leave/ret
> combinations.
True. Putting the original instruction stream in would lead to people
writing architecture-dependent tools -- the point of the IR is to enable
tools to be decoupled from the underlying architecture.
J
|
|
From: Michael E L. <me...@co...> - 2006-01-07 21:04:07
|
On Sat, 7 Jan 2006, Julian Seward wrote:

> 100% coverage is difficult due to the presence of tail calls and
> the games the dynamic linker plays, but something like this should
> pick up most of them.

Yes... I think these sorts of things are actually the cause of the
discrepancies I'm seeing.

> I suspect your tool might be missing calls rather than returns. Vex
> will chase across call instructions under the right circumstances -
> in other words, you might get a single block containing the start
> of main and the start of a().

Thanks for clarifying exactly what "BB chasing" was... also, thanks
for the hints about trace-flags, etc. I've basically copied the
machinery in Lackey for detecting calls, and for many functions the
results (i.e., the reported number of times each is called) seem to be
correct. In any event, I will look at what Vex is seeing in more
detail.

> Try running with --vex-guest-chase-thresh=0 to disable this; does that
> change the results?

Yes, it does, but not for the better. Leaving that out, a function
that I know is called 5 times is correctly detected 5 times by the
instrumentation, but when this parameter is set as above, that value
jumps to 224 :)

vex-guest-chase-thresh    # of times seeing foo()
-------------------------------------------------
not included              5 (correct)
0                         224
1                         1
2                         5
3                         5
4                         5
10                        5

What is the effect of the value of vex-guest-chase-thresh?
coregrind/m_main.c indicates a range of 0 to 99 but no further info.

> writing architecture-dependent tools -- the point of the IR is to
> enable tools to be decoupled from the underlying architecture.

Fair enough. I had seen that assertion made on the list recently and
just wanted to confirm it was indeed the case.

Cheers,
Michael
|
|
From: Nicholas N. <nj...@cs...> - 2006-01-07 18:21:19
|
On Fri, 6 Jan 2006, Michael E Locasto wrote:

> A tool I'm constructing needs to be able to detect returns from a
> function.

Tracking function entry/exit is in general very difficult, much more
so than you might expect. Josef Weidendorfer's Callgrind tool does a
good job; in the Valgrind source tree there's a file called
docs/internals/tracking-fn-entry-exit.txt which describes the hoops it
has to jump through.

Nick
|
|
From: Michael E L. <me...@co...> - 2006-01-07 21:12:02
|
On Sat, 7 Jan 2006, Nicholas Nethercote wrote:

> Tracking function entry/exit is in general very difficult, much more
> so than you might expect.

Agreed. ;)

> Josef Weidendorfer's Callgrind tool does a good job, in the Valgrind
> source tree there's a file called
> docs/internals/tracking-fn-entry-exit.txt which describes the hoops
> it has to jump through.

Thanks for the pointer to this documentation. I had initially looked
at adapting Callgrind for my purposes, but figured it would be easier
to begin with one of the more basic tools and take a stab at the
straightforward way first. I suppose I can revisit it.

Cheers,
Michael
|
|
From: Julian S. <js...@ac...> - 2006-01-07 21:24:10
|
> > Try running with --vex-guest-chase-thresh=0 to disable this; does
> > that change the results?
>
> Yes, it does, but not for the better. Leaving that out, a function
> that I know is called 5 times is correctly detected 5 times by the
> instrumentation, but when this parameter is set as above, that value
> jumps to 224 :)
>
> vex-guest-chase-thresh    # of times seeing foo()
> -------------------------------------------------
> not included              5 (correct)
> 0                         224
> 1                         1
> 2                         5
> 3                         5
> 4                         5
> 10                        5
>
> What is the effect of the value of vex-guest-chase-thresh?
> coregrind/m_main.c indicates a range of 0 to 99 but no further info.

That's strange. You should investigate.

--vex-guest-chase-thresh=N (where the default N = 10, iirc) controls
the extent to which vex will continue disassembling across
unconditional branches and call instructions. When it sees such an
instruction, it will continue disassembling into the current IR block
(at the target of the insn, of course) providing that no more than N
instructions have already been disassembled into this IR block.

So if N=0, vex will never chase across such a branch, which means the
IR really does represent a straight-line piece of code. This
simplifies the problem of finding calls/returns. Recent callgrinds --
the one released with 3.1.0, at least (0.10.1?) -- force this value to
zero at startup for just such reasons.

You might also want to play with --vex-guest-max-insns. Setting this
to 1 will trash performance, but give you a simpler baseline scenario
in which each insn is translated into its own IR block.

J
|
|
From: Michael E L. <me...@co...> - 2006-01-07 22:09:13
|
> this IR block. So if N=0, vex will never chase across such a branch,
> which means the IR really does represent a straight line piece of
> code. This simplifies the problem of finding calls/returns.

Thank you for the explanation. I'm by no means ruling out errors in my
instrumentation code, so this sort of analysis should help isolate and
identify them.

> You might also want to play with --vex-guest-max-insns. Setting
> this to 1 will trash performance, but give you a simpler baseline
> scenario in which each insn is translated into its own IR block.

Very cool, thanks.

Cheers,
Michael
|
|
From: Josef W. <Jos...@gm...> - 2006-01-08 00:22:13
|
On Saturday 07 January 2006 19:21, Nicholas Nethercote wrote:

> On Fri, 6 Jan 2006, Michael E Locasto wrote:
>
> > A tool I'm constructing needs to be able to detect returns from a
> > function.
>
> Tracking function entry/exit is in general very difficult, much more
> so than you might expect. Josef Weidendorfer's Callgrind tool does a
> good job, in the Valgrind source tree there's a file called
> docs/internals/tracking-fn-entry-exit.txt which describes the hoops
> it has to jump through.

Hi,

I think it probably would be good to extract the minimal machinery for
robust call tracing from callgrind into a separate tool. It can not be
done as a core module, as it needs its own instrumentation, which
could conflict with a tool's instrumentation needs; so a tool writer
has to be aware of the instrumentation needs of call tracing.

Basically, it comes down to maintaining shadow call stacks for each
thread and signal handler. These are updated according to calls and
returns in the instruction stream, and, most importantly for
robustness, a synchronization of the shadow stacks with the stack
pointer is needed to handle longjmps (especially nasty: longjmps from
signal handlers back into normal code - quite common with ADA compiler
code).

The most compelling reason why callgrind has always (since support for
VG 3) used "--vex-guest-chase-thresh=0" is the following: you can not
see from the VEX code whether an unconditional jump was a real call or
only a jump, because that information is thrown away by chasing.
Julian: can we introduce VEX hints (translating to NOOPs) for such
cases, e.g. so that a tool can see what jump kind got thrown away by
chasing BBs?

It gets a little bit more complicated in callgrind because it does not
use the pure function name on top of the shadow stack for the event
count relation, but adds another context on top of the function name,
which includes the thread ID and possibly the call chain to this
function (or even ignores function calls by merging with the caller).

Another thing is how to interpret jumps between functions, which are
not allowed in high-level languages but nevertheless exist, e.g.
because of tail recursion optimization or handcrafted assembler (as in
the runtime linker). Either you interpret the jump as a "call", and
naturally get multiple returns in a row afterwards (which is
automatically handled by synchronization of the shadow stack with the
real stack pointer), or you simulate the jump by a return/call pair.

If the common case of the SP synchronization is done inline, I can
imagine that pure robust call tracing could get quite fast. The
"problem" with callgrind getting faster is the additional context
maintenance. And I suppose this mixture makes the callgrind source
quite difficult to understand...

Josef
|
|
From: Michael E L. <me...@co...> - 2006-01-08 23:33:43
|
Josef,

Thank you for the description of what the difficulties are...

On Sun, 8 Jan 2006, Josef Weidendorfer wrote:

> I think it probably would be good to extract the minimal things from
> callgrind for robust call tracing into a separate tool; it can not

I agree, and I'm looking at callgrind's source to see how I can adapt
it. The tool I'm trying to build can be thought of as an strace-style
tool, but for application and library functions rather than system
calls. For the moment, I'm concentrating on x86 and keeping track of
return values. I'd eventually like to expand monitoring to function
arguments as well.

> Another thing is how to interpret jumps between functions which are
> not allowed in high level languages, but neverless exist e.g. because
> of tail recursion optimization or handcrafted assembler (as in the
> runtime linker). Either you interpret the jump as "call", and
> naturally get multiple returns in a row afterwards (which is
> automatically handled by synchronization of the shadow stack with
> real stack pointer), or you simulate the jump by a return/call pair.

What about jumps that should be there but aren't? For example, I'm
guessing this sort of analysis doesn't hold for functions that have
been inlined as a result of compiler optimization. But then again, I
don't suppose anything can detect that situation from just the
instruction stream.

> If the common case for the SP synchronization is done inline, I can
> imagine that pure robust call tracing could get quite fast. The
> "problem" with

It seems like it can, and I think something employing it would be
useful for regression testing, among other things.

> callgrind getting faster is the further context maintaining. And I
> suppose this mixture makes callgrind source quite difficult to
> understand...

I'm working my way through it ;)

Cheers,
Michael
|
|
From: Josef W. <Jos...@gm...> - 2006-01-09 10:29:52
|
On Monday 09 January 2006 00:33, Michael E Locasto wrote:

> The tool I'm trying to build can be thought of as an strace-style
> tool, but for application and library functions rather than system
> calls.

Look at the output of "callgrind -ct-verbose=1 ..". It writes a line
with correct indentation on calls to functions. Returns are shown
implicitly via the indentation, but it could be easily modified.

> For the moment, I'm concentrating on x86 and keeping track of return
> values. I'd eventually like to expand monitoring to function
> arguments as well.

For showing return values and function parameters (types/names), you
need to write a DWARF parser for this type of debug information, or
use libdwarf. The latter is probably not easy, as tools should use
valgrind's own libc.

> > Another thing is how to interpret jumps between functions which are
> > not allowed in high level languages, but neverless exist e.g.
> > because of tail recursion optimization or handcrafted assembler (as
> > in the runtime linker). Either you interpret the jump as "call",
> > and naturally get multiple returns in a row afterwards (which is
> > automatically handled by synchronization of the shadow stack with
> > real stack pointer), or you simulate the jump by a return/call
> > pair.
>
> What about jumps that should be there but aren't?

That is a matter of debug info. If one instruction maps to source
file:100, and the next to file:200, there is obviously a "jump" in the
source code from line 100 to 200, and the compiler has linearized
this. I currently do not check for this in callgrind, which is the
reason that jump-arrow annotation for source currently is not really
useful.

> For example, I'm guessing this sort of analysis doesn't hold for
> functions that have been inlined as a result of compiler
> optimization. But then again, I don't suppose anything can detect
> that situation from just the instruction stream.

The compiler is allowed to do everything.

Inlining simply can not be shown as calling a function, because
instructions can be reordered, duplicated, etc. The inlined function
has to be seen as part of the calling function, and perhaps the only
useful thing here is to show jumps among source lines.

There can be a huge gap between source and assembler. E.g. look at
OpenMP code generated by the Intel compiler. You will see a totally
different control flow, and the compiler generates new functions for
parallel regions.

Josef
|
|
From: Nicholas N. <nj...@cs...> - 2006-01-09 16:40:00
|
On Mon, 9 Jan 2006, Josef Weidendorfer wrote:

>> What about jumps that should be there but aren't?
>
> That is a matter of debug info. If one instruction maps to source
> file:100, and the next to file:200, there is obviously a "jump" in
> the source code from line 100 to 200, and the compiler has
> linearized this.

That's a reasonable heuristic, but it won't always be true; there
could just be a big comment.

> The compiler is allowed to do everything. Inlining simply can not be
> shown as calling a function, because instructions can be reordered,
> duplicated etc. The inlined function has to be seen as part of the
> calling function, and perhaps the only useful thing here is to shown
> jumps among source lines.
>
> There can be a huge gap between source and assembler. E.g. look at
> OpenMP code generated by the Intel compiler. You will see a totally
> different control flow, and the compiler generates new functions
> for parallel regions.

Yes, it's a hard problem.

Nick
|
|
From: Josef W. <Jos...@gm...> - 2006-01-09 16:59:06
|
On Monday 09 January 2006 17:39, you wrote:

> On Mon, 9 Jan 2006, Josef Weidendorfer wrote:
>
> >> What about jumps that should be there but aren't?
> >
> > That is a matter of debug info. If one instruction maps to source
> > file:100, and the next to file:200, there is obviously a "jump" in
> > the source code from line 100 to 200, and the compiler has
> > linearized this.
>
> That's a reasonable heuristic, but it won't always be true; there
> could just be a big comment.

Right.

This reminds me of a wish-list item for the tool API: iterator
interfaces for debug info, e.g. over the function symbols defined in a
segment and over the source lines of functions. This would make it
possible to get all the source lines which are attributed to
instructions, even for code which is never executed. That would ease
the above problem, and would be good for a code coverage module. Or is
there currently a way for a tool to get the source line of
instructions which were never executed?

> > There can be a huge gap between source and assembler. E.g. look at
> > OpenMP code generated by the Intel compiler. You will see a totally
> > different control flow, and the compiler generates new functions
> > for parallel regions.
>
> Yes, it's a hard problem.
>
> Nick

Yes. It would be nice for a visualization tool to be able to show the
mapping from source to assembler in a meaningful, understandable way.
But everything beyond using the debug line info seems to be
problematic. I have no problem showing the exact control flow in the
assembler annotation of KCachegrind (via jump arrows), but mapping
this to jumps in the source is difficult. Currently, KCachegrind does
something like this, but it is just confusing. And that is the reason
that "--collect-jumps=yes" is an option in callgrind.

Perhaps the best approach is to try to detect loop nests, and see if
this can be mapped better to source...

Josef
|