From: Geoff S. <gs...@us...> - 2006-02-25 01:03:11
I took a look at the valgrind --tool=lackey --trace-mem=yes output. Given the trace-mem implementation, adding instruction tracing here seems to make sense. So I'd like to propose two alternatives.

Alternative A. A separate itrace tool, as I proposed in
http://sourceforge.net/mailarchive/message.php?msg_id=14760370

Alternative B. Dovetailing itrace into lackey as follows.

The biggest impact on lackey would be adding the option --trace-instr=, and changing the output format to the record-oriented one I proposed before. This would reduce the volume of output while keeping postprocessing easy.

I would change chapter 9 of the user manual as follows:

------------------------------------------------------

9. LACKEY, a very simple profiler and trace tool

To use this tool, specify --tool=lackey on the Valgrind command line.

Lackey is a simple Valgrind tool that provides a simple trace capability and some basic program measurement. It adds quite a lot of simple instrumentation to the program's code. It is primarily intended to be of use as an example tool.

_Tracing Output_

Tracing output (if specified) is generated during program execution. Lackey can provide a trace of each instruction that is executed in your program, a trace of memory accesses, or both, in which case the memory accesses are shown with the instruction that caused them.

_Profiling Output_

Lackey reports profiling information at the conclusion of program execution. The report includes:

o The number of calls to the function specified by the --fnname switch.

o The number of conditional branches encountered and the number and proportion of those taken.

o Statistics about the amount of work done during the execution of the client program:

  o The number of basic blocks entered and completed by the program. Note that due to optimisations done by the JIT, this is not really an accurate value.

  o The number of guest (x86, amd64, ppc, etc.) instructions and IR statements executed.
    IR is Valgrind's RISC-like intermediate representation via which all instrumentation is done.

  o Ratios between some of these counts.

9.2 Command-line options specific to lackey

--fnname=<name>
    Specifies the function to profile (and trace). The default is _dl_runtime_resolve(), the function in glibc's dynamic linker that resolves function references to shared objects.

--trace-extent=all,function,calltree
    Specifies whether to trace the entire program (all), only the instructions in the function specified by --fnname (function), or all instructions from entry into that function until it returns (calltree). The default is calltree.

--trace-instr=yes
    Enables tracing of instructions.

--trace-mem=yes
    Enables tracing of memory accesses.

--detailed-counts=yes
    Profiling output includes a table with counts of loads, stores and ALU operations for various types of operands. The types are identified by their IR name ("I1" ... "I128", "F32", "F64", and "V128"). Also prints the exit code of the client program.

9.3 Trace Output

The trace output consists of an ASCII-formatted trace of the instructions executed. The format is very simple to facilitate post-processing. The first character of each line ("record") indicates what kind of data is included in the line.

Trace output is provided only if enabled. --trace-instr=yes enables H, I, J, and G records. --trace-mem=yes enables H, R, and W records.

Example ("..." indicates skipped records)

==14778== valgrind-itrace, Instruction and memory tracer.
==14778== Copyright (C) 2005, and GNU GPL'd.
==14778== Using LibVEX rev 1471, a library for dynamic binary translation.
==14778== Copyright (C) 2004-2005, and GNU GPL'd, by OpenWorks LLP.
==14778== Using valgrind-3.1.0, a dynamic binary instrumentation framework.
==14778== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al.
==14778== For more details, rerun with: -v
==14778==
H valgrind-itrace
J 00244C67 55 ; dl_start
...
J 08048394 55 ; main
W BE8619A8 BE861A08
I 89E5
I 83EC08
I 83E4F0
I B800000000
I 83C00F
I 83C00F
I C1E804
I C1E004
I 29C4
I E8C7FFFFFF
W BE86198C 080483B5
G
J 0804837C 55 ; print
W BE861988 BE8619A8
I 89E5
...

Record definitions

H whatever ...
    Indicates the start of the trace, possibly with additional data in no specific format.

J aaaa xxxx [; symbol]
    Instruction-with-address record. aaaa is the address of the instruction, xxxx is a byte dump of the instruction itself. All values are in hex. Optionally, a symbol name associated with the address may be provided. Note that "J" does not imply that a branch occurred; it merely indicates that the record includes the address of the instruction executed.

I xxxx
    Instruction record for an instruction that immediately follows the previous instruction. Its address can be determined by adding the length of the preceding instruction to that instruction's address.

G
    Indicates a gap in the trace. This will happen, for instance, when valgrind simulates a system call via an int 0x80 (on x86) or sc (on ppc) instruction, or when a branch occurs to code not being traced.

R aaaaaaaa rrrr
W aaaaaaaa wwwwwwww
    Indicates that the previous instruction caused a read of the bytes rrrr at the address aaaaaaaa, or a write of the bytes wwwwwwww. The length of the read or write is indicated by the number of bytes shown. R and W records occur in logical order; for instance, an increment-memory instruction will show the read followed by the write.

9.4 Limitations

Lackey runs quite slowly, especially when --detailed-counts=yes is specified. It could be made to run a lot faster by doing a slightly more sophisticated job of the instrumentation, but that would undermine its role as a simple example tool. Hence we have chosen not to do so.

Memory tracing cannot catch every load and store access.
See section 3.3.7 of Nicholas Nethercote's PhD dissertation "Dynamic Binary Analysis and Instrumentation", 2004, for details about the few loads and stores that it misses, and other caveats about the accuracy of the memory trace.

When --trace-extent=calltree is specified, tracing stops only when the function returns to its caller. If the function never returns to its caller, tracing does not stop. Examples include raising C++ or Ada exceptions and calling longjmp. Also, signals and task switching may cause unexpected code to be included in the trace.
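
To illustrate the "keeping postprocessing easy" point, here is a rough Python sketch of how the proposed records could be consumed. It is not part of lackey or of any existing tool; the names Instr and parse_trace are made up for this example, and it assumes one record per line exactly as described in the record definitions. The address of each I record is rebuilt from the preceding instruction's address and length, and R/W records are attached to the instruction before them. The sample data is the main/print excerpt from the example in 9.3.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class Instr:
    """One executed instruction (hypothetical type, for illustration only)."""
    addr: int                       # address of the instruction
    code: bytes                     # raw instruction bytes
    symbol: Optional[str] = None    # symbol name from a J record, if any
    # (kind, address, data-as-hex-string) for each R/W record that follows
    accesses: List[Tuple[str, int, str]] = field(default_factory=list)


def parse_trace(lines: Iterable[str]) -> List[Optional[Instr]]:
    """Return executed instructions in order; None marks a G (gap) record."""
    out: List[Optional[Instr]] = []
    next_addr: Optional[int] = None  # address the next I record would have
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("=="):   # blank or Valgrind banner line
            continue
        kind, _, rest = line.partition(" ")
        if kind == "H":                          # start of trace, free-form data
            continue
        if kind == "G":                          # gap: address is unknown again
            out.append(None)
            next_addr = None
        elif kind == "J":                        # instruction with explicit address
            body, _, sym = rest.partition(";")
            addr_hex, code_hex = body.split()
            ins = Instr(int(addr_hex, 16), bytes.fromhex(code_hex),
                        sym.strip() or None)
            out.append(ins)
            next_addr = ins.addr + len(ins.code)
        elif kind == "I":                        # instruction following the last one
            if next_addr is None:
                raise ValueError("I record without a preceding J or I record")
            ins = Instr(next_addr, bytes.fromhex(rest.strip()))
            out.append(ins)
            next_addr = ins.addr + len(ins.code)
        elif kind in ("R", "W"):                 # memory access of the last instruction
            if not out or out[-1] is None:
                raise ValueError(f"{kind} record without a preceding instruction")
            addr_hex, data_hex = rest.split()
            out[-1].accesses.append((kind, int(addr_hex, 16), data_hex))
        else:
            raise ValueError(f"unknown record type: {line!r}")
    return out


SAMPLE = """\
H valgrind-itrace
J 08048394 55 ; main
W BE8619A8 BE861A08
I 89E5
I 83EC08
I 83E4F0
I B800000000
I 83C00F
I 83C00F
I C1E804
I C1E004
I 29C4
I E8C7FFFFFF
W BE86198C 080483B5
G
J 0804837C 55 ; print
W BE861988 BE8619A8
I 89E5
"""

trace = parse_trace(SAMPLE.splitlines())
call = next(i for i in trace if i is not None
            and i.code == bytes.fromhex("E8C7FFFFFF"))
print(f"{call.addr:08X}")  # 080483B0
```

The reconstructed address of the call (080483B0) plus its length of 5 bytes gives 080483B5, which is the return address shown in the W record that follows it, so the I records carry enough information to recover every address between a J record and the next gap.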