From: David S. <ds...@al...> - 2002-07-10 17:59:10
John Levon wrote:
>
> On Wed, Jul 10, 2002 at 12:48:44PM +0200, David wrote:
>
> > Do you have any plans to add profiling of functions in the call
> > stack for things like "total time in function + children" or call
> > graph information?
>
> If you have an efficient way of storing call chains, and a reasonably
> robust way of getting this working in a system where some code may be
> missing frame pointers, please tell ...
>

I'm afraid I don't. But then, I haven't written an advanced profiler
myself... I thought you might :-)

On the other hand, if it *can* be made to work by compiling the relevant
parts of the system with frame pointers, then it might still be usable,
provided there is a reliable and quick way to assess the quality of the
backtraces. Then the program could give a message like "Sorry, 59% of
backtraces failed, no call graphs for you!" if it doesn't work.

But I'm speculating... I don't really know how to check that.

> > I tried doing this for myself, and managed a small hack for the
> > "total time..." feature. It _seemed_ to work well (using rtc
> > and a whole system compiled with frame pointers).
>
> What changes did you make ?
>

Well, I hope you don't expect anything special... this is about it.
And it absolutely requires frame pointers!
In module.c:

    void regparm3 op_do_profile_0(uint cpu, struct pt_regs *regs, int ctr)
    {
        /* this is the original op_do_profile */
    }

    struct frame {
        unsigned long ebp; /* next frame */
        unsigned long ret; /* return address */
    };

    /* new op_do_profile with backtrace */
    void regparm3 op_do_profile(uint cpu, struct pt_regs *regs, int ctr)
    {
        unsigned long old_eip = regs->eip;
        unsigned long old_ebp = regs->ebp;
        struct frame *frame = (struct frame *)regs->ebp;

        /* time in the function itself     = ctr 0 */
        /* time in the function + children = ctr 1 */
        op_do_profile_0(cpu, regs, 0);
        op_do_profile_0(cpu, regs, 1);

        while ( frame ) {
            /* copy the caller's eip/ebp out of the sampled stack frame */
            if ( get_user(regs->eip, &(frame->ret)) ) break;
            if ( get_user(regs->ebp, &(frame->ebp)) ) break;
            op_do_profile_0(cpu, regs, 1);
            /* follow the ebp that get_user() just fetched instead of
               dereferencing the user pointer directly */
            frame = (struct frame *)regs->ebp;
        }

        /* restore the registers that were overwritten during the walk */
        regs->eip = old_eip;
        regs->ebp = old_ebp;
    }

Nothing surprising there. What *did* surprise me was that when I ran
a small test program the results were correct. :)

Then I tried profiling a "real" program (KDE/konqueror) and, while
I can't really check those numbers, they still looked reasonable.

So my thought was, could it really be this easy? I don't know, I'm
very much an amateur in this area. Is it possible that this could be
turned into something usable?

/David
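
The frame-pointer walk above, together with the earlier idea of reporting how many backtraces failed, can be tried out in user space. What follows is only a sketch under stated assumptions, not oprofile code: the stack is simulated as an array of words, function addresses are replaced by small integer ids, and every name in it (walk_sample, failed_percent, struct counts, struct walk_stats) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 64 /* size of the simulated stack, in words (assumption) */
#define MAX_DEPTH   32 /* give up on chains deeper than this (assumption) */

/* Per-function counters, mirroring ctr 0 and ctr 1 in the hack above. */
struct counts {
    unsigned long self;      /* samples taken in the function itself */
    unsigned long inclusive; /* samples in the function or its children */
};

/* Bookkeeping for judging backtrace quality. */
struct walk_stats {
    unsigned long samples; /* total samples walked */
    unsigned long failed;  /* walks that stopped on a bad frame */
};

/*
 * Simulated frame layout: stack[fp] is the saved frame pointer (the index
 * of the caller's frame, 0 ends the chain) and stack[fp + 1] is the return
 * address, here simply the id of the calling function.
 * Returns 0 for a clean walk and -1 if the chain looked corrupt.
 */
static int walk_sample(const unsigned long *stack, size_t words,
                       unsigned long pc, unsigned long fp,
                       struct counts *funcs, size_t n_funcs,
                       struct walk_stats *st)
{
    unsigned long prev_fp = 0;
    int depth = 0;

    assert(stack != NULL && funcs != NULL && st != NULL);
    st->samples++;

    if (pc >= n_funcs) {          /* sampled address is not a known function */
        st->failed++;
        return -1;
    }
    funcs[pc].self++;
    funcs[pc].inclusive++;

    while (fp != 0) {
        /* A sane chain stays in bounds and moves toward older frames. */
        if (depth++ >= MAX_DEPTH || fp + 1 >= words || fp <= prev_fp) {
            st->failed++;
            return -1;
        }
        unsigned long ret = stack[fp + 1];
        if (ret >= n_funcs) {     /* return address is not a known function */
            st->failed++;
            return -1;
        }
        funcs[ret].inclusive++;   /* the caller gets "time in children" */
        prev_fp = fp;
        fp = stack[fp];
    }
    return 0;
}

/* Percentage of walks that failed, rounded down; 0 if nothing was sampled. */
static unsigned failed_percent(const struct walk_stats *st)
{
    return st->samples ? (unsigned)(st->failed * 100 / st->samples) : 0;
}
```

A reporting tool could refuse to print call graphs once failed_percent() exceeds some threshold. Like the hack above, this sketch counts a recursive function once per frame in its inclusive total, and a failed walk still keeps the frames that were counted before the failure.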