From: laurent <la...@pu...> - 2009-02-03 19:24:50
I don't know which particular thing I tried eventually made it work. I did mainly three things:

1) Compiled my executable using only static libraries (normal debug builds link modules as dynamic libraries to speed up link times). My reasoning was that oprofile was perhaps mistaking them for kernel libraries and filtering them out because of the image filter.
2) Used the oprof_start script to launch the daemon instead of relying on manual commands. One of the things that did for me was detect that the values for BUF_SIZE and CPU_BUF_SIZE were out of range.
3) Increased the count value to a hundredth of the clock frequency, so as to produce about 100 samples per second of execution.

Eventually, after much frustration and wasted time, I was able to obtain a somewhat usable report.

To answer your question, my executable was running for about 40-50 seconds. This was way too much and I needed to bring that down to under 8s. After fixing a problem I was able to spot with oprofile I am now down to 15s. Still a way to go, but I've run out of easy options, so I'll leave it at that for now.

The reason I used the --session-dir option is that I initially thought my problems were caused by a disk space issue on the /var partition, which turned out to be wrong. Using oprof_start solved that as it apparently doesn't handle the --session-dir option.

Btw, opgprof did not work for me: for some mysterious reason it refuses to consider samples contributed by submodules of my application which are compiled into a separate set of directories from the final binary. I tried using the -R option and passing the full set of library paths as the argument to the --image-path option, to no avail. That could be the source of my other problems (see below) but I really have no clue how to work around this. Ditto for op2calltree, which does not produce proper call-graph information for kcachegrind, but that's not your problem.
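For reference, the sample-rate arithmetic behind point 3 can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate, not anything oprofile itself provides: the function name expected_samples is made up for illustration, the 1862 MHz figure is the estimated speed from the opreport header quoted further down, and the 45 s run time is simply the middle of the 40-50 s range. It also assumes the CPU is unhalted for the whole run and that every sample lands in the profiled image, so it gives an upper bound rather than a prediction.

```python
def expected_samples(clock_hz, count, busy_seconds):
    """Upper-bound estimate of samples for a cycle-based event.

    One sample is taken every `count` unhalted clock cycles, so a CPU
    that stays busy for `busy_seconds` yields clock_hz * busy_seconds / count.
    """
    return clock_hz * busy_seconds / count


CLOCK_HZ = 1862e6  # estimated speed from the opreport header (assumption: constant)
RUN_SECONDS = 45.0  # assumed: midpoint of the 40-50 s run time

# 10,000,000 is the count in the original daemonrc;
# 18,620,000 is one hundredth of the clock frequency.
for count in (10_000_000, 18_620_000):
    per_second = expected_samples(CLOCK_HZ, count, 1.0)
    total = expected_samples(CLOCK_HZ, count, RUN_SECONDS)
    print(f"count={count}: ~{per_second:.0f} samples/s, ~{total:.0f} samples in {RUN_SECONDS:.0f} s")
```

Under these assumptions both count values should give thousands of samples for a run of this length, so a report containing only a handful of samples points at a collection or filtering problem rather than at the count value.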
Another thing that got in my way was that opreport cannot sort reports by total samples spent in self + children, as opposed to the number of samples spent in self, which is the only available option. Of course it should be possible to do that through kcachegrind and gprof, but so far I have not been able to make them work with oprofile. It should be fairly easy to fix, I think, since that information is actually available in the second column of the last line of each section of a report generated like so:

    opreport -% -a -s sample -c -l <path-to-my-executable> > opreport.callgraph.rpt

Someday, I might even bring myself to write a perl or python script to post-process the report and sort it like I want.

Anyway, enough ranting. You can close this thread as I consider this problem solved. Thanks for your help.

On Tue, 2009-02-03 at 12:47 -0600, Maynard Johnson wrote:
> laurent wrote:
> > Hi All,
> >
> > I've tried hard but have not been able to report more than a single
> > sample using oprofile with the call-graph option. My
> > current /etc/oprofile/daemonrc looks like so:
> > SESSION_DIR=/var/lib/oprofile
> > CHOSEN_EVENTS_0=CPU_CLK_UNHALTED:10000000:0:1:1
> > NR_CHOSEN=1
> > SEPARATE_LIB=0
> > SEPARATE_KERNEL=0
> > SEPARATE_THREAD=0
> > SEPARATE_CPU=0
> > VMLINUX=none
> > IMAGE_FILTER=<path to a debug-compiled version of the executable I want
> > to profile>
> > BUF_SIZE=20000000
> > CPU_BUF_SIZE=2000000
> > CALLGRAPH=200
> > XENIMAGE=none
> >
> > I've tried all kinds of variations of the options and action sequence
> > around the following:
> > opcontrol --reset --session-dir=/home/oprofile
> > opcontrol --init
> > opcontrol --start --verbose --session-dir=/home/oprofile-session
> > <running my application>
> > opcontrol --stop
> > opreport -gdf --session-dir=/home/oprofile_session | op2calltree
> Is your --session-dir value here a typo? It's different from what you specify
> on the --reset.
> Do you really *need* to use something other than the default
> session-dir (/var/lib/oprofile)? If not, I recommend you do not specify
> --session-dir, since it can lead to confusion if you don't use it consistently
> with all oprofile tools.
> > kcachegrind oprof.out.unnamed
> >
> > I'm running:
> >> uname -r
> > 2.6.24-23-generic
> >> opcontrol --version
> > opcontrol: oprofile 0.9.3 compiled on Jan 26 2008 00:51:18
> >> /usr/lib/kde4/bin/kcachegrind --version
> > Qt: 4.3.4
> > KDE: 4.0.3 (KDE 4.0.3)
> > KCachegrind: 0.5.0kde
> >
> > under Ubuntu 8.04.1 Hardy Heron on a Compaq nc6120 laptop.
> >
> > What makes me think that there is only one sample in the session
> > directory is the following output while running op2calltree:
> >> opreport --image-path=<path-to-my-executable> --session-dir=/home/oprofile_session -gdf | op2calltree 2>&1 | more
> > warning: [vdso] (tgid:24315 range:0xb7f5a000-0xb7f5b000) could not be found.
> > Description:
> > CPU: PIII, speed 1862 MHz (estimated)
> > Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mask of 0x00 (No unit mask) count 10000000
> >
> > Events: CPU_CLK_UNHALTED
> >
> >
> > App unnamed
> > Symbol (tgid:24315 range:0xb7f5a000-0xb7f5b000) (no symbols) (Image [vdso])
> > Sample 1: 00000411 (???:0): 2
> > Symbol GetLexeme (Image <my executable>)
> > Sample 1: 0827bb79 (tclParseExpr.c:1534): 1
> > Symbol HashString (Image <my executable>)
> > Sample 1: 08270445 (tclLiteral.c:804): 1
> > Symbol PyCFunction_New (Image <my executable>)
> > Sample 1: 0805b6c7 (methodobject.c:49): 1
> > Symbol TclExecuteByteCode (Image <my executable>)
> > Sample 1: 08252148 (tclExecute.c:844): 1
> > <snip>
> If I'm interpreting this output correctly, I count a total of 6 samples
> attributed to your app. Please take op2calltree out of the picture and just
> show us the raw opreport data. How long does your app run? How many samples
> would you predict with a count value of 10 million CPU_CLK_UNHALTED events?
>
> -Maynard
> > Generating dump for App 'unnamed'...
> >
> > What am I doing wrong?
> >
> > _______________________________________________
> > oprofile-list mailing list
> > opr...@li...
> > https://lists.sourceforge.net/lists/listinfo/oprofile-list

Laurent.
--
"Pulsic Limited, a company registered in England and Wales (company number 384068) with its registered office at 150 Park Avenue, Aztec West, Bristol BS32 4UB, England."