From: RocChen <sin...@gm...> - 2013-04-27 08:45:18
Thank you for your enthusiastic guidance, sir.

With the command list below I now get nice profiling results, but there is another issue that I cannot understand (described further down).

Command list:

> rm -rf /var/lib/oprofile
> rm -rf /root/.oprofile
> opcontrol --init
> opcontrol --no-vmlinux
> opcontrol --setup --event=CPU_CYCLES:10000 --separate=lib,kernel
> opcontrol --start --image=all
> ./array
> ./../mpeg2dec/oprofile_results/mpeg2dec -b ../mpeg2dec/input_base/input_base_4CIF_96bps.mpg -o3 output_base_4CIF_96bps_%03d
> opcontrol --dump
> opreport -l array
> opreport -l ./../mpeg2dec/oprofile_results/mpeg2dec

Nice results:

[root]$ opreport -l array
Using /var/lib/oprofile/samples/ for samples directory.
warning: /no-vmlinux could not be found.
CPU: ARM Cortex-A9, speed 1998 MHz (estimated)
Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 10000
samples  %        image name    symbol name
70547    95.8858  array         slow_multiply
1181      1.6052  array         fast_multiply
983       1.3361  array         main
828       1.1254  no-vmlinux    /no-vmlinux
29        0.0394  ld-2.13.so    /lib/arm-linux-gnueabi/ld-2.13.so
6         0.0082  libc-2.13.so  /lib/arm-linux-gnueabi/libc-2.13.so

[root]$ opreport -l mpeg2decode
Using /var/lib/oprofile/samples/ for samples directory.
warning: /no-vmlinux could not be found.
CPU: ARM Cortex-A9, speed 1998 MHz (estimated)
Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 10000
samples  %        image name   symbol name
23899    16.9694  mpeg2decode  conv420to422
23648    16.7912  mpeg2decode  store_ppm_tga
16695    11.8542  mpeg2decode  conv422to444
16072    11.4119  mpeg2decode  Decode_Picture
15934    11.3139  mpeg2decode  Fast_IDCT
15133    10.7451  no-vmlinux   /no-vmlinux
14614    10.3766  mpeg2decode  putbyte
9260      6.5750  mpeg2decode  form_component_prediction
1631      1.1581  mpeg2decode  Flush_Buffer
825       0.5858  mpeg2decode  Decode_MPEG2_Intra_Block
481       0.3415  mpeg2decode  form_prediction.constprop.0
415       0.2947  mpeg2decode  Decode_MPEG2_Non_Intra_Block
304       0.2159  mpeg2decode  Get_Bits
200       0.1420  mpeg2decode  Show_Bits
195       0.1385  mpeg2decode  macroblock_modes
........

(*) However, there is an issue that I can hardly understand. After I repeat the command list above several times (same event, same sample rate, exactly the same command sequence), it may give very different results: the total number of samples is only a few tens, and there are no samples at all for the user application's functions. I would expect the profile to be at least similar to the nice results above.

Bad results:

[root]$ opreport -l array
Using /var/lib/oprofile/samples/ for samples directory.
warning: /no-vmlinux could not be found.
CPU: ARM Cortex-A9, speed 1998 MHz (estimated)
Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 10000
samples  %        image name    symbol name
63       92.6471  no-vmlinux    /no-vmlinux
5         7.3529  libc-2.13.so  /lib/arm-linux-gnueabi/libc-2.13.so
(0)-(linaro-chenp)-[Sat Apr 27][15:06:36]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/workspace]
[root]$ opreport -l mpeg2decode
Using /var/lib/oprofile/samples/ for samples directory.
warning: /no-vmlinux could not be found.
CPU: ARM Cortex-A9, speed 1998 MHz (estimated)
Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 10000
samples  %        image name    symbol name
69       95.8333  no-vmlinux    /no-vmlinux
3         4.1667  libc-2.13.so  /lib/arm-linux-gnueabi/libc-2.13.so

Besides, once this happens, every subsequent profiling run with oprofile gives the same kind of result. I can get the nice profiling results again only after I reboot the system.

I conducted a dozen experiments. The number of repetitions of the above command sequence after which the results turn 'bad' (only tens of kernel samples) is not regular. It just suddenly ends up in this state after several profiling runs.

I hope I have described the problem clearly.

Regards

On Sat, Apr 27, 2013 at 12:07 AM, Maynard Johnson <may...@us...> wrote:
> On 04/26/2013 09:25 AM, RocChen wrote:
> >
> > Very sorry for forgetting to cc the mailing list~
> >
> >
> > ---------- Forwarded message ----------
> > From: *RocChen* <sin...@gm... <mailto:sin...@gm...>>
> > Date: Fri, Apr 26, 2013 at 10:14 PM
> > Subject: Re: no sample when profiling ARM Cortex-A9 with Linux kernel 3.3
> > To: Koteswararao Nelakurthi <kne...@mv... <mailto:kne...@mv...>>
> >
> >
> > Hai, Koteswararao
> >
> > Thanks for the quick reply.
> >
> > Here is my profiling procedure (opcontrol: oprofile 0.9.7 compiled on Apr 26 2013 08:47:51):
> >
> > [root]$ rm -rf /var/lib/oprofile/
> > (0)-(linaro)-[Fri Apr 26][22:03:05]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ rm -rf ~/.oprofile/
> > (0)-(linaro)-[Fri Apr 26][22:03:13]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opcontrol --init
> > (0)-(linaro)-[Fri Apr 26][22:03:29]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opcontrol --setup --event=CPU_CYCLES:1000 --separate=all --no-vmlinux
>
> The '--separate=all' option categorizes your samples by kernel, library,
> thread, and CPU. The 'thread' and 'cpu' categorization is rarely needed
> and just leads to confusion when trying to generate reports. Use
> '--separate=lib,kernel' instead.
>
> CPU_CYCLES:1000 is a very high sampling rate, which is undoubtedly why you
> get the "WARNING! The OProfile kernel driver reports sample buffer
> overflows" message. I recommend a count of 100000 (or maybe even higher)
> versus 1000.
>
> > (0)-(linaro)-[Fri Apr 26][22:04:20]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opcontrol --start --image=../mpeg2-oprofiling/src/mpeg2dec/mpeg2decode
>
> For starters, don't use the '--image' option. Please revert that with
> 'opcontrol --image=all' and try again.
>
> -Maynard
>
> > Using 2.6+ OProfile kernel interface.
> > Using log file /var/lib/oprofile/samples/oprofiled.log
> > Daemon started.
> > Profiler running.
> > (0)-(linaro)-[Fri Apr 26][22:05:25]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ time ./../mpeg2-oprofiling/src/mpeg2dec/mpeg2decode -b input_base_4CIF_96bps.mpg -o3 output_base_4CIF_96bps_%03d
> > saving output_base_4CIF_96bps_000.ppm
> > saving output_base_4CIF_96bps_001.ppm
> > saving output_base_4CIF_96bps_002.ppm
> > saving output_base_4CIF_96bps_003.ppm
> > saving output_base_4CIF_96bps_004.ppm
> > saving output_base_4CIF_96bps_005.ppm
> > saving output_base_4CIF_96bps_006.ppm
> > saving output_base_4CIF_96bps_007.ppm
> > saving output_base_4CIF_96bps_008.ppm
> >
> > real 0m1.593s
> > user 0m1.490s
> > sys 0m0.100s
> > (0)-(linaro)-[Fri Apr 26][22:05:54]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opcontrol --dump
> > (0)-(linaro)-[Fri Apr 26][22:06:06]-[.=~/Workspace/Zynq/testbench-zynq/hotcode-profiling/mediabench2_video/mpeg2dec/oprofile_results]
> > [root]$ opreport -l ../mpeg2-oprofiling/src/mpeg2dec/mpeg2decode
> > WARNING! The OProfile kernel driver reports sample buffer overflows.
> > Such overflows can result in incorrect sample attribution, invalid sample
> > files and other symptoms. See the oprofiled.log for details.
> > You should adjust your sampling frequency to eliminate (or at least minimize)
> > these overflows.
> > error: no sample files found: profile specification too strict ?
> >
> > ******************************************
> > I reviewed the dmesg log for anything related to the PMU:
> >
> >> dmesg | grep PMU
> > hw perfevents: enabled with ARMv7 Cortex-A9 PMU driver, 7 counters available
> >> dmesg | grep pmu
> > registering platform device 'arm-pmu' id 0
> >
> >
> > On Fri, Apr 26, 2013 at 9:52 PM, Koteswararao Nelakurthi <kne...@mv... <mailto:kne...@mv...>> wrote:
> >
> > >>opcontrol --start --image=<application name>
> > Provide the binary application name.
> > ex. opcontrol --start --image=array
> >
> > Regards
> > koteswararao
> >
> >
> > On Fri, Apr 26, 2013 at 7:17 PM, Koteswararao Nelakurthi <kne...@mv... <mailto:kne...@mv...>> wrote:
> >
> > Dear RocChen,
> >
> > I hope I understood your situation. From the log you showed
> > in your previous mail, you have successfully updated the oprofile
> > userland tool.
> > Coming to profiling of applications, you need to do it as below:
> >
> > rm -rf /var/lib/oprofile/
> > rm -rf /root/.oprofile
> >
> > opcontrol --start --image=<application name>
> >
> > gcc -g <applicationname> -o <binary application name>
> > ex: gcc -g array.c -o array
> >
> > ./applicationame
> > ex. ./array
> >
> > opcontrol --dump
> >
> > opreport
> >
> > array is a sample application which simply does some multiplication etc.
> > You can use any application that puts load on the CPU, so that the
> > H/W counter can be used to count the samples.
> >
> > A light-load application will not generate events: the CPU does not spend
> > much of its time on it, and hence samples might not be generated.
> >
> >
> > Regards
> > koteswararao
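
For anyone trying to reproduce this, the 'nice' and 'bad' reports in this thread differ mainly in the total sample count and in whether the profiled binary shows up at all. A small helper along the following lines might make it easier to spot the run at which the results turn bad when repeating the command sequence in a loop. This is only a sketch: the function name summarize_opreport is hypothetical (it is not part of oprofile), and it assumes the data rows of `opreport -l` have the layout shown in this thread (sample count, percentage, image name, symbol name).

```shell
#!/bin/sh
# Hypothetical helper, not part of oprofile.
# Reads `opreport -l` output on stdin and prints "<total> <in_image>":
#   total    = sum of the sample counts over all data rows
#   in_image = sum of the sample counts whose image name equals $1
# Assumes the row layout seen in this thread:
#   samples  %  image name  symbol name
summarize_opreport() {
    image="$1"
    awk -v image="$image" '
        # Data rows start with an integer count followed by a percentage;
        # header, CPU and warning lines do not match and are skipped.
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9.]+$/ {
            total += $1
            if ($3 == image) mine += $1
        }
        END { printf "%d %d\n", total, mine }
    '
}

# Possible usage after `opcontrol --dump` (commented out so that sourcing
# this file has no side effects):
#   opreport -l array | summarize_opreport array
```

With the reports quoted in this thread, a result whose second number is 0 would correspond to the 'bad' state, where only no-vmlinux and libc samples were collected. The helper only detects the state; it does not explain or fix it.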