From: Sabra G. <sab...@ya...> - 2014-09-02 18:23:14
|
Hi,

I would like to profile my system while running my own application, so for that purpose I used both OProfile and the perf tool. I ran OProfile and perf separately with the same use case (same application and same duration).

When using OProfile, I configured opcontrol with --event=CPU_CYCLES:100000:0:1:1, and for perf I used -c 100000, to be sure that both tools sample with the same event count.

When generating results for each tool, I noticed a difference in the percentages. Below are the results generated with perf:

# Events: 362K cycles
#
# Overhead  Samples  Command
# ........  .......  ...............
#
    31.40%   113991  kthreadd
    31.12%   112975  swapper
    11.96%    43418  dvbtest
     7.31%    26539  SE-Aud-Mixer
     4.34%    15750  kworker/0:1
     3.33%    12070  klogd
     3.31%    12033  syslogd
     2.57%     9327  irq/140-vsync0
     1.12%     4058  perf
     0.90%     3258  kworker/1:1
     0.55%     1990  irq/141-vsync1
     0.39%     1414  INF-EvtAsyncCb
     0.35%     1270  rcu_preempt
     0.31%     1108  VIB-Hpd/0
     0.21%      768  ksoftirqd/0
     0.19%      688  sshd
     0.14%      498  udevd
     0.11%      416  pkill
     0.11%      403  kworker/u4:0
     0.04%      138  INP-FE-IP1
     0.04%      136  INP-FE-IP0
     0.03%      127  INP-FE-IP5
     0.03%      121  INP-FE-IP2
     0.03%      118  INP-FE-IP3
     0.03%      107  ICS-Watchdog
     0.03%      106  INP-FE-IP4
     0.03%      101  VIB-HdmiRxMonit
     0.01%       40  init
     0.00%       11  ksoftirqd/1
     0.00%        8  dtach
     0.00%        6  rpcbind
     0.00%        2  ICS-Nsrv
     0.00%        2  :36
     0.00%        1  ICS-Admin

And here are the results generated with OProfile:

CPU_CYCLES:100000|
  samples|      %|
------------------
  1492238 76.8928 vmlinux
   230774 11.8914 libc-2.14.1.so
   107878  5.5588 dvbtest
    55578  2.8639 player2
    16198  0.8347 oprofiled
    15934  0.8211 mme
     7958  0.4101 stmcore_display_stiH407
     3226  0.1662 ics
     1975  0.1018 libpthread-2.14.1.so
     1547  0.0797 sth264pp
     1155  0.0595 ksound
     1107  0.0570 libcrypto.so.1.0.0
      899  0.0463 stm_event
      832  0.0429 syslogd
      466  0.0240 klogd
      445  0.0229 ld-2.14.1.so
      428  0.0221 strelayfs
      416  0.0214 bash
      378  0.0195 stm_wrapper
      238  0.0123 stlinuxtv
      206  0.0106 stm_registry
      182  0.0094 stm_pixel_capture
      140  0.0072 displaylink
      131  0.0068 stm_memsrcsink
      101  0.0052 stm_fe_ip
       52  0.0027 sshd
       37  0.0019 vibe_os
       24  0.0012 hdmirx_stiH407
       22  0.0011 libprocps.so.1.1.0
       22  0.0011 osdev_abs
       20  0.0010 ophelp
       15 7.7e-04 gawk
       15 7.7e-04 udevd
        6 3.1e-04 libnss_files-2.14.1.so
        4 2.1e-04 pkill
        3 1.5e-04 grep
        3 1.5e-04 sleep
        3 1.5e-04 libm-2.14.1.so
        3 1.5e-04 init
        2 1.0e-04 libgcc_s-4.8.2.so.1
        2 1.0e-04 libresolv-2.14.1.so
        2 1.0e-04 libpopt.so.0.0.0
        2 1.0e-04 libtirpc.so.1.0.10
        1 5.2e-05 rm
        1 5.2e-05 seq
        1 5.2e-05 libbfd-2.23.2.so
        1 5.2e-05 libgmp.so.10.1.3
        1 5.2e-05 libwrap.so.0.7.6

My focus is on the "dvbtest" application. As you can see in the results, OProfile shows 5.5588% against 11.96% with perf. Why are the results so different?

PS: I used OProfile version 0.9.7 because it is the version available in the environment where I have to test, and I could not use a later version.

Regards,
|
From: Maucci, C. <cyr...@hp...> - 2014-09-08 17:49:19
|
Hi,

One generally wants to look at an application's performance when the system is not swapping. In your case, your system seems to be suffering from swapping. Fix that issue first, then look at your application's performance, and tune it if needed. As long as you suffer from swapping, don’t waste your time trying to improve performance by saving CPU cycles in your app.

My few cents
++Cyrille

From: Sabra Gargouri [mailto:sab...@ya...]
Sent: Tuesday, September 02, 2014 8:41 PM
To: opr...@li...
Subject: Difference between Oprofile results and Perf results running the same use case
|
From: Maynard J. <may...@us...> - 2014-09-08 18:33:55
|
On 09/02/2014 01:40 PM, Sabra Gargouri wrote:
> My focus is on "dvbtest" application. As you see in the results, Oprofile shows 5.558% against 11.96% with Perf.
> Why results are so different?

I note that you are collecting system-wide profiles with both perf and opcontrol. It's quite meaningless to compare the percentage of samples for a given application to the rest of the running system unless you have a very rigorous method for controlling what is running on that system -- even when comparing two profiles taken with the same profiler.

However, comparing sample counts from two different profilers for a given app should, theoretically, provide comparable results. But there are caveats. The perf tool (actually, the kernel) may silently throttle back the sampling rate if it deems the overhead too high -- and sampling every 100000 cycles is a pretty high rate.

Another issue that makes the comparison invalid in your case is that you are apparently running opcontrol without any separation parameters (i.e., kernel,lib), so you may have samples from the kernel and/or libc (or other libs) that should properly be attributed to dvbtest, but are not.

I don't know what your intent is in doing this comparison, but if your main concern is how best to profile your application, you should use the perf tool and *don't* use the system-wide "--all-cpus" mode. OProfile introduced a new profiling tool called 'operf' with release 0.9.8 which, like perf, allows profiling of a single app (vs. the system-wide opcontrol), but since you say you're stuck with oprofile 0.9.7 (why can't you build/install a newer oprofile?), perf is your best option.

-Maynard
|
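As a rough illustration of the advice to compare sample counts rather than percentages, the Python sketch below back-computes the total number of samples implied by the dvbtest row of each report posted in this thread. It is not part of the original exchange; the function and variable names are made up for the example, and only the four figures copied from the two reports come from the thread.

```python
# Figures copied from the two reports posted in this thread.
# perf groups samples per command; opreport (without separation)
# groups them per binary image, so the two rows are not the same thing.
PERF_DVBTEST = (43418, 11.96)        # (samples, percent of all perf samples)
OPROFILE_DVBTEST = (107878, 5.5588)  # (samples, percent of all oprofile samples)


def implied_total(samples, percent):
    """Total sample count implied by one (samples, percent) report row."""
    return samples / (percent / 100.0)


perf_total = implied_total(*PERF_DVBTEST)
oprofile_total = implied_total(*OPROFILE_DVBTEST)

# Ratio of the totals, and ratio of the dvbtest sample counts alone.
total_ratio = oprofile_total / perf_total
dvbtest_ratio = OPROFILE_DVBTEST[0] / PERF_DVBTEST[0]

print(f"perf total samples (implied):     {perf_total:,.0f}")
print(f"oprofile total samples (implied): {oprofile_total:,.0f}")
print(f"oprofile/perf total ratio:        {total_ratio:.2f}")
print(f"oprofile/perf dvbtest ratio:      {dvbtest_ratio:.2f}")
```

With the posted figures, the two runs turn out to have recorded very different numbers of samples overall, and the dvbtest counts differ by a different factor than the totals do. The arithmetic alone does not say why; throttling and the missing separation settings mentioned above are candidate explanations, not conclusions the numbers prove.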