From: Maynard J. <may...@us...> - 2014-08-15 15:54:39
|
We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at:
http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/

Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply.

Thanks.
-Maynard Johnson

-----------------------------------------------------------------

Release Notes
===============
OProfile 1.0.0 Release Candidate 1 has been released. A major change in
this release is the removal of the legacy opcontrol-based profiler.
This legacy profiling tool has been deprecated since release 0.9.8 when
operf was first introduced. The following components and processor
types that were dependent on opcontrol have also been removed:

- GUI component (i.e., oprof_start)
- IBS events removed from AMD processors
- All Alpha processors, except for EV67 (which *is* supported by operf/ocount)
- Architecture avr32
- Architecture ia64
- Processor model IBM Cell
- Processor model P.A. Semi PA6T
- RTC (real time clock mode)

OProfile users still running on any of these affected systems or
needing any of the removed components listed above should not upgrade
to OProfile release 1.0. Alternatively, you can obtain all of the new
features, enhancements, and bug fixes described below and still have
access to opcontrol, by doing the following:

    git clone git://git.code.sf.net/p/oprofile/oprofile oprofile
    cd oprofile
    git checkout PRE_RELEASE_1_0

and then build/install as usual.

More information about OProfile can be seen at
http://oprofile.sf.net


Incompatibilities with previous release
---------------------------------------

- Sample data collected with previous releases of OProfile are incompatible
  with release 1.0.
- ophelp schema: Major version changed for removal of unit mask 'extra'
  attribute and addition of unit mask 'name'.


New features
------------

- Enhance ocount to support millisecond time intervals
- Obtain kernel symbols from /proc/kallsyms if no vmlinux file specified

- New/updated Processor Support
  * (New) Freescale e6500
  * (New) Freescale e500mc
  * (New) Intel Silvermont
  * (New) ARM ARMv7 Krait
  * (New) ARM ARMv8 (AArch64)
  * (New) Intel Broadwell
  * (New) ARM Cortex A57
  * (New) ARM Cortex A53
  * Added little endian support for IBM POWER8
  * Update events for IBM POWER8
  * Added edge-detect events for IBM POWER7
  * Update events for Intel Haswell


Bug fixes
---------

Filed bug reports:
-------------------------------------------------------------------------
| BUG ID    | Summary
|-----------|------------------------------------------------------------
|  236      | opreport schema: Fix count field maxOccurs (changed to
|           | 'unbounded')
|  245      | Fix compile error on ppc/uClibc platform: 'AT_BASE_PLATFORM'
|           | undeclared
|  248      | Duplicate event specs passed to ocount show up twice in
|           | output
|  252      | Fix operf/ocount default unit mask selection
|  253      | ocount: print the unit mask, kernel and user modes if
|           | specified for the event
|  254      | ophelp schema is not included in installed files
|  255      | Remove unused 'extra' attribute from ophelp schema
|  256      | opreport from 'operf --callgraph' profile shows false
|           | recursive calls
|  257      | Fix handling of default named unit masks longer than 11 chars
|  259      | Print unit mask name where applicable in ophelp XML output
|  260      | Fix profiling of multi-threaded apps when using "--pid"
|           | option
|  262      | Fix operf/opreport kernel throttling detection
|  263      | Fix sample attribution problem when using multiple events
|  266      | exclude/include files option doesn't work for opannotate -a
-------------------------------------------------------------------------

Other bug fixes and improvements without a filed report (e.g., posted to the list):
---------------
- Update Alpha EV67 CPU support and remove all other Alpha CPU support
- operf main process improperly killing conversion process
- Fix up S390 support to work with operf/ocount
- Link ocount with librt for clock_gettime only when needed
- Fix 'Invalid argument' running 'opcontrol --start --callgraph=<n>' in
  Timer mode
- Make sure hypervisor is excluded from ocount and operf
- operf log may over-report "sample address not in expected range for
  domain"
- Allow root to remove old jitdump files from /tmp/.oprofile/jitdump
- Remove opreport warnings for /no-vmlinux, [vdso], [hypervisor_bucket]
  not found
- Fix event codes for marked architected events (IBM ppc64)
- Make operf/ocount detect invalid timer mode from opcontrol
- Reduce overhead of operf waiting for profiled app to end
- Fix "Unable to open cpu_type file for reading" for IBM POWER7+
- Allow all native events for IBM POWER8 in POWER7 compat mode
- Fix spurious "backtraces skipped due to no file mapping" log entries
- Fix the units for the reported CPU frequency


Known problems and limitations
------------------------------
- When using operf to profile multiple events, the absolute number of
  events recorded may be substantially fewer than expected. This can be
  due to a known bug in the Linux kernel's Performance Events Subsystem that
  was fixed sometime between Linux kernel versions 3.1 and 3.5.
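The release notes mention that kernel symbols can be obtained from /proc/kallsyms when no vmlinux file is specified. As a rough illustration of what that file provides, the sketch below parses a single line in the usual "address type name [module]" layout. This is not OProfile's implementation; `KernelSymbol` and `parse_kallsyms_line` are hypothetical names made up for this example.

```cpp
#include <cctype>
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>

// Hypothetical record for one /proc/kallsyms entry (illustration only).
struct KernelSymbol {
    std::uint64_t address;  // start address of the symbol
    char type;              // nm-style type letter, e.g. 'T' or 't' for text
    std::string name;       // symbol name
    std::string module;     // owning module, empty for the core kernel
};

// Parse one line of the form "<hex address> <type> <name> [module]".
// Returns false (and leaves `out` untouched) if the line does not match.
bool parse_kallsyms_line(const std::string& line, KernelSymbol& out) {
    std::istringstream in(line);
    std::string addr_text;
    KernelSymbol sym{};
    if (!(in >> addr_text >> sym.type >> sym.name))
        return false;
    // Require the field to start with a hex digit: stoull alone would
    // also accept a leading sign.
    if (!std::isxdigit(static_cast<unsigned char>(addr_text[0])))
        return false;
    try {
        std::size_t used = 0;
        sym.address = std::stoull(addr_text, &used, 16);
        if (used != addr_text.size())
            return false;  // trailing non-hex characters
    } catch (const std::exception&) {
        return false;      // not a number, or out of range
    }
    // The module column is optional and written as "[name]".
    std::string mod;
    if (in >> mod && mod.size() > 2 && mod.front() == '[' && mod.back() == ']')
        sym.module = mod.substr(1, mod.size() - 2);
    out = sym;
    return true;
}
```

A caller would typically read /proc/kallsyms line by line with `std::getline` and keep the entries whose type is 'T' or 't'. Note that unprivileged readers may see all-zero addresses, depending on the kernel's pointer-restriction settings.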
From: Maucci, C. <cyr...@hp...> - 2014-08-15 20:08:37
|
Hello Maynard,

So maybe a dumb remark. I've got a RHEL 6.4 system on which I've upgraded the kernel to 3.15.9. The perf tools work fine there.
When I launch the recompiled operf below, it tells me:

[root@nv34 oprofile-1.0.0-rc1]# operf
Your kernel's Performance Events Subsystem does not support your processor type.

Why would perf not complain about that when operf does?

Thanks a lot
++Cyrille

-----Original Message-----
From: Maynard Johnson [mailto:may...@us...]
Sent: Friday, August 15, 2014 5:54 PM
To: oprofile-list
Subject: Announcement: Release Candidate 1 for OProfile 1.0.0
From: William C. <wc...@re...> - 2014-08-18 20:45:49
|
On 08/15/2014 11:54 AM, Maynard Johnson wrote:
> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at:
> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/
>
> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply.
>
> Thanks.
> -Maynard Johnson

Hi,

Anyone feeling adventurous on Fedora 20 can give the following koji build of oprofile-1.0.0-rc1 a try:

http://koji.fedoraproject.org/koji/taskinfo?taskID=7413021

I also ran coverity on the same source rpm and the results looked good:

Error: CHECKED_RETURN:
/builddir/build/BUILD/oprofile-1.0.0-rc1/libperf_events/operf_counter.cpp:66: check_return: "read(int, void *, size_t)" returns the number of bytes read, but it is ignored.
/builddir/build/BUILD/oprofile-1.0.0-rc1/libperf_events/operf_counter.cpp:66: cond_true: Condition "read(fd, &msg, 4UL /* sizeof (msg) */) > 0", taking true branch

Error: DEADCODE:
/builddir/build/BUILD/oprofile-1.0.0-rc1/libpp/image_errors.cpp:68: dead_error_condition: The switch value "error" cannot be "0".
/builddir/build/BUILD/oprofile-1.0.0-rc1/libpp/image_errors.cpp:68: dead_error_begin: Execution cannot reach this statement "case 0:".

Error: STREAM_FORMAT_STATE:
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:236: cond_false: Condition "!options::show_header", taking false branch
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:237: if_end: End of if statement
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:239: cond_true: Condition "indent", taking true branch
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:244: cond_true: Condition "i < classes.v.size()", taking true branch
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:246: cond_true: Condition "name.length() > colwidth", taking true branch
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:251: format_changed: "setf" changes the format state of "std::cout" for category adjustfield.
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:255: loop: Jumping back to the beginning of the loop
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:244: loop_begin: Jumped back to beginning of loop
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:244: cond_false: Condition "i < classes.v.size()", taking false branch
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:255: loop_end: Reached end of loop
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:258: cond_true: Condition "indent", taking true branch
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:261: cond_false: Condition "i < classes.v.size()", taking false branch
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:267: loop_end: Reached end of loop
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:271: cond_true: Condition "indent", taking true branch
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:274: cond_false: Condition "i < classes.v.size()", taking false branch
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:278: loop_end: Reached end of loop
/builddir/build/BUILD/oprofile-1.0.0-rc1/pp/opreport.cpp:281: end_of_path: Changing format state of stream "std::cout" for category adjustfield without later restoring it.

-Will
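The STREAM_FORMAT_STATE finding in the Coverity output is about "setf" changing the adjustfield state of "std::cout" without restoring it afterwards. One common way to address this class of warning is a small RAII guard that saves the stream's format flags and puts them back on scope exit. The sketch below is a generic illustration under that assumption, not a patch against opreport.cpp; `FormatGuard` and `print_left_aligned` are hypothetical names.

```cpp
#include <iomanip>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Saves a stream's format flags on construction and restores them on
// destruction, so callers get the stream back in the state they left it.
class FormatGuard {
public:
    explicit FormatGuard(std::ios_base& stream)
        : stream_(stream), saved_(stream.flags()) {}
    ~FormatGuard() { stream_.flags(saved_); }
    FormatGuard(const FormatGuard&) = delete;
    FormatGuard& operator=(const FormatGuard&) = delete;

private:
    std::ios_base& stream_;
    std::ios_base::fmtflags saved_;
};

// Hypothetical helper: prints `name` left-aligned in a column of `width`.
void print_left_aligned(std::ostream& out, const std::string& name, int width) {
    FormatGuard guard(out);
    out.setf(std::ios::left, std::ios::adjustfield);
    out << std::setw(width) << name;
}  // guard's destructor restores the previous adjustfield setting here
```

The guard only covers the format flags; precision and fill character would need to be saved separately if the code changes them.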
From: Maynard J. <may...@us...> - 2014-08-26 15:56:34
|
On 08/15/2014 10:54 AM, Maynard Johnson wrote:
> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at:
> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/
>
> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply.

Fellow oprofile architecture maintainers,
Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the
Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into
the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an
RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by
the end of the week. Thanks!

-Maynard
From: William C. <wc...@re...> - 2014-08-28 16:11:46
|
On 08/26/2014 11:56 AM, Maynard Johnson wrote:
> Fellow oprofile architecture maintainers,
> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the
> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into
> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an
> RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by
> the end of the week. Thanks!
>
> -Maynard

Hi Maynard,

I built an rpm using the latest git repo tarball and have been testing it on a number of different platforms. There are a number of XML failures in the test results (footnotes (1) and (2)). Has anyone else observed that? Or is this a local build issue?

I also had some issues with the tests running on ARM Cortex-A9. The perf utility works on the machine, but operf and ocount do not. This seems to be specific to the Cortex-A9 machine; things work on a Cortex-A15 machine.

# operf --events INST_RETIRED:500000:0:1:1,CPU_CYCLES:500000:0:1:1, workloads/thread_src/thread_bin
perf_event_open failed with Operation not supported
Caught runtime_error: Internal Error. Perf event setup failed.
Error running profiler

Below is a list of the machines I ran tests on and notes about the test results.

-Will

Fedora20
  i386/ivybridge
    oprofile-operf tests failures (1)
    oprofile-single_process failure (1)
  x86_64/family10
    no failures seen unlike the i386/ivybridge
  arm/ca15
    oprofile-operf tests failures (1)
    oprofile-single_process failure (1)
  arm/ca9
    ocount-cycle-check-run.exp failed because of CPU freq not reported
    ERROR: tcl error sourcing ./oprofile-operf/oprofile-operf-run.exp.
    perf does work on the machine, but for some reason operf was having
    problems setting up the hardware (see the operf error above)

rawhide in kvm guest copying host cpuid, host running fedora 20
  i386/ivybridge
    ocount-cycle-check-run.exp failed because of Turbo mode
    oprofile-operf tests failures (1)
    oprofile-single_process failure (1)

rhel7
  i386/ivybridge
    oprofile-operf tests failures (1)
    oprofile-single_process failure (1)
  arm/ca57
    ocount-cycle-check-run.exp failed because of CPU freq not reported
    oprofile-operf tests failures (1)
    oprofile-single_process failure (1)
  arm/xgene
    ocount-cycle-check-run.exp failed because of CPU freq not reported
    oprofile-operf tests failures (1)
    oprofile-single_process failure (1)

rhel6
  i386/westmere
    ocount-cycle-check-run.exp failed because of Turbo mode
    oprofile-operf tests failures (2)
    oprofile-single_process failure (2)

(1)
Running ./oprofile-operf/oprofile-operf-run.exp ...
FAIL: XML opreport output with callgraph option=1 is invalid
warning: failed to load external entity "/share/doc/oprofile/opreport.xsd"
Schemas parser error : Failed to locate the main schema resource at '//share/doc/oprofile/opreport.xsd'.

(2)
Running ./oprofile-operf/oprofile-operf-run.exp ...
FAIL: XML opreport output with callgraph option=1 is invalid
out.xml:19: element count: Schemas validity error : Element 'count': This element is not expected. Expected is one of ( symbol, module ).
out.xml fails to validate
From: Maynard J. <may...@us...> - 2014-08-28 17:22:32
|
On 08/28/2014 11:11 AM, William Cohen wrote:
> On 08/26/2014 11:56 AM, Maynard Johnson wrote:
>> On 08/15/2014 10:54 AM, Maynard Johnson wrote:
>>> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at:
>>> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/
>>>
>>> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply.
>>
>> Fellow oprofile architecture maintainers,
>> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI: due to the
>> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into
>> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an
>> RC2; otherwise I'll make it GA. Either way, I'd like to put out the next tar file by
>> the end of the week. Thanks!
>>
>> -Maynard
>>>
>>> Thanks.
>>> -Maynard Johnson
>
> Hi Maynard,
>
> I built an rpm using the latest git repo tarball and have been testing it on a number of different platforms.
>
> There are a number of XML failures in the test results (footnotes (1) and (2)). Has anyone else observed that? Or is this a local build issue?
>
> I also had some issues with the tests running on arm cortex-a9. The perf utility works on the machine, but operf and ocount do not. This seems to be specific to the cortex a9 machine; things work on a cortex a15 machine.
>
> # operf --events INST_RETIRED:500000:0:1:1,CPU_CYCLES:500000:0:1:1, workloads/thread_src/thread_bin
> perf_event_open failed with Operation not supported
> Caught runtime_error: Internal Error. Perf event setup failed.
> Error running profiler
>
> Below is a list of the machines I ran tests on and notes about the test results.

Will,
Many thanks for all the testing you've done! I will defer to Will Deacon on the problem with running operf/ocount on cortex-a9, but I have answers for the XML-related test failures for you. See below.

>
> -Will
>
> (1)
> Running ./oprofile-operf/oprofile-operf-run.exp ...
> FAIL: XML opreport output with callgraph option=1 is invalid
> warning: failed to load external entity "/share/doc/oprofile/opreport.xsd"
> Schemas parser error : Failed to locate the main schema resource at '//share/doc/oprofile/opreport.xsd'.

Coincidentally, I recently had this error reported to me by one of my IBM colleagues when testing RC1 on Ubuntu. It turned out that, for some reason, the deb package had opreport.xsd installed as a gzip-compressed file (opreport.xsd.gz) instead of plain text. Probably something like that is happening here, too -- or the file is simply missing from the install.

>
> (2)
> Running ./oprofile-operf/oprofile-operf-run.exp ...
> FAIL: XML opreport output with callgraph option=1 is invalid
> out.xml:19: element count: Schemas validity error : Element 'count': This element is not expected. Expected is one of ( symbol, module ).
> out.xml fails to validate

This is exactly the error you'd see if a new XML instance doc were being validated against an old schema (e.g., from 0.9.9). There was a fix to the cardinality of the 'count' field in the opreport schema (bug 236) for release 1.0.0:

    opreport schema: Fix count field maxOccurs (changed to 'unbounded')

Somehow, the testsuite is finding an old opreport schema instead of the new 1.0.0 schema file.

-Maynard

>
>>>
>>> -----------------------------------------------------------------
>>>
>>> Release Notes
>>> ===============
>>> OProfile 1.0.0 Release Candidate 1 has been released. A major change in
>>> this release is the removal of the legacy opcontrol-based profiler.
>>> This legacy profiling tool has been deprecated since release 0.9.8 when
>>> operf was first introduced.
|
From: William C. <wc...@re...> - 2014-08-29 16:38:14
|
On 08/28/2014 01:22 PM, Maynard Johnson wrote:
> On 08/28/2014 11:11 AM, William Cohen wrote:

> Will,
> Many thanks for all the testing you've done! I will defer to Will Deacon on the problem with running operf/ocount on cortex-a9, but I have answers for the XML-related test failures for you. See below.
>>
>> -Will
>>
>> (1)
>> Running ./oprofile-operf/oprofile-operf-run.exp ...
>> FAIL: XML opreport output with callgraph option=1 is invalid
>> warning: failed to load external entity "/share/doc/oprofile/opreport.xsd"
>> Schemas parser error : Failed to locate the main schema resource at '//share/doc/oprofile/opreport.xsd'.
> Coincidentally, I recently had this error reported to me by one of my IBM colleagues when testing RC1 on Ubuntu. It turned out that, for some reason, the deb package had opreport.xsd installed as a gzip-compressed file (opreport.xsd.gz) instead of plain text. Probably something like that is happening here, too -- or the file is simply missing from the install.

There does seem to be an opreport.xsd installed by the rpm, but it looks like something is looking in the wrong place, /share/doc/oprofile/opreport.xsd:

$ rpm -qs oprofile|grep xsd
normal        /usr/share/doc/oprofile/ophelp.xsd
normal        /usr/share/doc/oprofile/opreport.xsd

The weird thing is that it works on one f20 x86_64 machine but not on another.

>>
>> (2)
>> Running ./oprofile-operf/oprofile-operf-run.exp ...
>> FAIL: XML opreport output with callgraph option=1 is invalid
>> out.xml:19: element count: Schemas validity error : Element 'count': This element is not expected. Expected is one of ( symbol, module ).
>> out.xml fails to validate
> This is exactly the error you'd see if a new XML instance doc were being validated against an old schema (e.g., from 0.9.9). There was a fix to the cardinality of the 'count' field in the opreport schema (bug 236) for release 1.0.0:
>     opreport schema: Fix count field maxOccurs (changed to 'unbounded')
> Somehow, the testsuite is finding an old opreport schema instead of the new 1.0.0 schema file.

It might be using something in /usr/local/. I went through and removed the various oprofile files in /usr/local/. I then found that there was /usr/share/doc/oprofile and /usr/share/oprofile/doc/oprofile-1.0.0git. I copied the newer files in /usr/share/doc/oprofile-1.0.0git into /usr/share/doc/oprofile. The tests are now happy on that machine.

-Will
|
From: William C. <wc...@re...> - 2014-08-29 18:36:02
|
On 08/29/2014 12:37 PM, William Cohen wrote:
> On 08/28/2014 01:22 PM, Maynard Johnson wrote:
>> On 08/28/2014 11:11 AM, William Cohen wrote:
>
>> Will,
>> Many thanks for all the testing you've done! I will defer to Will Deacon on the problem with running operf/ocount on cortex-a9, but I have answers for the XML-related test failures for you. See below.
>>>
>>> -Will
>>>
>>> (1)
>>> Running ./oprofile-operf/oprofile-operf-run.exp ...
>>> FAIL: XML opreport output with callgraph option=1 is invalid
>>> warning: failed to load external entity "/share/doc/oprofile/opreport.xsd"
>>> Schemas parser error : Failed to locate the main schema resource at '//share/doc/oprofile/opreport.xsd'.
>> Coincidentally, I recently had this error reported to me by one of my IBM colleagues when testing RC1 on Ubuntu. It turned out that, for some reason, the deb package had opreport.xsd installed as a gzip-compressed file (opreport.xsd.gz) instead of plain text. Probably something like that is happening here, too -- or the file is simply missing from the install.
>
> There does seem to be an opreport.xsd installed by the rpm, but it looks like something is looking in the wrong place, /share/doc/oprofile/opreport.xsd:
>
> $ rpm -qs oprofile|grep xsd
> normal        /usr/share/doc/oprofile/ophelp.xsd
> normal        /usr/share/doc/oprofile/opreport.xsd
>
> The weird thing is that it works on one f20 x86_64 machine but not on another.

I did some additional poking around and found why it was working on one machine but not the other. testsuite/lib/verify.exp uses the following snippet in validate_xml_report to determine where to look for opreport.xsd:

    set binpath [lindex [local_exec "which operf" "" "" 10] 1]
    set idx [string last "/bin" $binpath ]
    set op_install_dir [string range $binpath 0 $idx ]
    append schema_file $op_install_dir "/share/doc/oprofile/opreport.xsd"
    set opreport_result [local_exec "opreport --debug-info --symbols $cg_option --long-filenames --xml -o out.xml" "" "" 100 ]
    set xmllint_result [local_exec "xmllint --noout --schema $schema_file out.xml" "" "" 10 ]

The oprofile rpm installs operf in /usr/bin/operf. However, Fedora has the following symbolic link, so /usr/bin/operf is also visible as /bin/operf:

$ ls -ld /bin
lrwxrwxrwx. 1 root root 7 Apr 17 14:23 /bin -> usr/bin

On the machine that works, the $PATH is:

# echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin

On the machines with problems, the $PATH is:

# echo $PATH
/usr/lib64/ccache:/sbin:/bin:/usr/sbin:/usr/bin

For some reason, logging in directly as root gave the first $PATH, which worked, while "sudo bash" resulted in the second $PATH. Thus, with the second $PATH the test finds /bin/operf before /usr/bin/operf. For /bin/operf, "string last" returns 0, so op_install_dir is just "/" and schema_file becomes //share/doc/oprofile/opreport.xsd, the path in the error message. I guess I will just need to use "su" rather than "sudo bash" or "su -".

-Will
|
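The path derivation described in the message above can be reproduced outside the testsuite. What follows is only a minimal Python sketch, not the testsuite code: derive_schema_path is a hypothetical helper that mirrors the quoted Tcl "string last" / "string range" logic, on the assumption that "which operf" returns either /usr/bin/operf or /bin/operf as in the two $PATH settings shown.

```python
def derive_schema_path(binpath: str) -> str:
    """Hypothetical Python port of the schema-path logic quoted from verify.exp.

    Tcl original:
        set idx [string last "/bin" $binpath]
        set op_install_dir [string range $binpath 0 $idx]
    Tcl's `string range` is inclusive at both ends, so the result keeps the
    character at idx, i.e. the "/" that starts "/bin".
    """
    idx = binpath.rfind("/bin")          # -1 if "/bin" is absent, as in Tcl
    op_install_dir = binpath[: idx + 1]  # inclusive end, like `string range`
    return op_install_dir + "/share/doc/oprofile/opreport.xsd"


if __name__ == "__main__":
    # The two locations at which operf is visible on the Fedora machines.
    for path in ("/usr/bin/operf", "/bin/operf"):
        print(path, "->", derive_schema_path(path))
```

For /usr/bin/operf the sketch yields a path under /usr (with a harmless doubled slash); for /bin/operf it yields the //share/doc/oprofile/opreport.xsd path reported in failure (1).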
From: Will D. <wil...@ar...> - 2014-08-28 18:07:53
|
On Thu, Aug 28, 2014 at 06:22:19PM +0100, Maynard Johnson wrote: > On 08/28/2014 11:11 AM, William Cohen wrote: > > On 08/26/2014 11:56 AM, Maynard Johnson wrote: > >> On 08/15/2014 10:54 AM, Maynard Johnson wrote: > >>> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at: > >>> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/ > >>> > >>> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply. > >> > >> Fellow oprofile architecture maintainers, > >> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the > >> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into > >> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an > >> RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by > >> the end of the week. Thanks! > >> > >> -Maynard > >>> > >>> Thanks. > >>> -Maynard Johnson > > > > Hi Maynard, > > > > I built a rpm using the latest git repo tarball and have been testing that out on a number of different platforms. > > > > There are a number of xml failures in the test results (footnotes (1) and (2)). Have anyone else observed that? Or is this a local build issue? > > > > I also had some issues with the tests running on arm cortex-a9. The perf utility works on the machine but operf and ocount are not. This seems to be specific to the cortex a9 machine things work on a cortex a15 machine. > > > > # operf --events INST_RETIRED:500000:0:1:1,CPU_CYCLES:500000:0:1:1, workloads/thread_src/thread_bin > > perf_event_open failed with Operation not supported > > Caught runtime_error: Internal Error. Perf event setup failed. > > Error running profiler > > > > Below is a list of machine ran tests on and notes about the test results. 
> Will,
> Many thanks for all the testing you've done! I will defer to Will Deacon
> on the problem with running operf/ocount on cortex-a9, but I have answers
> for the XML-related test failures for you. See below.

Thanks for giving this a whirl. Which was the A9 SoC you used? The PMU
interrupts are often not described properly on those, so it could simply be
that perf hasn't initialised.

Will
|
From: William C. <wc...@re...> - 2014-08-28 18:24:38
|
On 08/28/2014 02:07 PM, Will Deacon wrote:
> On Thu, Aug 28, 2014 at 06:22:19PM +0100, Maynard Johnson wrote:
>> On 08/28/2014 11:11 AM, William Cohen wrote:
>>> Below is a list of the machines I ran tests on and notes about the test results.
>> Will,
>> Many thanks for all the testing you've done! I will defer to Will Deacon
>> on the problem with running operf/ocount on cortex-a9, but I have answers
>> for the XML-related test failures for you. See below.
>
> Thanks for giving this a whirl. Which was the A9 SoC you used? The PMU
> interrupts are often not described properly on those, so it could simply be
> that perf hasn't initialised.
>
> Will
>

Hi Will,

The working cortex a15 machine is using a locally built 3.15.10 kernel running fedora 20. Here are some details on the fedora 20 arm cortex a9 machine:

[wcohen@localhost ~]$ uname -a
Linux localhost 3.15.10-200.fc20.armv7hl #1 SMP Thu Aug 14 16:37:46 UTC 2014 armv7l armv7l armv7l GNU/Linux
[wcohen@localhost ~]$ more /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 0 (v7l)
Features        : swp half thumb fastmult vfp edsp thumbee vfpv3 vfpv3d16 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x1
CPU part        : 0xc09
CPU revision    : 0

processor       : 1
model name      : ARMv7 Processor rev 0 (v7l)
Features        : swp half thumb fastmult vfp edsp thumbee vfpv3 vfpv3d16 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x1
CPU part        : 0xc09
CPU revision    : 0

Hardware        : NVIDIA Tegra SoC (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000

The perf utils seem to work on the machine:

$ perf record ls
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data (~502 samples) ]
$ perf stat ls
...
 Performance counter stats for 'ls':

          6.077000      task-clock (msec)         #    0.401 CPUs utilized
                15      context-switches          #    0.002 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                96      page-faults               #    0.016 M/sec
         6,003,324      cycles                    #    0.988 GHz
         1,085,235      stalled-cycles-frontend   #   18.08% frontend cycles idle
         4,600,125      stalled-cycles-backend    #   76.63% backend cycles idle
         2,117,086      instructions              #    0.35  insns per cycle
                                                  #    2.17  stalled cycles per insn
           220,666      branches                  #   36.312 M/sec
            94,593      branch-misses             #   42.87% of all branches

       0.015141693 seconds time elapsed

However, the oprofile commands do not:

$ operf ls
perf_event_open failed with Operation not supported
Caught runtime_error: Internal Error. Perf event setup failed.
Error running profiler
$ ocount ls
perf_event_open failed with Operation not supported
Caught runtime error while setting up counters
Internal Error. Perf event setup failed.
Error running ocount

-Will
|
From: William C. <wc...@re...> - 2014-08-28 20:40:02
|
On 08/28/2014 04:16 PM, Arnaldo Carvalho de Melo wrote:
> On Thu, Aug 28, 2014 at 03:54:58PM -0400, William Cohen wrote:
>> On 08/28/2014 02:59 PM, Will Deacon wrote:
>>> On Thu, Aug 28, 2014 at 07:24:07PM +0100, William Cohen wrote:
>>>> On 08/28/2014 02:07 PM, Will Deacon wrote:
>>>>> Thanks for giving this a whirl. Which was the A9 SoC you used? The PMU
>>>>> interrupts are often not described properly on those, so it could simply be
>>>>> that perf hasn't initialised.
>
>> Hi Will,
>
>> Yes, there are cases where the dtb doesn't describe the performance
>> monitoring hardware/interrupts properly, but in this case (a compulab
>> trimslice with an nvidia tegra2 processor) "perf record ls" appears
>> to work correctly and "perf report" provides sane data.
>
>> Below is before and after of /proc/interrupts of the "perf record". I
>> suspect that there is some difference in the way that operf and ocount
>> are trying to set up the events when compared to "perf record" and
>> "perf stat".
>
>> I should use systemtap to look at the various parameters being passed
>> in to set up perf and determine where "perf" and "operf" diverge.
>
> You can try using perf evlist to see how 'perf record' sets up the event
> attributes:
>
> [root@zoo ~]# perf record usleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.012 MB perf.data (~525 samples) ]
> [root@zoo ~]# perf evlist -v
> cycles: sample_freq=4000, size: 96, sample_type: IP|TID|TIME|PERIOD,
> disabled: 1, inherit: 1, mmap: 1, mmap2: 1, comm: 1, comm_exec: 1, freq:
> 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1
> [root@zoo ~]#
>
> - Arnaldo
>

oprofile uses raw events to set up the PMU hardware. Both the cortex a9 and cortex a15 should be using the same basic setup. One place where the setup differs is the processor-specific code in the kernel. Also, the kernel on the a15 is a locally built stock 3.15.10 kernel, while the cortex a9 machine is running a fedora kernel that may have patches and a different config.

I rolled back to the oprofile-0.9.9-2.fc20.armv7hl rpm and operf works. So it looks like there is a regression in the oprofile userspace code. I am doing a git bisect to see where things broke.

-Will
|
From: William C. <wc...@re...> - 2014-08-29 01:30:37
|
On 08/28/2014 04:47 PM, Maynard Johnson wrote: > On 08/28/2014 11:11 AM, William Cohen wrote: >> On 08/26/2014 11:56 AM, Maynard Johnson wrote: >>> On 08/15/2014 10:54 AM, Maynard Johnson wrote: >>>> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at: >>>> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/ >>>> >>>> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply. >>> >>> Fellow oprofile architecture maintainers, >>> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the >>> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into >>> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an >>> RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by >>> the end of the week. Thanks! >>> >>> -Maynard >>>> >>>> Thanks. >>>> -Maynard Johnson >> >> Hi Maynard, >> >> I built a rpm using the latest git repo tarball and have been testing that out on a number of different platforms. >> >> There are a number of xml failures in the test results (footnotes (1) and (2)). Have anyone else observed that? Or is this a local build issue? >> >> I also had some issues with the tests running on arm cortex-a9. The perf utility works on the machine but operf and ocount are not. This seems to be specific to the cortex a9 machine things work on a cortex a15 machine. >> >> # operf --events INST_RETIRED:500000:0:1:1,CPU_CYCLES:500000:0:1:1, workloads/thread_src/thread_bin >> perf_event_open failed with Operation not supported >> Caught runtime_error: Internal Error. Perf event setup failed. >> Error running profiler >> > This smells very familiar. 
> Looking at kernel code (arch/arm/kernel/perf_event.c), I see the following:
>
>         /*
>          * Check whether we need to exclude the counter from certain modes.
>          */
>         if ((!armpmu->set_event_filter ||
>              armpmu->set_event_filter(hwc, &event->attr)) &&
>              event_requires_mode_exclusion(&event->attr)) {
>                 pr_debug("ARM performance counters do not support "
>                          "mode exclusion\n");
>                 return -EOPNOTSUPP;
>         }
>
> I've made changes to oprofile recently so that hypervisor events would be excluded by default,
> since we currently have nothing in our display formats to indicate hypervisor samples and nothing
> in our event specification (passed to operf and ocount) to either include or exclude hypervisor
> events. This bit our s390 friends, and we had to add #ifdef __s390__ in
> libperf_events/operf_counter.cpp:221 and pe_counting/ocount_counter.cpp:72.
>
> *Will C*, can you please try to add '#ifdef __arm__' in the same two places and see if the oprofile
> testsuite works then? Perhaps I should back out the change I made to exclude hypervisor events.
> The issue is that on systems where mode exclusion *is* supported, and where the system is running
> on top of a hypervisor, doing something like 'ocount -e CYCLES:0:0:1' ends up counting *both* userspace
> and hypervisor events, and the output of ocount does not separate them. A similar issue occurs with operf,
> but the problem is not quite so pronounced, since opreport typically cannot resolve hypervisor sample
> addresses to symbols.
>
>
> -Maynard

I did a git bisection and narrowed the problem patch to:

$ git bisect bad
3f93a3b306875ff5591149a23034fed92a0844d7 is the first bad commit
commit 3f93a3b306875ff5591149a23034fed92a0844d7
Author: Maynard Johnson <may...@us...>
Date:   Thu Aug 7 15:58:23 2014 -0500

    Exclude collecting hypervisor samples for default event

    In a July 7 commit, I made the following change:

        Make sure hypervisor is excluded from ocount and operf

        Since we have no interface support in the event specification to
        allow the user to select or de-select counting events in hypervisor,
        and also since the output of ocount and opreport do not support the
        concept of hypervisor, we should exclude hypervisor from counting
        and profiling. There's a bug in the current code such that the
        user may or may not get hypervisor events included. This patch
        explicitly excludes hypervisor.

    Apparently, I neglected making the corresponding change for the default event.
    This patch rectifies that mistake.

    Signed-off-by: Maynard Johnson <may...@us...>

One thing to note is that the tests worked on machines that had virtualization support. All the x86 rhel and fedora kernels support virtualization. The working arm kernel for the cortex a15 also has virtualization support. The arm f20 kernel doesn't have virtualization enabled.

-Will
|
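The interaction between the quoted kernel check and the hypervisor-exclusion commit can be modelled in a few lines. This is a simplified sketch under stated assumptions, not kernel or oprofile code: PerfAttr and arm_mode_exclusion_check are hypothetical names, the body of requires_mode_exclusion (true if any exclude_* bit is set) is an assumption because the thread does not quote the kernel helper, and whether a particular PMU provides a filter hook is left as an input rather than asserted.

```python
import errno
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PerfAttr:
    """Hypothetical stand-in for the exclude_* bits of perf_event_attr."""
    exclude_user: bool = False
    exclude_kernel: bool = False
    exclude_hv: bool = False
    exclude_idle: bool = False


def requires_mode_exclusion(attr: PerfAttr) -> bool:
    # Assumption: the kernel helper is true when any exclude_* bit is set.
    return (attr.exclude_user or attr.exclude_kernel
            or attr.exclude_hv or attr.exclude_idle)


def arm_mode_exclusion_check(
    attr: PerfAttr,
    set_event_filter: Optional[Callable[[PerfAttr], int]],
) -> int:
    """Model of the quoted check: 0 on success, -EOPNOTSUPP on rejection.

    set_event_filter is None for a PMU without a filter hook; otherwise it
    returns 0 when it can honour the requested exclusions, non-zero if not.
    """
    filter_unusable = set_event_filter is None or set_event_filter(attr) != 0
    if filter_unusable and requires_mode_exclusion(attr):
        return -errno.EOPNOTSUPP
    return 0


if __name__ == "__main__":
    plain = PerfAttr()                 # no exclusion requested
    no_hv = PerfAttr(exclude_hv=True)  # what excluding the hypervisor requests
    print(arm_mode_exclusion_check(plain, None))
    print(arm_mode_exclusion_check(no_hv, None))
    print(arm_mode_exclusion_check(no_hv, lambda attr: 0))
```

In this model, a request that sets only exclude_hv is rejected whenever the PMU has no usable filter hook, which is consistent with "Operation not supported" appearing only after the bisected commit.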
From: Will D. <wil...@ar...> - 2014-08-29 08:54:23
|
On Fri, Aug 29, 2014 at 02:30:01AM +0100, William Cohen wrote:
> On 08/28/2014 04:47 PM, Maynard Johnson wrote:
> > *Will C*, can you please try to add '#ifdef __arm__' in the same two
> > places and see if the oprofile testsuite works then? Perhaps I should
> > back out the change I made to exclude hypervisor events. The issue is
> > that on systems where mode exclusion *is* supported, and where the
> > system is running on top of a hypervisor, doing something like 'ocount
> > -e CYCLES:0:0:1' ends up counting *both* userspace and hypervisor events,
> > and the output of ocount does not separate them. A similar issue occurs
> > with operf, but the problem is not quite so pronounced, since opreport
> > typically cannot resolve hypervisor sample addresses to symbols.
>
> I did a git bisection and narrowed the problem patch to:
>
> $ git bisect bad
> 3f93a3b306875ff5591149a23034fed92a0844d7 is the first bad commit
> commit 3f93a3b306875ff5591149a23034fed92a0844d7
> Author: Maynard Johnson <may...@us...>
> Date:   Thu Aug 7 15:58:23 2014 -0500
>
>     Exclude collecting hypervisor samples for default event
>
>     In a July 7 commit, I made the following change:
>
>         Make sure hypervisor is excluded from ocount and operf
>
>         Since we have no interface support in the event specification to
>         allow the user to select or de-select counting events in hypervisor,
>         and also since the output of ocount and opreport do not support the
>         concept of hypervisor, we should exclude hypervisor from counting
>         and profiling. There's a bug in the current code such that the
>         user may or may not get hypervisor events included. This patch
>         explicitly excludes hypervisor.
>
>     Apparently, I neglected making the corresponding change for the default event.
>     This patch rectifies that mistake.
>
>     Signed-off-by: Maynard Johnson <may...@us...>
>
> One thing to note is that the tests worked on machines that had
> virtualization support. All the x86 rhel and fedora kernels support
> virtualization. The working arm kernel for the cortex a15 also has
> virtualization support. The arm f20 kernel doesn't have
> virtualization enabled.

Indeed, I'd expect the vast majority of arm64 systems to support this feature, as well as any new 32-bit cores (including A7, A12, A15 and A17). It's a pity that it's not easily probeable...

Will
|
From: William C. <wc...@re...> - 2014-08-29 16:03:24
Attachments:
oprofile_3.15.10-200.fc20.armv7hl.txt
|
On 08/29/2014 09:28 AM, Maynard Johnson wrote: > On 08/28/2014 08:30 PM, William Cohen wrote: >> On 08/28/2014 04:47 PM, Maynard Johnson wrote: >>> On 08/28/2014 11:11 AM, William Cohen wrote: >>>> On 08/26/2014 11:56 AM, Maynard Johnson wrote: >>>>> On 08/15/2014 10:54 AM, Maynard Johnson wrote: >>>>>> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at: >>>>>> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/ >>>>>> >>>>>> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply. >>>>> >>>>> Fellow oprofile architecture maintainers, >>>>> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the >>>>> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into >>>>> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an >>>>> RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by >>>>> the end of the week. Thanks! >>>>> >>>>> -Maynard >>>>>> >>>>>> Thanks. >>>>>> -Maynard Johnson >>>> >>>> Hi Maynard, >>>> >>>> I built a rpm using the latest git repo tarball and have been testing that out on a number of different platforms. >>>> >>>> There are a number of xml failures in the test results (footnotes (1) and (2)). Have anyone else observed that? Or is this a local build issue? >>>> >>>> I also had some issues with the tests running on arm cortex-a9. The perf utility works on the machine but operf and ocount are not. This seems to be specific to the cortex a9 machine things work on a cortex a15 machine. >>>> >>>> # operf --events INST_RETIRED:500000:0:1:1,CPU_CYCLES:500000:0:1:1, workloads/thread_src/thread_bin >>>> perf_event_open failed with Operation not supported >>>> Caught runtime_error: Internal Error. Perf event setup failed. 
>>>> Error running profiler >>>> >>> This smells very familiar. Looking at kernel code (arch/arm/kernel/perf_event.c), I see the following: >>> >>> /* >>> * Check whether we need to exclude the counter from certain modes. >>> */ >>> if ((!armpmu->set_event_filter || >>> armpmu->set_event_filter(hwc, &event->attr)) && >>> event_requires_mode_exclusion(&event->attr)) { >>> pr_debug("ARM performance counters do not support " >>> "mode exclusion\n"); >>> return -EOPNOTSUPP; >>> } >>> >>> I've made changes to oprofile recently so that hypervisor events would be excluded by default, >>> since we currently have nothing in our display formats to indicate hypervisor samples and nothing >>> in our event specification (passed to operf and ocount) to either include or exclude hypervisor >>> events. This bit our s390 friends, and we had to add #ifdef __s390__ in >>> libperf_events/operf_counter.cpp:221 and pe_counting/ocount_counter.cpp:72. >>> >>> *Will C*, can you please try to add '#ifdef __arm__' in the same two places and see if the oprofile >>> testsuite works then. Perhaps I should back out the change I made to exclude hypervisor events. >>> The issue is that on systems where mode exclusion *is* supported, and where the system is running >>> on top of a hypervisor, doing something like 'ocount -e CYCLES:0:0:1' ends up counting *both* userspace >>> and hypervisor events and the output of ocount does not separate them. A similar issue occurs with operf, >>> but the problem is not quite so pronounced, since opreport typically cannot be resolve hypervisor sample >>> addresses symbols. >>> >>> >>> -Maynard >> >> I did a git bisection and narrowed the problem patch to: >> >> $ git bisect bad >> 3f93a3b306875ff5591149a23034fed92a0844d7 is the first bad commit >> commit 3f93a3b306875ff5591149a23034fed92a0844d7 >> Author: Maynard Johnson <may...@us...> >> Date: Thu Aug 7 15:58:23 2014 -0500 >> >> Exclude collecting hypervisor samples for default event > > Yeah, as I suspected. 
> Will, I just posted a patch ("[PATCH] Back out recent change to
> exclude hypervisor samples and counts") to fix this. Can you please try it out.
> Hopefully you can give me feedback on it today. I'd like to put out a release candidate 2
> today, since I'll be out of the office all next week.
>
> Thanks!
> -Maynard

Hi Maynard,

I built rpms from the current oprofile git repo and included the patch (http://koji.fedoraproject.org/koji/taskinfo?taskID=7485311). I reran the tests on the cortex a15, amd family 10, and ivybridge machines and verified there weren't any changes in the test results. For Fedora arm cortex a9 there were some improvements. All the cycle-check-ocount tests pass now and more of the other tests pass. However, there are still more failures on the f20 cortex a9 than on the other machines, most often a "Non-zero count for workload ...". I have attached the output of tests for the cortex a9.

-Will
|
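A rough model of the kernel check quoted earlier in this message may help show why a default hypervisor exclusion turns into "Operation not supported". This is only a sketch: `Attr` and `arm_event_init` are hypothetical names, the assumption that the kernel helper event_requires_mode_exclusion() tests the four exclude_* bits is mine, and a PMU that does provide an event filter is simplified here to one that accepts every request.

```python
import errno
from dataclasses import dataclass


@dataclass
class Attr:
    """Hypothetical stand-in for the exclude_* bits of struct perf_event_attr."""
    exclude_user: bool = False
    exclude_kernel: bool = False
    exclude_hv: bool = False
    exclude_idle: bool = False


def requires_mode_exclusion(attr):
    # Assumption: mirrors what event_requires_mode_exclusion() checks.
    return (attr.exclude_user or attr.exclude_kernel
            or attr.exclude_hv or attr.exclude_idle)


def arm_event_init(attr, pmu_has_event_filter):
    """Return 0 on success or -EOPNOTSUPP, following the quoted condition."""
    # Simplification: a PMU with set_event_filter is assumed to honour
    # every request, so the filter only "fails" when it is absent.
    filter_failed = not pmu_has_event_filter
    if filter_failed and requires_mode_exclusion(attr):
        return -errno.EOPNOTSUPP
    return 0


if __name__ == "__main__":
    # No exclusion requested: accepted even without a filter.
    print(arm_event_init(Attr(), pmu_has_event_filter=False))
    # Hypervisor excluded by default on a PMU without a filter: rejected.
    print(arm_event_init(Attr(exclude_hv=True), pmu_has_event_filter=False))
```

Under these assumptions, any tool that sets an exclude bit unconditionally would fail on a PMU without mode exclusion, while a tool that leaves every bit clear would not.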
From: Maynard J. <may...@us...> - 2014-08-29 20:02:03
|
On 08/29/2014 11:02 AM, William Cohen wrote: > On 08/29/2014 09:28 AM, Maynard Johnson wrote: >> On 08/28/2014 08:30 PM, William Cohen wrote: >>> On 08/28/2014 04:47 PM, Maynard Johnson wrote: >>>> On 08/28/2014 11:11 AM, William Cohen wrote: >>>>> On 08/26/2014 11:56 AM, Maynard Johnson wrote: >>>>>> On 08/15/2014 10:54 AM, Maynard Johnson wrote: >>>>>>> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at: >>>>>>> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/ >>>>>>> >>>>>>> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply. >>>>>> >>>>>> Fellow oprofile architecture maintainers, >>>>>> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the >>>>>> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into >>>>>> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an >>>>>> RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by >>>>>> the end of the week. Thanks! >>>>>> >>>>>> -Maynard >>>>>>> >>>>>>> Thanks. >>>>>>> -Maynard Johnson >>>>> >>>>> Hi Maynard, >>>>> >>>>> I built a rpm using the latest git repo tarball and have been testing that out on a number of different platforms. >>>>> >>>>> There are a number of xml failures in the test results (footnotes (1) and (2)). Have anyone else observed that? Or is this a local build issue? >>>>> >>>>> I also had some issues with the tests running on arm cortex-a9. The perf utility works on the machine but operf and ocount are not. This seems to be specific to the cortex a9 machine things work on a cortex a15 machine. 
>>>>> >>>>> # operf --events INST_RETIRED:500000:0:1:1,CPU_CYCLES:500000:0:1:1, workloads/thread_src/thread_bin >>>>> perf_event_open failed with Operation not supported >>>>> Caught runtime_error: Internal Error. Perf event setup failed. >>>>> Error running profiler >>>>> >>>> This smells very familiar. Looking at kernel code (arch/arm/kernel/perf_event.c), I see the following: >>>> >>>> /* >>>> * Check whether we need to exclude the counter from certain modes. >>>> */ >>>> if ((!armpmu->set_event_filter || >>>> armpmu->set_event_filter(hwc, &event->attr)) && >>>> event_requires_mode_exclusion(&event->attr)) { >>>> pr_debug("ARM performance counters do not support " >>>> "mode exclusion\n"); >>>> return -EOPNOTSUPP; >>>> } >>>> >>>> I've made changes to oprofile recently so that hypervisor events would be excluded by default, >>>> since we currently have nothing in our display formats to indicate hypervisor samples and nothing >>>> in our event specification (passed to operf and ocount) to either include or exclude hypervisor >>>> events. This bit our s390 friends, and we had to add #ifdef __s390__ in >>>> libperf_events/operf_counter.cpp:221 and pe_counting/ocount_counter.cpp:72. >>>> >>>> *Will C*, can you please try to add '#ifdef __arm__' in the same two places and see if the oprofile >>>> testsuite works then. Perhaps I should back out the change I made to exclude hypervisor events. >>>> The issue is that on systems where mode exclusion *is* supported, and where the system is running >>>> on top of a hypervisor, doing something like 'ocount -e CYCLES:0:0:1' ends up counting *both* userspace >>>> and hypervisor events and the output of ocount does not separate them. A similar issue occurs with operf, >>>> but the problem is not quite so pronounced, since opreport typically cannot be resolve hypervisor sample >>>> addresses symbols. 
>>>> >>>> >>>> -Maynard >>> >>> I did a git bisection and narrowed the problem patch to: >>> >>> $ git bisect bad >>> 3f93a3b306875ff5591149a23034fed92a0844d7 is the first bad commit >>> commit 3f93a3b306875ff5591149a23034fed92a0844d7 >>> Author: Maynard Johnson <may...@us...> >>> Date: Thu Aug 7 15:58:23 2014 -0500 >>> >>> Exclude collecting hypervisor samples for default event >> >> Yeah, as I suspected. Will, I just posted a patch ("[PATCH] Back out recent change to >> exclude hypervisor samples and counts") to fix this. Can you please try it out. >> Hopefully you can give me feedback on it today. I'd like to put out a release candidate 2 >> today, since I'll be out of the office all next week. >> >> Thanks! >> -Maynard > > Hi Maynard, > > I built rpms from the current oprofile git repo and included the patch (http://koji.fedoraproject.org/koji/taskinfo?taskID=7485311). I reran the tests on the cortex a15, amd family 10, and ivybridge machine and verified there weren't any changes in the test results. For Fedora arm cortex a9 there was some improvements. All the cycle-check-ocount tests pass now and more of the other tests pass. However, there are still more failures on the f20 cortex a9 than the other machine, most often a "Non-zero count for workload ...". I have attached the output of tests for the cortex a9. Will, Thanks for testing the patch. I looked at the testsuite log file you had attached, and all the errors are explainable as testsuite issues or the path of the operf binary messing up how the testsuite locates the opreport.xsd file. To be specific, the testsuite issues are: 1. In the failing oprofile-ocount test, we fiddle with the modes (trying user only, kernel only, and then both -- comparing the "both" output to the cumulative results user-only and kernel-only tests); and since mode exclusion is not supported on that processor, all 3 tests fail. (But why does the testsuite apparently work on A15? 
Is the non-support of mode exclusion really just for A9?)
2. Using INST_RETIRED on Cortex A9 where that event is not implemented.

Soooo . . . I will commit the patch that backs out the hypervisor exclusion, and then will put out a new 1.0.0 release candidate. As I mentioned earlier, I'll be away all next week, so if anyone would like to make some fixes to the testsuite to avoid the two issues listed above, please be my guest. :-)

-Maynard

>
> -Will
>
|
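For anyone picking up the testsuite fix, the user/kernel/both comparison described in item 1 can be sketched roughly as follows. This is not the testsuite's actual code: the function names and the 10% tolerance are hypothetical, and the spec layout name:unitmask:kernel:user is inferred from the 'CYCLES:0:0:1' example earlier in the thread.

```python
def mode_specs(event, unit_mask=0):
    """Build the three ocount event specs compared by the test."""
    # Assumption: ocount specs read name:unitmask:kernel:user.
    return {
        "user": f"{event}:{unit_mask}:0:1",
        "kernel": f"{event}:{unit_mask}:1:0",
        "both": f"{event}:{unit_mask}:1:1",
    }


def modes_consistent(user_only, kernel_only, both, tolerance=0.10):
    """True if user_only + kernel_only is within `tolerance` of `both`."""
    if both <= 0:
        return False
    return abs((user_only + kernel_only) - both) / both <= tolerance


if __name__ == "__main__":
    print(mode_specs("CYCLES"))
    # Illustrative counts only, not measurements from the thread.
    print(modes_consistent(700, 300, 1010))
```

On a PMU without mode exclusion the user-only and kernel-only runs cannot be set up at all, so a testsuite fix would probably need to skip this comparison rather than loosen the tolerance.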
From: Will D. <wil...@ar...> - 2014-08-29 16:13:00
|
Hi Will,

On Fri, Aug 29, 2014 at 05:02:44PM +0100, William Cohen wrote:
> I built rpms from the current oprofile git repo and included the patch
> (http://koji.fedoraproject.org/koji/taskinfo?taskID=7485311). I reran the
> tests on the cortex a15, amd family 10, and ivybridge machine and verified
> there weren't any changes in the test results. For Fedora arm cortex a9
> there was some improvements. All the cycle-check-ocount tests pass now
> and more of the other tests pass. However, there are still more failures
> on the f20 cortex a9 than the other machine, most often a "Non-zero count
> for workload ...". I have attached the output of tests for the cortex a9.

[...]

> Running target unix
> Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
> Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
> Using ./config/unix.exp as tool-and-target-specific interface file.
> Running ./oprofile-operf/oprofile-operf-run.exp ...
> FAIL: nonzero-sized sample file creation: {INST_RETIRED} created nonzero sample files
> FAIL: nonzero-sized sample file creation: {INST_RETIRED} created nonzero sample files
> FAIL: nonzero-sized sample file creation: {INST_RETIRED} created nonzero sample files
> FAIL: nonzero-sized sample file creation: {INST_RETIRED} created nonzero sample files

Not sure if it helps, but A9 doesn't implement the INST_RETIRED counter
(0x08) so I wouldn't expect any numbers to come back from that event.

Will
|
From: Andi K. <an...@fi...> - 2014-08-30 01:34:02
|
On Tue, Aug 26, 2014 at 10:56:21AM -0500, Maynard Johnson wrote:
> On 08/15/2014 10:54 AM, Maynard Johnson wrote:
> > We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at:
> > http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/
> >
> > Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply.
>
> Fellow oprofile architecture maintainers,
> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the
> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into
> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an
> RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by
> the end of the week. Thanks!

I did some tests on Ivy Bridge, Haswell, Silvermont and didn't notice
any special issues.

Minor issues: (may be old ones):
- ophelp doesn't seem to always word wrap on more narrow terminals
- silvermont is missing a few pebs events

-Andi
|
From: William C. <wc...@re...> - 2014-09-05 17:29:03
|
On 08/29/2014 12:12 PM, Will Deacon wrote:
> Hi Will,
>
> On Fri, Aug 29, 2014 at 05:02:44PM +0100, William Cohen wrote:
>> I built rpms from the current oprofile git repo and included the patch
>> (http://koji.fedoraproject.org/koji/taskinfo?taskID=7485311). I reran the
>> tests on the cortex a15, amd family 10, and ivybridge machine and verified
>> there weren't any changes in the test results. For Fedora arm cortex a9
>> there was some improvements. All the cycle-check-ocount tests pass now
>> and more of the other tests pass. However, there are still more failures
>> on the f20 cortex a9 than the other machine, most often a "Non-zero count
>> for workload ...". I have attached the output of tests for the cortex a9.
>
> [...]
>
>> Running target unix
>> Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
>> Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
>> Using ./config/unix.exp as tool-and-target-specific interface file.
>> Running ./oprofile-operf/oprofile-operf-run.exp ...
>> FAIL: nonzero-sized sample file creation: {INST_RETIRED} created nonzero sample files
>> FAIL: nonzero-sized sample file creation: {INST_RETIRED} created nonzero sample files
>> FAIL: nonzero-sized sample file creation: {INST_RETIRED} created nonzero sample files
>> FAIL: nonzero-sized sample file creation: {INST_RETIRED} created nonzero sample files
>
> Not sure if it helps, but A9 doesn't implement the INST_RETIRED counter
> (0x08) so I wouldn't expect any numbers to come back from that event.
>
> Will
>

I noticed that the kernel was reporting instruction counts for ARM cortex a9. It looks like it is using INS_RENAME (0x68) instead. "perf stat" reports cpu cycles on cortex a9, but operf on cortex a9 returns the following (it works fine on a cortex a15):

[root@dhcp129-63 testsuite]# operf stat ls
perf_event_open failed with Operation not supported
Caught runtime_error: Internal Error. Perf event setup failed.
Error running profiler
[root@dhcp129-63 testsuite]# uname -a
Linux dhcp129-63.rdu.redhat.com 3.15.10-201.fc20.armv7hl #1 SMP Wed Aug 27 22:16:02 UTC 2014 armv7l armv7l armv7l GNU/Linux
[root@dhcp129-63 testsuite]# rpm -q oprofile
oprofile-0.9.9-2.fc20.armv7hl

The oprofile-0.9.9 had the same problems on cortex a9, so it doesn't appear to be a regression. The oprofile-testsuite seemed to run fine on fedora 20 ARM cortex a15 and fedora 20 x86_64 ivybridge.

-Will
|
From: Will D. <wil...@ar...> - 2014-09-08 16:28:46
|
Hi Will,

On Fri, Sep 05, 2014 at 06:28:24PM +0100, William Cohen wrote:
> I noticed that the kernel was reporting instruction counts for ARM cortex
> a9. It looks like it is using the following INS_RENAME, 0x68 instead.
> "perf stat" reports cpu cycles on cortex a9, but operf on cortex a9
> returns the following (it work fine on a cortex a15):
>
> [root@dhcp129-63 testsuite]# operf stat ls
> perf_event_open failed with Operation not supported
> Caught runtime_error: Internal Error. Perf event setup failed.
> Error running profiler

Strange... it seems to work fine on my A9-based board:

will@room-service:~$ operf stat /bin/ls
Kernel profiling is not possible with current system config.
Set /proc/sys/kernel/kptr_restrict to 0 to collect kernel samples.
operf: Profiler started
  File: ‘/bin/ls’
  Size: 71792           Blocks: 144        IO Block: 4096   regular file
Device: dh/13d  Inode: 28841156    Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2014-09-08 12:41:43.585078000 +0000
Modify: 2014-04-15 06:29:21.000000000 +0000
Change: 2014-06-16 09:53:58.981624000 +0000
 Birth: -

Profiling done

$ opreport
Using /home/will/oprofile_data/samples/ for samples directory.
CPU: ARM Cortex-A9
Counted CPU_CYCLES events (CPU cycle) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CYCLES:100000|
  samples|      %|
------------------
      104 100.000 stat
        CPU_CYCLES:100000|
          samples|      %|
        ------------------
               85 81.7308 no-vmlinux
               12 11.5385 ld-2.19.so
                7  6.7308 libc-2.19.so

$ uname -a
Linux room-service 3.16.0 #3 SMP PREEMPT Mon Sep 8 14:05:54 BST 2014 armv7l GNU/Linux

Will
|
From: Will D. <wil...@ar...> - 2014-09-09 12:32:10
|
On Fri, Sep 05, 2014 at 06:28:24PM +0100, William Cohen wrote:
> On 08/29/2014 12:12 PM, Will Deacon wrote:
> I noticed that the kernel was reporting instruction counts for ARM cortex
> a9. It looks like it is using the following INS_RENAME, 0x68 instead.
> "perf stat" reports cpu cycles on cortex a9, but operf on cortex a9
> returns the following (it work fine on a cortex a15):
>
>
> [root@dhcp129-63 testsuite]# operf stat ls
> perf_event_open failed with Operation not supported
> Caught runtime_error: Internal Error. Perf event setup failed.
> Error running profiler

Looking back at my reply to you, the command you're running here is a bit
weird. You're giving `stat ls' as the program to profile, so if there's
nothing in the current directory called `ls' then stat will fail. Not sure
if it's related to the operf failure, but it's still odd.

Will
|
From: Maynard J. <may...@us...> - 2014-09-08 16:45:40
|
On 08/29/2014 08:33 PM, Andi Kleen wrote:
> On Tue, Aug 26, 2014 at 10:56:21AM -0500, Maynard Johnson wrote:
>> On 08/15/2014 10:54 AM, Maynard Johnson wrote:
>>> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at:
>>> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/
>>>
>>> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply.
>>
>> Fellow oprofile architecture maintainers,
>> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the
>> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into
>> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an
>> RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by
>> the end of the week. Thanks!
>
> I did some tests on Ivy Bridge, Haswell, Silvermont and didn't notice
> any special issues.
>
> Minor issues: (may be old ones):
> - ophelp doesn't seem to always word wrap on more narrow terminals
> - silvermont is missing a few pebs events

Thanks for testing, Andi. If you have a patch to add the missing pebs events for silvermont, I would include it in the GA.

-Maynard

>
> -Andi
>
|
From: Andi K. <an...@fi...> - 2014-09-08 17:41:14
|
On Mon, Sep 08, 2014 at 11:44:32AM -0500, Maynard Johnson wrote:
> On 08/29/2014 08:33 PM, Andi Kleen wrote:
> > On Tue, Aug 26, 2014 at 10:56:21AM -0500, Maynard Johnson wrote:
> >> On 08/15/2014 10:54 AM, Maynard Johnson wrote:
> >>> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at:
> >>> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/
> >>>
> >>> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply.
> >>
> >> Fellow oprofile architecture maintainers,
> >> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the
> >> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into
> >> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an
> >> RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by
> >> the end of the week. Thanks!
> >
> > I did some tests on Ivy Bridge, Haswell, Silvermont and didn't notice
> > any special issues.
> >
> > Minor issues: (may be old ones):
> > - ophelp doesn't seem to always word wrap on more narrow terminals
> > - silvermont is missing a few pebs events
> Thanks for testing, Andi. If you have a patch to add the missing pebs events for silvermont, I would include it in the GA.

I noticed more PEBS flags are missing for other CPUs. Would you be ok with regenerating
more of the Intel event lists?

This would also give the latest Intel event updates.

-Andi
|
From: Maynard J. <may...@us...> - 2014-09-08 18:06:03
|
On 09/08/2014 12:41 PM, Andi Kleen wrote:
> On Mon, Sep 08, 2014 at 11:44:32AM -0500, Maynard Johnson wrote:
>> On 08/29/2014 08:33 PM, Andi Kleen wrote:
>>> On Tue, Aug 26, 2014 at 10:56:21AM -0500, Maynard Johnson wrote:
>>>> On 08/15/2014 10:54 AM, Maynard Johnson wrote:
>>>>> We are pleased to announce OProfile 1.0.0 Release Candidate 1. You can download this release at:
>>>>> http://sourceforge.net/projects/oprofile/files/oprofile/oprofile-1.0.0-rc1/
>>>>>
>>>>> Please download and test this release candidate, and send your feedback by replying to this message. Please include your hardware platform and Linux distribution information in your reply.
>>>>
>>>> Fellow oprofile architecture maintainers,
>>>> Please test the oprofile 1.0.0 RC1 and let me know your results. FYI . . . Due to the
>>>> Java profiling bug reported this week by Brian Hall, I will be incorporating a fix into
>>>> the next 1.0.0 tar file I make available. If more fixes are needed, I will put out an
>>>> RC2, otherwise I'll make it GA. Either way, I'd like to put out the next tar file by
>>>> the end of the week. Thanks!
>>>
>>> I did some tests on Ivy Bridge, Haswell, Silvermont and didn't notice
>>> any special issues.
>>>
>>> Minor issues: (may be old ones):
>>> - ophelp doesn't seem to always word wrap on more narrow terminals
>>> - silvermont is missing a few pebs events
>> Thanks for testing, Andi. If you have a patch to add the missing pebs events for silvermont, I would include it in the GA.
>
> I noticed more PEBS flags are missing for other CPUs. Would you be ok with regenerating
> more of the Intel event lists?

Regenerating the events list for a processor often results in lots of non-functional changes (e.g., grammatical fixes in description text) that make it difficult to visually inspect and validate a patch. If there were a way you could add the PEBS flags with minimal perturbation to the existing events list, then, yes, I'd be willing to accept a small-ish patch. Test results using the oprofile testsuite would also be a helpful indicator for the correctness of the patch.

-Maynard

>
> This would also give the latest Intel event updates.
>
> -Andi
>
|
From: Will D. <wil...@ar...> - 2014-08-28 18:59:58
|
On Thu, Aug 28, 2014 at 07:24:07PM +0100, William Cohen wrote:
> On 08/28/2014 02:07 PM, Will Deacon wrote:
> > Thanks for giving this a whirl. Which was the A9 SoC you used? The PMU
> > interrupts are often not described properly on those, so it could simply be
> > that perf hasn't initialised.
> >
> The perf utils seem to work on the machine:
>
> $ perf record ls
> ...
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.011 MB perf.data (~502 samples) ]

Can you take a look at /proc/interrupts after this run, and see if any of
them fired? Also, what does perf report show?

> $ perf stat ls
> ...
>
> Performance counter stats for 'ls':
>
>           6.077000      task-clock (msec)         #    0.401 CPUs utilized
>                 15      context-switches          #    0.002 M/sec
>                  0      cpu-migrations            #    0.000 K/sec
>                 96      page-faults               #    0.016 M/sec
>          6,003,324      cycles                    #    0.988 GHz
>          1,085,235      stalled-cycles-frontend   #   18.08% frontend cycles idle
>          4,600,125      stalled-cycles-backend    #   76.63% backend cycles idle
>          2,117,086      instructions              #    0.35  insns per cycle
>                                                   #    2.17  stalled cycles per insn
>            220,666      branches                  #   36.312 M/sec
>             94,593      branch-misses             #   42.87% of all branches
>
>        0.015141693 seconds time elapsed
>
>
> However, the oprofile commands do not:
>
> $ operf ls
> perf_event_open failed with Operation not supported
> Caught runtime_error: Internal Error. Perf event setup failed.
> Error running profiler

Any chance you can dump the perf_event_attr so we can try to establish why
the kernel is throwing this back at us?

Cheers,

Will
|
From: William C. <wc...@re...> - 2014-08-28 19:55:26
|
On 08/28/2014 02:59 PM, Will Deacon wrote:
> On Thu, Aug 28, 2014 at 07:24:07PM +0100, William Cohen wrote:
>> On 08/28/2014 02:07 PM, Will Deacon wrote:
>>> Thanks for giving this a whirl. Which was the A9 SoC you used? The PMU
>>> interrupts are often not described properly on those, so it could simply be
>>> that perf hasn't initialised.
>>>

Hi Will,

Yes, there are cases where the dtb doesn't describe the performance monitoring hardware/interrupts properly, but in this case (a compulab trimslice with an nvidia tegra2 processor) "perf record ls" appears to work correctly and "perf report" provides sane data. Below are before and after snapshots of /proc/interrupts around the "perf record" run. I suspect that there is some difference in the way that operf and ocount are trying to set up the events when compared to "perf record" and "perf stat". I should use systemtap to look at the various parameters being passed in to set up perf and determine where "perf" and "operf" diverge.

-Will

[wcohen@localhost ~]$ more /proc/interrupts
           CPU0       CPU1
 29:     105745    1576193       GIC  29  twd
 34:          0          0       GIC  34  7000e000.rtc
 46:         52          0       GIC  46  mmc0
 52:       2646          0       GIC  52  ehci_hcd:usb1
 53:          0          0       GIC  53  ehci_hcd:usb2
 63:      38523          0       GIC  63  mmc1
 68:       1224          0       GIC  68  serial
 70:          0          0       GIC  70  7000c000.i2c
 71:          3          0       GIC  71  7000c380.spi
 73:          0          0       GIC  73  timer0
 97:          0          0       GIC  97  host1x_syncpt
116:          0          0       GIC 116  7000c400.i2c
124:         16          0       GIC 124  7000c500.i2c
129:        109          0       GIC 129  ehci_hcd:usb3
130:          1          0       GIC 130  PCIE
131:       2789          0       GIC 131  Tegra PCIe MSI
136:          0          0       GIC 136  apbdma.0
137:          0          0       GIC 137  apbdma.1
138:          0          0       GIC 138  apbdma.2
139:          0          0       GIC 139  apbdma.3
140:          0          0       GIC 140  apbdma.4
141:          0          0       GIC 141  apbdma.5
142:          0          0       GIC 142  apbdma.6
143:          0          0       GIC 143  apbdma.7
144:          0          0       GIC 144  apbdma.8
145:          0          0       GIC 145  apbdma.9
146:          0          0       GIC 146  apbdma.10
147:          0          0       GIC 147  apbdma.11
148:          0          0       GIC 148  apbdma.12
149:          0          0       GIC 149  apbdma.13
150:          0          0       GIC 150  apbdma.14
151:          0          0       GIC 151  apbdma.15
281:          0          0      GPIO 121  c8000600.sdhci cd
350:          0          0      GPIO 190  Power
384:          0          0  Tegra PCIe MSI   0  PCIe PME
385:       2853          0  Tegra PCIe MSI   1  enp1s0
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:      45534      22980  Rescheduling interrupts
IPI3:          0          0  Function call interrupts
IPI4:        218        159  Single function call interrupts
IPI5:          0          0  CPU stop interrupts
IPI6:       3630       2862  IRQ work interrupts
IPI7:          0          0  completion interrupts
Err:          0
[wcohen@localhost ~]$ perf record ls
Desktop    Downloads  Music     oprofile_data  perf.data.old  Public         systemtap.sum  Videos
Documents  hosts      oprofile  perf.data      Pictures       systemtap.log  Templates
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data (~503 samples) ]
[wcohen@localhost ~]$ more /proc/interrupts
           CPU0       CPU1
 29:     107913    1607353       GIC  29  twd
 34:          0          0       GIC  34  7000e000.rtc
 46:         52          0       GIC  46  mmc0
 52:       2646          0       GIC  52  ehci_hcd:usb1
 53:          0          0       GIC  53  ehci_hcd:usb2
 63:      41037          0       GIC  63  mmc1
 68:       1224          0       GIC  68  serial
 70:          0          0       GIC  70  7000c000.i2c
 71:          3          0       GIC  71  7000c380.spi
 73:          0          0       GIC  73  timer0
 88:         31          0       GIC  88
 97:          0          0       GIC  97  host1x_syncpt
116:          0          0       GIC 116  7000c400.i2c
124:         16          0       GIC 124  7000c500.i2c
129:        109          0       GIC 129  ehci_hcd:usb3
130:          1          0       GIC 130  PCIE
131:       3018          0       GIC 131  Tegra PCIe MSI
136:          0          0       GIC 136  apbdma.0
137:          0          0       GIC 137  apbdma.1
138:          0          0       GIC 138  apbdma.2
139:          0          0       GIC 139  apbdma.3
140:          0          0       GIC 140  apbdma.4
141:          0          0       GIC 141  apbdma.5
142:          0          0       GIC 142  apbdma.6
143:          0          0       GIC 143  apbdma.7
144:          0          0       GIC 144  apbdma.8
145:          0          0       GIC 145  apbdma.9
146:          0          0       GIC 146  apbdma.10
147:          0          0       GIC 147  apbdma.11
148:          0          0       GIC 148  apbdma.12
149:          0          0       GIC 149  apbdma.13
150:          0          0       GIC 150  apbdma.14
151:          0          0       GIC 151  apbdma.15
281:          0          0      GPIO 121  c8000600.sdhci cd
350:          0          0      GPIO 190  Power
384:          0          0  Tegra PCIe MSI   0  PCIe PME
385:       3092          0  Tegra PCIe MSI   1  enp1s0
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:      46294      22983  Rescheduling interrupts
IPI3:          0          0  Function call interrupts
IPI4:        218        159  Single function call interrupts
IPI5:          0          0  CPU stop interrupts
IPI6:       3760       2862  IRQ work interrupts
IPI7:          0          0  completion interrupts
Err:          0
[wcohen@localhost ~]$ perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 31  of event 'cycles'
# Event count (approx.): 6537847
#
# Overhead  Command      Shared Object                           Symbol
# ........  .......  .................  ...............................
#
     8.18%       ls  [kernel.kallsyms]  [k] vm_normal_page
     7.74%       ls  [kernel.kallsyms]  [k] copy_page
     6.18%       ls  [kernel.kallsyms]  [k] handle_mm_fault
     6.12%       ls  [kernel.kallsyms]  [k] 0x002d9d60
     5.82%       ls  [kernel.kallsyms]  [k] unmapped_area_topdown
     5.12%       ls  [kernel.kallsyms]  [k] avc_lookup
     4.89%       ls  [kernel.kallsyms]  [k] __memzero
     4.50%       ls  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     4.38%       ls  ld-2.18.so         [.] _dl_relocate_object
     4.30%       ls  [kernel.kallsyms]  [k] __sync_icache_dcache
     4.18%       ls  [kernel.kallsyms]  [k] cpu_pj4b_set_pte_ext
     4.08%       ls  [kernel.kallsyms]  [k] avtab_search_node
     4.04%       ls  [kernel.kallsyms]  [k] strcmp
     4.01%       ls  [kernel.kallsyms]  [k] _find_first_bit_le
     3.95%       ls  libc-2.18.so       [.] _dl_addr
|
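Comparing the two /proc/interrupts snapshots by eye is tedious, so here is a small sketch that reports which IRQ lines changed between them. The function names are hypothetical, and the BEFORE/AFTER strings hold only a four-row excerpt of the snapshots above rather than the full tables.

```python
def parse_interrupts(text):
    """Map each IRQ label to its count summed over all CPUs."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Rows such as "Err:" carry fewer count columns than there are CPUs.
        counts = [int(f) for f in fields[:ncpus] if f.isdigit()]
        totals[label.strip()] = sum(counts)
    return totals


def fired_between(before, after):
    """Return {irq: delta} for every line whose total count changed."""
    b, a = parse_interrupts(before), parse_interrupts(after)
    return {irq: a[irq] - b.get(irq, 0)
            for irq in a if a[irq] != b.get(irq, 0)}


# Excerpt of the snapshots posted in this message.
BEFORE = """
           CPU0       CPU1
 29:     105745    1576193       GIC  29  twd
 63:      38523          0       GIC  63  mmc1
 68:       1224          0       GIC  68  serial
"""
AFTER = """
           CPU0       CPU1
 29:     107913    1607353       GIC  29  twd
 63:      41037          0       GIC  63  mmc1
 68:       1224          0       GIC  68  serial
 88:         31          0       GIC  88
"""

if __name__ == "__main__":
    for irq, delta in sorted(fired_between(BEFORE, AFTER).items()):
        print(irq, delta)
```

In the posted data, IRQ 88 is present only in the second snapshot, with a count of 31, the same as the 31 samples perf report shows. That would be consistent with it being the PMU interrupt, though nothing in the thread confirms it.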