From: Maynard J. <may...@us...> - 2008-07-08 16:19:07
|
OProfile 0.9.4 Release Candidate 3 is now available and can be found at:

http://sourceforge.net/project/showfiles.php?group_id=16191

The major difference between this release candidate and rc2 (released on
June 23) is a fix to handle cross-compile builds. Other changes
include: 1) A fix for finding separate debuginfo files when using the
--root option; and 2) An updated README_PACKAGERS file giving much more
detailed info to RPM packagers regarding changes in the RPM build
process that are needed to coincide with new OProfile features.

Please test and provide feedback. I hope this release candidate will
become our 0.9.4 GA.

Thanks.
-Maynard Johnson

----------------------------------------------------------------------------------

Release Notes
==========

OProfile is a powerful system-wide profiler for Linux. Read
more at http://oprofile.sf.net

OProfile 0.9.4 has been released. OProfile is still in alpha,
but has been proven stable for many users.

New features
------------

OProfile now supports profiling Java applications. See Section 1.4
"Installation" in the user manual for instructions on how to build
OProfile so that it includes this support.

OProfile also includes a framework for adding support for profiling
other just-in-time (JIT) compiled languages. A new manual titled
"OProfile JIT agent developer guide" is provided to aid developers.

Added Xen support for IA64

Added AVR32 support

Add '--root' option to opreport which acts as a replacement (a prefix)
for the / fs

Performance improvement for 'opreport --xml --details'.

OProfile tends to output obscure error messages. Some of these have been
fixed.

Changes required to build with gcc 4.3

Bug fixes
---------

Fix sym_offset calculation bug, seen on 64 bit systems with code mapped
with an address >4G

Fix opcontrol's calculation of kernel address range to include code found
in sections other than just .text

Fix opcontrol's short forms of --list-events (-l) and --dump (-d) so they
work for non-root users

Fix MMCR values and counter-to-event mappings on a few 970MP groups

Turn off profiling in hypervisor on 970MP to prevent lost interrupts

Update family10 events and unit_masks files to match the BIOS and Kernel
Developer's Guide (current, as of May 20, 2008)

Since --xml is not compatible with the --sort option, we warned against
the use of --sort, but didn't reset the sort options to default. This is
now fixed.

Fix silent failure of oprof_start when a counter is missing (e.g., NMI
watchdog is up)

Change opcontrol to use "-SIG" instead of "-s SIG" since Busybox's
implementation of "kill" doesn't understand the "-s SIG" option

Fix 'opannotate -s' to work with inlined code

Update POWER6 event files (add new event groupings; make some fixes to
others)

Fix "Dangling ESCAPE CODE" error that can occur on Cell BE SPE profiling

Fix opreport for Cell BE SPE profiles to attribute samples to dynamically
generated call stubs executed from SPE stack (for example, I/O calls to
the libc library)

Fix loop in 'opcontrol --dump' code when using --session-dir on a network
drive (clock issues)

Fix problem with 'oparchive --list-files' that can occur if file doesn't
exist

Fix user/kernel domain profiling switches for ppc64 architectures

Fix the bfd_get_synthetic_symtab check in m4 macro to work correctly with
'--with-binutils' configure option

Fix ARM big-endian syscall

bug #1820202, differential profile broken in 0.9.3, is fixed

bug #1717298, mips events have incorrect id numbers, is fixed; all event
numbers > 9 were incorrectly set up

bug #1564920, opcontrol does not check if objdump exists, is fixed; we error
out and pinpoint the right error now

bug #1819350, oparchive doesn't work with kernel module, is fixed (need
another related fix, if you read that I forget about that, ping me
please -- phe)

bug #1828566, --xml and -t and callgraph now output symboldata for all
caller/callees

bug #1930788, opreport error: basic_string::erase

Allow performance counter allocation to work even when some performance
counters are already reserved (e.g. by nmi watchdog)


Known problems
--------------

When using callgraph profiling, it's possible that invalid sample
files are created (bug #1685267).

Many Alpha ev67 events do not work (bug #931875).

A few Pentium IV events are not supported (bug #841099).

For 2.2 kernels, the module must be compiled as the same user
that owns the kernel source tree.

With an AMD64 kernel, OProfile must be built in 64 bit mode due to lack
of kernel support.

opreport -c gives strange output for binaries without symbols.

Callgraph output for the new JIT support is incorrect. See Chapter 4,
section 2.3.2 of OProfile user manual.

When using the JVMPI agent to profile Java code, the following error
is printed by the agent if the JVM moves objects in the heap:
"Error: Cannot find class for compiled method"
Workaround: If your JVM is 1.5, then use JVMTI agent instead of JVMPI agent.
If your JVM is pre-1.5, see oprofile-list for a patch.
|
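The '--root' option listed under New features treats a directory as a stand-in for the / filesystem. A minimal Python sketch of that prefixing idea follows. The helper name `resolve_under_root` and the example paths are hypothetical, and this is an illustration only, not code from the OProfile sources.

```python
import posixpath


def resolve_under_root(root, recorded_path):
    """Map an absolute path recorded in a profile to a location under `root`.

    Hypothetical helper illustrating the "prefix for the / fs" idea;
    it is not taken from the OProfile sources.
    """
    if not recorded_path.startswith("/"):
        raise ValueError("expected an absolute path: %r" % recorded_path)
    # Strip the leading slash so join() does not discard the root prefix.
    return posixpath.join(root, recorded_path.lstrip("/"))


# Example: a binary sampled as /usr/bin/foo on the target is looked up
# in a copy of the target's filesystem kept under /mnt/target.
print(resolve_under_root("/mnt/target", "/usr/bin/foo"))
```

The same mapping would presumably apply to separate debuginfo files, which is where the rc3 fix mentioned in the announcement comes in.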
From: Maynard J. <may...@us...> - 2008-07-09 15:39:40
|
Maynard Johnson wrote:
> OProfile 0.9.4 Release Candidate 3 is now available and can be found at:
>
> http://sourceforge.net/project/showfiles.php?group_id=16191
>
> [snip]
>
> Please test and provide feedback. I hope this release candidate will
> become our 0.9.4 GA.

I got positive test feedback for two platforms: POWER5 and Intel Xeon x86_64,
including JVMPI and JVMTI testing. Test feedback from other platforms would be
appreciated.

Thanks.
-Maynard
|
From: M. E. (E. B. <zn...@ce...> - 2008-07-10 03:38:34
Attachments:
znmeb.vcf
|
Maynard Johnson wrote:
> Maynard Johnson wrote:
>> OProfile 0.9.4 Release Candidate 3 is now available and can be found at:
>>
>> http://sourceforge.net/project/showfiles.php?group_id=16191
>>
>> The major difference between this release candidate and rc2 (released on
>> June 23) is a fix to handle cross-compile builds. Other changes
>> include: 1) A fix for finding separate debuginfo files when using the
>> --root option; and 2) An updated README_PACKAGERS file giving much more
>> detailed info to RPM packagers regarding changes in the RPM build
>> process that are needed to coincide with new OProfile features.
>>
>> Please test and provide feedback. I hope this release candidate will
>> become our 0.9.4 GA.
> I got positive test feedback for two platforms: POWER5 and Intel Xeon x86_64,
> including JVMPI and JVMTI testing. Test feedback from other platforms would be
> appreciated.
>
> Thanks.
> -Maynard
>> Thanks.
>> -Maynard Johnson

It appears to be working fine on 64-bit Gentoo Linux 2.6.25-r6 on an
Athlon64 X2. I haven't had a chance to dig into the JVM stuff yet, but I
probably will once I've had a chance to acquire some test cases.

-- 
M. Edward (Ed) Borasky
http://ruby-perspectives.blogspot.com/

"A mathematician is a machine for turning coffee into theorems." --
Alfréd Rényi via Paul Erdős
|
From: Maynard J. <may...@us...> - 2008-07-10 12:45:18
|
Joerg Wagner wrote:
> Hello Maynard,
>
> I tested oprofile 0.9.4 on an ARM MPCore platform (RealViewEB)
> and it works for my daily use cases (native code profiling
> using cycles and cache miss events) as well as 0.9.3.
> Sometimes I only get exactly one sample but that problem
> also existed in 0.9.3. I have not been able to investigate
> this or even pin it down, as it is occurring only sporadically.
>
> So no regressions regarding these small tests compared to 0.9.3.
>
Thanks for testing, Joerg. (cc'ing oprofile-list for community awareness.)

-Maynard
> Regards!
>
> Joerg
>
> Maynard Johnson wrote:
>> Maynard Johnson wrote:
>> [...]
>> Test feedback from other platforms would be appreciated.
>>
>> Thanks.
>> -Maynard
>>
>>> Thanks.
>>> -Maynard Johnson
|
From: Daniel H. <dan...@li...> - 2008-07-14 15:49:51
|
Hi Maynard,

I have tested oprofile 0.9.4 RC3 on s390x platform (64bit) using
IBM JDK 1.5.0_07 and SUN JDK 1.5.0_14.
My tests were finished successfully. Great job so far.

Kind regards,
Daniel

Maynard Johnson wrote:
> OProfile 0.9.4 Release Candidate 3 is now available and can be found at:
>
> http://sourceforge.net/project/showfiles.php?group_id=16191
>
> [snip]
|
From: William C. <wc...@re...> - 2008-07-14 22:31:30
|
Maynard Johnson wrote:
> OProfile 0.9.4 Release Candidate 3 is now available and can be found at:
>
> http://sourceforge.net/project/showfiles.php?group_id=16191
>
> The major difference between this release candidate and rc2 (released on
> June 23) is a fix to handle cross-compile builds. Other changes
> include: 1) A fix for finding separate debuginfo files when using the
> --root option; and 2) An updated README_PACKAGERS file giving much more
> detailed info to RPM packagers regarding changes in the RPM build
> process that are needed to coincide with new OProfile features.
>
> Please test and provide feedback. I hope this release candidate will
> become our 0.9.4 GA.
>
> Thanks.
> -Maynard Johnson

Initially got errors when attempting to build 0.9.4-rc3 on F9 of the form:

/usr/bin/ld: /usr/lib/gcc/x86_64-redhat-linux/4.3.0/../../../../lib64/libbfd.a(archures.o): relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-redhat-linux/4.3.0/../../../../lib64/libbfd.a: could not read symbols: Bad value
collect2: ld returned 1 exit status

This appeared to be due to the binutils RPM being built without -fPIC on F-9
(and on RHEL 5):

https://bugzilla.redhat.com/show_bug.cgi?id=447426 libbfd binutils -fPIC

Rebuilt the binutils RPMs with the fix suggested on:

https://fcp.surfsite.org/modules/newbb/viewtopic.php?topic_id=53684&forum=11

I was able to get the oprofile RPM built, ran the simple dejagnu runtest smoke
tests on an F-9 x86_64 machine, and all the tests passed.

On RHEL-5 x86-64, i386, and ia64, built binutils with -fPIC (and installed the
libtool RPM), then built oprofile without the java support and ran the
tests. The smoke tests worked okay on the i386 and x86_64. Need to
take a closer look at what is going on with the smoke tests on ia64
(RHEL oprofile-0.9.3-16.el5 version worked much better); the underlying
pfmon appears to work on the ia64 machine, but the new oprofile is
getting the following error on the ia64 machine:

Couldn't allocate hardware counters for the selected events.

The detailed cpu types used for the tests:

Distro/arch     /dev/oprofile/cpu_type
--------------------------------------
F-9 x86_64      i386/core_2
RHEL 5 x86_64   i386/p4-ht
RHEL 5 i386     timer
RHEL 5 ia64     ia64/itanium2

-Will
|
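The cpu_type values in the report above come from /dev/oprofile/cpu_type. As a rough sketch, assuming the oprofile kernel driver is loaded and its filesystem is mounted at /dev/oprofile, a script could read that file and flag timer mode, where hardware counter events are not available. The helper names below are hypothetical and not part of OProfile.

```python
def read_cpu_type(path="/dev/oprofile/cpu_type"):
    """Return the CPU type string reported by the oprofile driver, or None."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        # Driver not loaded or oprofilefs not mounted.
        return None


def uses_hardware_counters(cpu_type):
    """True when the driver reports a real PMU type rather than timer mode."""
    return cpu_type is not None and cpu_type != "timer"


if __name__ == "__main__":
    cpu_type = read_cpu_type()
    print("cpu_type:", cpu_type)
    print("hardware counters:", uses_hardware_counters(cpu_type))
```

With the table above, the RHEL 5 i386 machine ("timer") would be the only one where event-based tests are not meaningful.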
From: Maynard J. <may...@us...> - 2008-07-15 16:21:33
|
William Cohen wrote:
> Maynard Johnson wrote:
>> OProfile 0.9.4 Release Candidate 3 is now available and can be found at:
>>
>> http://sourceforge.net/project/showfiles.php?group_id=16191
>>
>> [snip]
>
> Initially got errors when attempting to build 0.9.4-rc3 on F9 of the form:
>
> [snip]
>
> This appeared to be due to the binutils RPM being built without -fPIC on F-9
> (and on RHEL 5):
I presume oprofile 0.9.3 would also fail to build under these conditions, right?
>
> [snip]
>
> On RHEL-5 x86-64, i386, and ia64, built binutils with -fPIC (and installing
> libtool RPM), then built oprofile without the java support and ran the
> tests. The smoke tests worked okay on the i386 and x86_64. Need to
> take a closer look at what is going on with the smoke tests on ia64
> (RHEL oprofile-0.9.3-16.el5 version worked much better); the underlying
> pfmon appears to work on the ia64 machine, but the new oprofile is
> getting the following error in ia64 machine:
>
> Couldn't allocate hardware counters for the selected events.
Curious, especially considering the recently applied patch to op_alloc_counter.c

So far, this is the only issue raised against rc3. I had hoped we could GA 0.9.4
by the end of the week, but I'll hold off until I hear from you about the
underlying cause of this problem.

Thanks.
-Maynard
>
> [snip]
>
> -Will
|
From: William C. <wc...@re...> - 2008-07-15 20:06:08
Attachments:
oprofile-0.9.4-allocate.patch
|
Maynard Johnson wrote:
> William Cohen wrote:
>> [snip]
>> pfmon appears to work on the ia64 machine, but the new oprofile is
>> getting the following error in ia64 machine:
>>
>> Couldn't allocate hardware counters for the selected events.
> Curious, especially considering the recently applied patch to
> op_alloc_counter.c
>
> So far, this is the only issue raised against rc3. I had hoped we could
> GA 0.9.4 by the end of the week, but I'll hold off until I hear from you
> about the underlying cause of this problem.
>
> Thanks.
> -Maynard

Hi Maynard,

I took a closer look at what is going on. You were right that the
op_alloc_counter.c patch was causing a problem. The ia64 is unusual in that it
does the actual setup of performance monitoring hardware through perfmon.
/dev/oprofile doesn't have any performance register directories, so
op_get_counter_mask returns 0. I have attached a proposed patch that assumes
that perfmon is managing the counters and looks up the number of counters if
there are no counter directories available. This isn't ideal: it assumes that
perfmon is managing the counters and that all the registers are available if
there are 0 counters.

I tried out the patched oprofile on ia64, x86_64, and i386. Things work better
with the patch.

-Will
|
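The fallback described in the message above can be sketched roughly as follows. This is a Python illustration of the idea only, not the attached patch (which is not reproduced in this archive); the function names, the `counter_dirs` argument, and the `nr_counters` argument are all hypothetical stand-ins for what OProfile finds under /dev/oprofile and looks up for the CPU type.

```python
def counter_mask_from_dirs(counter_dirs):
    """Build a bitmask with bit i set for each directory entry named str(i)."""
    mask = 0
    for name in counter_dirs:
        if name.isdigit():
            mask |= 1 << int(name)
    return mask


def available_counter_mask(counter_dirs, nr_counters):
    """Return the mask of counters assumed usable.

    Hypothetical sketch: when no counter directories exist (the
    ia64/perfmon case), every counter is assumed to be available,
    which is exactly the limitation noted in the message.
    """
    mask = counter_mask_from_dirs(counter_dirs)
    if mask == 0:
        mask = (1 << nr_counters) - 1
    return mask


# Counter 0 missing (e.g. reserved), counters 1-3 exposed.
print(bin(available_counter_mask(["1", "2", "3", "cpu_type"], 4)))
# No counter directories at all: fall back to "all available".
print(bin(available_counter_mask(["cpu_type", "buffer"], 4)))
```

The second call shows the trade-off: with no directories to inspect, a counter that is actually held by something else would still be reported as available.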
From: Maynard J. <may...@us...> - 2008-07-15 21:30:37
|
William Cohen wrote:
> Maynard Johnson wrote:
>> William Cohen wrote:
>>> Maynard Johnson wrote:
>>>> OProfile 0.9.4 Release Candidate 3 is now available and can be found
>>>> at:
>>>>
>>>> http://sourceforge.net/project/showfiles.php?group_id=16191
>>>>
[snip]
>
> Hi Maynard,
>
> I took a closer look at what is going on. You were right that the
> op_alloc_counter.c patch was causing a problem. The ia64 is unusual in
> that it does the actual setup of performance monitoring hardware through
> perfmon. /dev/oprofile doesn't have any performance register
> directories, so op_get_counter_mask returns 0. I have attached a
> proposed patch that assumed that perfmon is managing the counters and
> looks up the number if there are no counters directories available. This
> isn't ideal. Assumes perfmon managing and all the registers are
> available if there are 0 counters.
>
> I tried out the patched oprofile on ia64, x86_64, and i386. Things work
> better with the patch.
So at some point in the past, it was determined that the ia64 oprofile kernel
driver would not create the counter directories in /dev/oprofile. In
opcontrol:set_ctr_param(), I see a test for "IS_PERFMON", resulting in an
immediate exit if true. Given the ia64 kernel driver's current behavior and how
opcontrol has been modified to adjust to it, your patch seems as good as we can
get. But it seems to me this leaves open the very possibility your previous
patch was trying to prevent -- i.e., that something like the watchdog timer
could have allocated a PMU counter and oprofile has no way of knowing about it.

*Stephane*, can you provide any guidance on this?

Thanks.
-Maynard
>
> -Will
>
|
From: stephane e. <er...@go...> - 2008-07-15 22:09:45
|
Maynard, I don't have all the context, incl. the patch. But yes, on IA64 Oprofile uses perfmon to program the registers and trigger counter overflows. Using customizable sampling buffer format mechanism, we can connect the Oprofile sample recording routine to perfmon with only a few lines of code. The buffer setup and export remains the same. If I understand your question. You are asking how would Oprofile know if not all counters are available? Well, on IA-64, there is no watchdog timer stealing counters. But this is not a good answer, I'll grant you that. In perfmon v2.x (x>2), there is a way for an application to query the registers it has available using pfm_getinfo_evtsets(). This is used on x86 to run in degraded mode, if watchdog is active for instance. But the mechanism is generic enough it can apply in other situations. In the current production IA-64 systems, I think you can look at /proc/perfmon and it does tell you what is available if I recall. Note also that if the application tries to write to an unavailable register, it will get an error, so that is also another way, though painful, to scan for what's there. Hope this helps. On Tue, Jul 15, 2008 at 11:01 PM, Maynard Johnson <may...@us...> wrote: > William Cohen wrote: >> >> Maynard Johnson wrote: >>> >>> William Cohen wrote: >>>> >>>> Maynard Johnson wrote: >>>>> >>>>> OProfile 0.9.4 Release Candidate 3 is now available and can be found >>>>> at: >>>>> >>>>> http://sourceforge.net/project/showfiles.php?group_id=16191 >>>>> > [snip] >>>> >>>> On RHEL-5 x86-64, i386, and ia64, built binutils with -fPIC (and >>>> installing >>>> libtool RPM), then built oprofile without the java support and ran the >>>> tests. The smoke tests worked okay on the i386 and x86_64. 
Need to >>>> take a closer look at what is going on with the smoke tests on ia64 >>>> (RHEL oprofile-0.9.3-16.el5version worked much better); the underlying >>>> pfmon appears to work on the ia64 machine, but the new oprofile is >>>> getting the >>>> following error in ia64 machine: >>>> >>>> Couldn't allocate hardware counters for the selected events. >>> >>> Curious, especially considering the recently applied patch to >>> op_alloc_counter.c >>> >>> So far, this is the only issue raised against rc3. I had hoped we could >>> GA 0.9.4 by the end of the week, but I'll hold off until I hear from you >>> about the underlying cause of this problem. >>> >>> Thanks. >>> -Maynard >>>> >>>> >>>> The detailed cpu types used for the tests. >>>> >>>> Distro/arch /dev/oprofile/cpu_type >>>> -------------------------------------- >>>> F-9 x86_64 i386/core_2 >>>> RHEL 5 x86_64 i386/p4-ht >>>> RHEL 5 i386 timer >>>> RHEL 5 ia64 ia64/itanium2 >>>> >>>> >>>> -Will >> >> Hi Maynard, >> >> I took a closer look at what is going on. You were right that the >> op_alloc_counter.c patch was causing a problem. The ia64 is unusual in that >> it does the actual setup of performance monitoring hardware through perfmon. >> /dev/oprofile doesn't have any performance register directories, so >> op_get_counter_mask returns 0. I have attached a proposed patch that >> assumed that perfmon is managing the counters and looks up the number if >> there are no counters directories available. This isn't ideal. Assumes >> perfmon managing and all the registers are available if there are 0 >> counters. >> >> I tried out the patched oprofile on ia64, x86_64, and i386. Things work >> better with the patch. > > So at some point in the past, it was determined that the ia64 oprofile > kernel driver would not create the counter directories in /dev/oprofile. In > opcontrol:set_ctr_param(), I see a test for "IS_PERFMON", resulting in an > immediate exit if true. 
Given the ia64 kernel driver's current behavior > and how opcontrol has been modified to adjust to it, your patch seems as > good as we can get. But it seems to me this leaves open the very > possibility your previous patch was trying to prevent -- i.e., that > something like the watchdog timer could have allocated a PMU counter > and oprofile has no way of knowing about it. > > *Stephane*, can you provide any guidance on this? > > Thanks. > -Maynard >> >> -Will >> > > > |
From: William C. <wc...@re...> - 2008-07-16 16:03:51
|
stephane eranian wrote: > Maynard, > > I don't have all the context, incl. the patch. But yes, on IA64 > Oprofile uses perfmon to program the registers and trigger counter > overflows. > Using customizable sampling buffer format mechanism, we can connect > the Oprofile sample recording routine to perfmon with only > a few lines of code. The buffer setup and export remain the same. > > If I understand your question. You are asking how would Oprofile know > if not all counters are available? > Well, on IA-64, there is no watchdog timer stealing counters. But this > is not a good answer, I'll grant you that. > In perfmon v2.x (x>2), there is a way for an application to query the > registers it has available using pfm_getinfo_evtsets(). > This is used on x86 to run in degraded mode, if watchdog is active for > instance. But the mechanism is generic enough > it can apply in other situations. > > In the current production IA-64 systems, I think you can look at > /proc/perfmon and it does tell you what is available if I recall. > Note also that if the application tries to write to an unavailable > register, it will get an error, so that is also another way, though > painful, to scan for what's there. > > Hope this helps. Hi Stephane, Thanks for the comments. The main concern is to make sure that, even when perfmon is being used to program the performance monitoring hardware, oprofile knows which portion of the performance monitoring hardware is available. Is there an existing example that shows pfm_getinfo_evtsets() being used to query for the available registers? This sounds like the more promising approach. I took a look at /proc/perfmon on ia64 running RHEL-5 (perfmon). The additional pmd and pmc/pmd register information is only printed out if /proc/sys/kernel/perfmon/debug is set to 1. There is a similar directory on perfmon2, /sys/kernel/perfmon/pmu_desc/, but it doesn't appear to be affected by the debugging flag. 
Noticed that the daemon/opd_perfmon.c code implicitly adds 4 to the address in write_pmu() function, so the registers won't line up. This would be an issue for events that need to be in specific counters, e.g. DATA_REFERENCES_SET0. -Will |
From: stephane e. <er...@go...> - 2008-07-17 20:18:35
|
Will,

On Wed, Jul 16, 2008 at 5:25 PM, William Cohen <wc...@re...> wrote:
> Thanks for the comments. The main concern is to make sure even when perfmon
> is being used to program the performance monitoring hardware that oprofile
> has correct knowledge on what portion of the performance monitoring hardware
> is available.
>
> Is there an existing example that shows pfm_getinfo_evtsets() being used to
> query for the available registers? This sounds like the more promising
> approach.
>

Yes, in libpfm/examples/detect_pmcs.c

> I took a look at /proc/perfmon on ia64 running RHEL-5 (perfmon). The
> additional pmd and pmc/pmd register information is only printed out if
> /proc/sys/kernel/perfmon/debug is set to 1.
> There is a similar directory on perfmon2 /sys/kernel/perfmon/pmu_desc/, but
> it doesn't appear to be affected by debugging flag.
>

The difficulty is that with current mainline perfmon v2.0 for IA-64, pfm_getinfo_evtsets() does not exist. It was introduced with event sets and multiplexing. So I think your only choice is to use the try-and-fail approach of writing each PMU register.

> Noticed that the daemon/opd_perfmon.c code implicitly adds 4 to the address
> in write_pmu() function, so the registers won't line up. This would be an
> issue for events that need to be in specific counters, e.g.
> DATA_REFERENCES_SET0.
>

Why won't they align? Aligned compared to what Oprofile uses? |
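The try-and-fail scan described above (attempt a write to each PMU register and treat an error as "unavailable") could look roughly like the following C sketch. It is not taken from OProfile or libpfm: `try_write_fn`, `probe_available()`, and `fake_try_write()` are hypothetical names introduced here, and the callback merely stands in for whichever perfmon call actually writes a register.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical callback: tries to program register `reg` and returns 0 on
 * success or a negative value if the register is unavailable.  In a real
 * tool this would wrap the perfmon register-write call; the signature is
 * an assumption made for this sketch. */
typedef int (*try_write_fn)(unsigned int reg, void *ctx);

/* Probe registers first .. first+count-1 and return a bitmask in which
 * bit i is set when register first+i accepted the write. */
unsigned long probe_available(unsigned int first, unsigned int count,
                              try_write_fn try_write, void *ctx)
{
    unsigned long mask = 0;
    unsigned int i;

    assert(try_write != NULL);
    assert(count <= sizeof(mask) * CHAR_BIT);

    for (i = 0; i < count; i++) {
        if (try_write(first + i, ctx) == 0)
            mask |= 1UL << i;
    }
    return mask;
}

/* Stand-in for demonstration only: interprets *ctx as a bitmask of busy
 * register numbers and rejects writes to those registers. */
int fake_try_write(unsigned int reg, void *ctx)
{
    unsigned long busy = *(const unsigned long *)ctx;

    if (reg >= sizeof(busy) * CHAR_BIT)
        return -1;
    return (busy & (1UL << reg)) ? -1 : 0;
}
```

A real probe has side effects, since each successful write actually programs a register, so it would need to run before any events are configured and restore or reset whatever it touched.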
From: William C. <wc...@re...> - 2008-07-17 20:49:39
|
stephane eranian wrote:
> Will,
>
> On Wed, Jul 16, 2008 at 5:25 PM, William Cohen <wc...@re...> wrote:
>> Thanks for the comments. The main concern is to make sure even when perfmon
>> is being used to program the performance monitoring hardware that oprofile
>> has correct knowledge on what portion of the performance monitoring hardware
>> is available.
>>
>> Is there an existing example that shows pfm_getinfo_evtsets() being used to
>> query for the available registers? This sounds like the more promising
>> approach.
>>
> Yes, in libpfm/examples/detect_pmcs.c

Hi Stephane, thanks for the pointer to the example.

>> I took a look at /proc/perfmon on ia64 running RHEL-5 (perfmon). The
>> additional pmd and pmc/pmd register information is only printed out if
>> /proc/sys/kernel/perfmon/debug is set to 1.
>> There is a similar directory on perfmon2 /sys/kernel/perfmon/pmu_desc/, but
>> it doesn't appear to be affected by debugging flag.
>>
> the difficulty is that with current mainline perfmon v2.0 for IA-64,
> the pfm_getinfo_evtsets()
> does not exist. It was introduced with event sets and multiplexing.
> So I think your only
> choice is to use the try and fail approach of writing each PMU register.

So there would need to be some autoconf check to determine that pfm_getinfo_evtsets() exists on the machine before the package is built, if that approach is taken.

>> Noticed that the daemon/opd_perfmon.c code implicitly adds 4 to the address
>> in write_pmu() function, so the registers won't line up. This would be an
>> issue for events that need to be in specific counters, e.g.
>> DATA_REFERENCES_SET0.
>>
> Why won't they align?
> Aligned compared to what Oprofile uses?

Oprofile numbers the ia64 performance monitoring counters starting at 0, and Perfmon starts the counters (pmd) at 4.

-Will |
From: stephane e. <er...@go...> - 2008-07-17 20:58:20
|
Will,

On Thu, Jul 17, 2008 at 10:49 PM, William Cohen <wc...@re...> wrote:
>
> So there would need to be some autoconf check to determine that
> pfm_getinfo_evtsets() exists on the machine before the package is built if
> that approach is taken.
>

Sure.

>>> Noticed that the daemon/opd_perfmon.c code implicitly adds 4 to the
>>> address
>>> in write_pmu() function, so the registers won't line up. This would be an
>>> issue for events that need to be in specific counters, e.g.
>>> DATA_REFERENCES_SET0.
>>>
>> Why won't they align?
>> Aligned compared to what Oprofile uses?
>
> Oprofile numbers the ia64 performance monitoring counters starting at 0 and
> Perfmon starts the counters (pmd) at 4.
>

On Itanium 2, PMD0-PMD4 do exist; they are simply not counters but rather buffers. That's why there cannot be alignment with what Oprofile does. The perfmon mapping for all Itanium processors is the identity map. |
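The numbering mismatch discussed above (Oprofile counting from 0, perfmon using the hardware register numbers, with counters starting at 4) comes down to a fixed offset. Here is a minimal C sketch of that translation. The function names are hypothetical, and the count of four counting registers is an assumption for Itanium 2; only the offset of 4 comes from the thread.

```c
#include <assert.h>

/* From the thread: perfmon uses hardware register numbers and the counting
 * PMDs start at 4.  The number of counters is an assumption for Itanium 2. */
enum {
    IA64_FIRST_COUNTER_PMD = 4,
    IA64_NUM_COUNTERS = 4
};

/* Translate an Oprofile counter number (0-based) to a perfmon PMD index. */
unsigned int oprofile_ctr_to_pmd(unsigned int ctr)
{
    assert(ctr < IA64_NUM_COUNTERS);
    return ctr + IA64_FIRST_COUNTER_PMD;
}

/* Translate a perfmon PMD index back to an Oprofile counter number.
 * Returns -1 when the PMD is not one of the counting registers. */
int pmd_to_oprofile_ctr(unsigned int pmd)
{
    if (pmd < IA64_FIRST_COUNTER_PMD ||
        pmd >= IA64_FIRST_COUNTER_PMD + IA64_NUM_COUNTERS)
        return -1;
    return (int)(pmd - IA64_FIRST_COUNTER_PMD);
}
```

Keeping the offset in one named place makes it easier to check that an event restricted to a specific counter ends up in the register the hardware expects.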
From: William C. <wc...@nc...> - 2008-07-15 21:43:08
|
Maynard Johnson wrote: > William Cohen wrote: >> Maynard Johnson wrote: >>> William Cohen wrote: >>>> Maynard Johnson wrote: >>>>> OProfile 0.9.4 Release Candidate 3 is now available and can be found >>>>> at: >>>>> >>>>> http://sourceforge.net/project/showfiles.php?group_id=16191 >>>>> > [snip] >>>> On RHEL-5 x86-64, i386, and ia64, built binutils with -fPIC (and >>>> installing >>>> libtool RPM), then built oprofile without the java support and ran the >>>> tests. The smoke tests worked okay on the i386 and x86_64. Need to >>>> take a closer look at what is going on with the smoke tests on ia64 >>>> (RHEL oprofile-0.9.3-16.el5version worked much better); the underlying >>>> pfmon appears to work on the ia64 machine, but the new oprofile is >>>> getting the >>>> following error in ia64 machine: >>>> >>>> Couldn't allocate hardware counters for the selected events. >>> Curious, especially considering the recently applied patch to >>> op_alloc_counter.c >>> >>> So far, this is the only issue raised against rc3. I had hoped we >>> could GA 0.9.4 by the end of the week, but I'll hold off until I hear >>> from you about the underlying cause of this problem. >>> >>> Thanks. >>> -Maynard >>>> >>>> The detailed cpu types used for the tests. >>>> >>>> Distro/arch /dev/oprofile/cpu_type >>>> -------------------------------------- >>>> F-9 x86_64 i386/core_2 >>>> RHEL 5 x86_64 i386/p4-ht >>>> RHEL 5 i386 timer >>>> RHEL 5 ia64 ia64/itanium2 >>>> >>>> >>>> -Will >> Hi Maynard, >> >> I took a closer look at what is going on. You were right that the >> op_alloc_counter.c patch was causing a problem. The ia64 is unusual in >> that it does the actual setup of performance monitoring hardware through >> perfmon. /dev/oprofile doesn't have any performance register >> directories, so op_get_counter_mask returns 0. I have attached a >> proposed patch that assumed that perfmon is managing the counters and >> looks up the number if there are no counters directories available. 
This >> isn't ideal. Assumes perfmon managing and all the registers are >> available if there are 0 counters. >> >> I tried out the patched oprofile on ia64, x86_64, and i386. Things work >> better with the patch. > So at some point in the past, it was determined that the ia64 oprofile kernel > driver would not create the counter directories in /dev/oprofile. In > opcontrol:set_ctr_param(), I see a test for "IS_PERFMON", resulting in an > immediate exit if true. Given the ia64 kernel driver's current behavior and > how opcontrol has been modified to adjust to it, your patch seems as good as we > can get. But it seems to me this leaves open the very possibility your previous > patch was trying to prevent -- i.e., that something like the watchdog timer > could have allocated a PMU counter and oprofile has no way of knowing > about it. > > *Stephane*, can you provide any guidance on this? > > Thanks. Yes, Maynard, that is exactly the problem I was thinking of. -Will |
From: Maynard J. <may...@us...> - 2008-07-17 15:41:44
|
William Cohen wrote: > Maynard Johnson wrote: > >> William Cohen wrote: >> >>> Maynard Johnson wrote: >>> >>>> William Cohen wrote: >>>> >>>>> Maynard Johnson wrote: >>>>> >>>>>> OProfile 0.9.4 Release Candidate 3 is now available and can be found >>>>>> at: >>>>>> >>>>>> http://sourceforge.net/project/showfiles.php?group_id=16191 >>>>>> >>>>>> >> [snip] >> >>>>> On RHEL-5 x86-64, i386, and ia64, built binutils with -fPIC (and >>>>> installing >>>>> libtool RPM), then built oprofile without the java support and ran the >>>>> tests. The smoke tests worked okay on the i386 and x86_64. Need to >>>>> take a closer look at what is going on with the smoke tests on ia64 >>>>> (RHEL oprofile-0.9.3-16.el5version worked much better); the underlying >>>>> pfmon appears to work on the ia64 machine, but the new oprofile is >>>>> getting the >>>>> following error in ia64 machine: >>>>> >>>>> Couldn't allocate hardware counters for the selected events. >>>>> >>>> Curious, especially considering the recently applied patch to >>>> op_alloc_counter.c >>>> >>>> So far, this is the only issue raised against rc3. I had hoped we >>>> could GA 0.9.4 by the end of the week, but I'll hold off until I hear >>>> from you about the underlying cause of this problem. >>>> >>>> Thanks. >>>> -Maynard >>>> >>>>> The detailed cpu types used for the tests. >>>>> >>>>> Distro/arch /dev/oprofile/cpu_type >>>>> -------------------------------------- >>>>> F-9 x86_64 i386/core_2 >>>>> RHEL 5 x86_64 i386/p4-ht >>>>> RHEL 5 i386 timer >>>>> RHEL 5 ia64 ia64/itanium2 >>>>> >>>>> >>>>> -Will >>>>> >>> Hi Maynard, >>> >>> I took a closer look at what is going on. You were right that the >>> op_alloc_counter.c patch was causing a problem. The ia64 is unusual in >>> that it does the actual setup of performance monitoring hardware through >>> perfmon. /dev/oprofile doesn't have any performance register >>> directories, so op_get_counter_mask returns 0. 
I have attached a >>> proposed patch that assumed that perfmon is managing the counters and >>> looks up the number if there are no counters directories available. This >>> isn't ideal. Assumes perfmon managing and all the registers are >>> available if there are 0 counters. >>> >>> I tried out the patched oprofile on ia64, x86_64, and i386. Things work >>> better with the patch. >>> >> So at some point in the past, it was determined that the ia64 oprofile kernel >> driver would not create the counter directories in /dev/oprofile. In >> opcontrol:set_ctr_param(), I see a test for "IS_PERFMON", resulting in an >> immediate exit if true. Given the ia64 kernel driver's current behavior and >> how opcontrol has been modified to adjust to it, your patch seems as good as we >> can get. But it seems to me this leaves open the very possibility your previous >> patch was trying to prevent -- i.e., that something like the watchdog timer >> could have allocated a PMU counter and oprofile has no way of knowing >> about it. >> >> *Stephane*, can you provide any guidance on this? >> >> Thanks. >> > > Yes, Maynard, that is exactly the problem I was thinking of. -Will > Given Stephane's answer and the results of your investigation (in other postings to the list), it sounds like some kernel driver work in the arch/ia64/oprofile directory is needed to handle the extraordinary cases. Nevertheless, it seems like your patch would at least keep ia64-oprofile working in equivalent fashion to what it does today. Do you agree? If so, could you re-post the patch with a change log and a Signed-off-by line (as described at http://oprofile.sourceforge.net/contribute/), and I'll commit it. Thanks. -Maynard |
From: William C. <wc...@re...> - 2008-07-17 18:40:05
Attachments:
oprofile-0.9.4-allocate.patch
|
Maynard Johnson wrote: <... snip ...> > Given Stephane's answer and the results of your investigation (in other > postings to the list), it sounds like some kernel driver work in the > arch/ia64/oprofile directory is needed to handle the extraordinary > cases. Nevertheless, it seems like your patch would at least keep > ia64-oprofile working in equivalent fashion to what it does today. Do > you agree? If so, could you re-post the patch with a change log and a > Signed-off-by line (as described at > http://oprofile.sourceforge.net/contribute/), and I'll commit it. > > Thanks. > -Maynard > > Hi Maynard, Yes, I agree this fix will help things in the short term. In the long term, other fixes will be desired. This patch allows oprofile 0.9.4 to work on ia64 machines. The ia64 oprofile uses perfmon to manage the performance monitoring hardware rather than the oprofile driver. As a result, there are no numbered directories in /dev/oprofile, and it appears that on ia64 there is no performance monitoring hardware available. The patch assumes that perfmon is managing the hardware if no performance monitoring register directories (/dev/oprofile/[0-9]+) are found. The patch has been tested on ia64 and i386/p4-ht (64-bit). The suggested change log entry is: 2008-07-17 William Cohen <wc...@re...> * libop/op_alloc_counter.c: Assume perfmon managing hw when no counters Signed-off-by: William Cohen <wc...@re...> --- |
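The fallback described above (no numbered directories under /dev/oprofile means perfmon is assumed to manage the hardware, and every counter is treated as available) can be illustrated with the C sketch below. This is not the attached patch and not the code in libop/op_alloc_counter.c: the function names are hypothetical, and the caller is assumed to know the number of counters for the CPU type.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define MASK_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Returns 1 when `name` consists only of decimal digits (e.g. "0", "12"). */
int is_counter_dir_name(const char *name)
{
    const char *p;

    assert(name != NULL);
    if (*name == '\0')
        return 0;
    for (p = name; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
    }
    return 1;
}

/* Builds a bitmask from the numeric entries found under `base`
 * (for example "/dev/oprofile").  Returns 0 when there are none or the
 * directory cannot be opened. */
unsigned long counter_mask_from_dir(const char *base)
{
    unsigned long mask = 0;
    struct dirent *ent;
    DIR *dir = opendir(base);

    if (dir == NULL)
        return 0;
    while ((ent = readdir(dir)) != NULL) {
        if (is_counter_dir_name(ent->d_name)) {
            unsigned long n = strtoul(ent->d_name, NULL, 10);
            if (n < MASK_BITS)
                mask |= 1UL << n;
        }
    }
    closedir(dir);
    return mask;
}

/* If no counter directories were found, assume something else (perfmon on
 * ia64) manages the hardware and report all nr_counters as available. */
unsigned long effective_counter_mask(unsigned long found,
                                     unsigned int nr_counters)
{
    assert(nr_counters <= MASK_BITS);
    if (found != 0)
        return found;
    if (nr_counters == MASK_BITS)
        return ~0UL;
    return (1UL << nr_counters) - 1;
}
```

As noted in the thread, the weakness of this approach is that an empty mask is read as "all counters free", so a counter held by another user of the PMU would go unnoticed.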