oprofile Log


Commit Date  
[e55a4a] by Maynard Johnson

Cleanup TODO list

I removed some obsolete stuff and added some new, but there
are likely still some TODOs in this file that are not valid
any longer.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-10-24 19:49:03
[fb9529] by Maynard Johnson

Fix operf/ocount default unit mask selection

Many events (particularly in the x86* architectures)
require a unit mask value to specify the exact event
type. For such events, a default unit mask value
is assigned. When a user runs operf, ocount, or
opcontrol and specifies such an event but does not
specify a unit mask, the default unit mask should be
selected and used by the tool. A bug was discovered
with operf and ocount where the unit mask value in
this situation was being set to '0' instead of the
default unit mask value. This patch fixes the bug.
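The intended behaviour can be sketched roughly as follows. This is only an illustration, not the operf/ocount code: the function name `select_unit_mask` and the default value passed in are hypothetical, and the sketch assumes an event spec of the form name:count[:unitmask] with a numeric unit mask.

```python
def select_unit_mask(event_spec, default_um):
    """Return the unit mask for a spec like 'name:count[:unitmask]'.

    Hypothetical helper: handles only numeric unit masks.
    """
    fields = event_spec.split(":")
    # A non-empty third field is the unit mask supplied by the user.
    if len(fields) >= 3 and fields[2]:
        return int(fields[2], 0)  # base 0 accepts both '3' and '0x3'
    # No unit mask given: use the event's default rather than 0.
    return default_um
```

The point of the sketch is the last line: when the user omits the unit mask, the event's default is returned instead of a hard-coded 0.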

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-10-15 19:58:16
[4b1497] by Andi Kleen, pushed by Maynard Johnson

Add support for Intel Silvermont processor

Just add the event list for Intel Silvermont based systems
(Avoton, BayTrail) and the usual changes for a new CPU.
No new code otherwise.

The model number list is incomplete at this point; more will
be added in the future.

I also finally removed the top level event list descriptions.
All the events are only described in the unit masks now
(Intel doesn't really have a top level event, and I had
to invent descriptions, which was error prone and
often wrong).

I also removed some outdated document number references.

Signed-off-by: Andi Kleen <ak@linux.intel.com>

2013-10-10 18:12:28
[a2811b] by Maynard Johnson

configure error message for missing libpfm is not informative enough

On the ppc64 architecture, the libpfm library is used to get perf_events
encodings for events, so the configure script checks for the availability
of that library when building for ppc64. If the library is missing, the
configure error message is:

checking for perfmon/pfmlib.h... no
configure: error: pfmlib.h not found; usually provided in papi devel package

However, some newer distros (like Fedora 19) are now delivering separate
packages for libpfm and papi, instead of bundling them together. The patch
provided herein changes the configure message to reflect that change in
packaging.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-10-09 19:27:54
[ebde58] by Maynard Johnson

Converge operf and ocount utility functions

When the ocount tool was developed, a number of utility
functions were needed that were very similar to operf utility
functions, with just minor changes. The decision was made at
the time to copy these functions into ocount and change them
as needed. To avoid dual maintenance on very similar functions,
we should converge the two tools to use one common set of utility
functions. The main reason for not doing so in the first place
was to make it easier to review ocount patches and not have to
look at operf changes at the same time.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-10-09 18:12:21
[3795ee] by Maynard Johnson

Add two new POWER8 events that are needed for stall analysis

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-09-25 16:15:30
[b91794] by Ting Liu, pushed by Maynard Johnson

Add freescale e6500 support

Signed-off-by: Zhenhua Luo <zhenhua.luo@freescale.com>
Signed-off-by: Ting Liu <b28495@freescale.com>

2013-09-05 12:45:52
[ca3f79] by Ting Liu, pushed by Maynard Johnson

Add freescale e500mc support

Signed-off-by: George Stephen <Stephen.George@freescale.com>
Signed-off-by: Zhenhua Luo <zhenhua.luo@freescale.com>
Signed-off-by: Ting Liu <b28495@freescale.com>

2013-09-05 12:43:55
[08241f] by Maynard Johnson

Fix compile error on ppc/uClibc platform: 'AT_BASE_PLATFORM' undeclared

This issue was reported via bug #245.

The method for obtaining cpu type on the ppc64 platform was recently
modified to detect the case when we're running on a kernel that has
not been updated to recognize the native processor type. The cpu
type returned in the case where the native processor type is newer
than POWER7 will be "CPU_PPC64_ARCH_V1" (architected CPU type).
The method used for detecting when the kernel does not recognize the
native processor type is to inspect the aux vector and compare
AT_PLATFORM and AT_BASE_PLATFORM. The 'AT_BASE_PLATFORM' was defined
in glibc's elf.h around 5 years ago, but was never added to uClibc,
so the code that implements the above-described method fails to compile
on systems using uClibc.

Since the above-described method of using the aux vector is only
required for ppc64 systems, and ppc64-based platforms always use glibc
(which has the AT_BASE_PLATFORM macro defined), we now wrap that code
with '#if PPC64_ARCH' to prevent problems on other architectures.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-08-14 20:40:44
[543be6] by Carl Love, pushed by Maynard Johnson

OProfile, fix the units for the reported CPU frequency

The frequency of the processor is found by the function op_cpu_frequency() in
libutil/op_cpufreq.c, either by reading the frequency from the file
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq or from /proc/cpuinfo.

Most of the Intel processors get the frequency from cpuinfo_max_freq. The
frequency in this file is stored in units of kHz for the Intel
processors I can check. I do not know what units other architectures use
for the CPU frequency in this file. When the routine describe_cpu() in
libpp/op_header.cpp prints this value, it assumes the frequency is in MHz.
For example, the following is what is printed by my laptop after running
opreport:

Using /home/carll/oprofile_data/samples/ for samples directory.
CPU: Core 2, speed 2.534e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000

Note that the speed is stated as 2.534e+06 MHz, i.e. 2.534 THz, which is
a factor of 1000 too high.

The fix is to have the function op_cpu_freq_sys_devices() in
libutil/op_cpufreq.c adjust the frequency read from the file
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq by dividing by 1000
to return the frequency in units of MHz. This patch implements this
fix for reporting the estimated processor frequency.
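The unit conversion described above amounts to the following. This is a minimal Python sketch, not the C code in libutil/op_cpufreq.c; the helper names `khz_to_mhz` and `read_max_freq_mhz` are hypothetical, and it assumes the sysfs file holds a single integer in kHz, as observed on the Intel systems mentioned in the message.

```python
CPUINFO_MAX_FREQ = "/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq"

def khz_to_mhz(khz):
    """Convert a frequency in kHz to MHz."""
    return khz / 1000.0

def read_max_freq_mhz(path=CPUINFO_MAX_FREQ):
    """Return the max CPU frequency in MHz, or None if it cannot be read.

    Assumes the file contains one number in kHz.
    """
    try:
        with open(path) as f:
            return khz_to_mhz(float(f.read().strip()))
    except (OSError, ValueError):
        return None
```

With the laptop value from the example, `khz_to_mhz(2534000)` gives 2534.0, so the report would read 2534 MHz rather than 2.534e+06 MHz.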

Signed-off-by: Carl Love <cel@us.ibm.com>

2013-08-12 16:13:36
[d173bf] by Maynard Johnson

Bump version to 1.0.0git

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-30 13:39:02
[82ff1c] (RELEASE_0_9_9) by Maynard Johnson

Change version to 0.9.9 in preparation for GA of 0.9.9

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-29 20:09:49
[491aff] by Andi Kleen, pushed by Maynard Johnson

Add some missing Haswell model numbers

Thanks to Sanjay Patel for noticing the missing ULT numbers.

Signed-off-by: Andi Kleen <ak@linux.intel.com>

2013-07-29 15:25:16
[e1488b] by Will Deacon, pushed by Maynard Johnson

ARM: events: increase minimum cycle period to 100k

On ARM, we intentionally leave the minimum event counters low since
the performance profile of the cores can vary dramatically between CPUs
and their implementations.

However, since the default event is CPU_CYCLES, it's best to err on the
side of caution and raise the limit to something more realistic so we
don't lock up on the unsuspecting user (as opposed to somebody passing
an explicit event period).

This patch raises the CPU_CYCLES minimum event count to 100k on ARM.
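A minimum count of this kind is typically enforced with a check along these lines. The sketch is illustrative only: the function `check_event_count` is hypothetical, and only the 100k figure comes from the message above.

```python
# Minimum CPU_CYCLES count on ARM, as stated in the commit message.
ARM_CPU_CYCLES_MIN_COUNT = 100_000

def check_event_count(count, min_count=ARM_CPU_CYCLES_MIN_COUNT):
    """Return count unchanged, or raise if it is below the minimum."""
    if count < min_count:
        raise ValueError(f"count {count} is below the minimum {min_count}")
    return count
```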

Signed-off-by: Will Deacon <will.deacon@arm.com>

2013-07-29 15:19:36
[a7e408] by Robert Richter, pushed by Maynard Johnson

oprofile, doc: Fix missing xrefs

This patch fixes the following errors:

$ XML_CATALOG_FILES=xsl/catalog.xml xsltproc --nonet -o oprofile.html
--stringparam version 0.9.9git .../oprofile/doc/xsl/xhtml.xsl
.../oprofile/doc/oprofile.xml
ERROR: xref linking to controlling has no generated link text.
Error: no ID for constraint linkend: controlling.
ERROR: xref linking to controlling has no generated link text.
Error: no ID for constraint linkend: controlling.

Signed-off-by: Robert Richter <rric@kernel.org>

2013-07-29 15:15:50
[6adb42] by Maynard Johnson

ocount fails to handle ppc64 event PM_GRP_CMPL

Changed ocount event handling for ppc64 architecture
to properly handle cases where the event name includes
the string "GRP".

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-25 16:24:57
[9ad545] by Maynard Johnson

opjitconv fails with "Floating point exception"

opjitconv fails with "Floating point exception" when it tries to
convert a JIT dump file created by the oprofile Java agent
libjvmti_oprofile.so. A divide by "totaltime" when totaltime was zero
caused the problem. It is normal for some symbol lifetimes to be 0 for
very fast invocations within a single time tick (life_start == life_end),
so this patch handles that case.
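The general shape of such a guard is sketched below. This is not the opjitconv code: the function `lifetime_fraction`, its `overlap` parameter, and the choice to return 0.0 for a zero lifetime are all assumptions made for illustration; only life_start, life_end and totaltime are names taken from the message.

```python
def lifetime_fraction(life_start, life_end, overlap):
    """Return overlap as a fraction of a symbol's lifetime.

    Hypothetical helper. A zero lifetime (life_start == life_end) is
    treated as a normal case instead of dividing by zero.
    """
    totaltime = life_end - life_start
    if totaltime == 0:
        # Assumed policy for the sketch: report no overlap.
        return 0.0
    return overlap / totaltime
```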

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
Acked-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>

2013-07-24 17:37:47
[4d808a] by Maynard Johnson

Defaulted named unit mask does not work

A fix to allow named unit masks to be used as the default was recently
pushed upstream (Jun 24: Add support for named default unit masks), but
unfortunately, we all missed the fact that using a named default
unit mask didn't actually work insofar as counting events.
Here's an example on Sandybridge that should use the default named
unit mask "any":

operf -e uops_issued:2000000 ./my_test
and opreport shows "opreport error: No sample file found".

When the user does not specify a unit mask, the profiling tools
(as well as ocount) will use 'ophelp --unit-mask' to determine
what the default unit mask should be. All of the oprofile
tools -- operf, opcontrol, ocount -- expect a numerical value
to be returned. So, in the case of a named default unit mask,
the unit mask name returned by ophelp was not being handled
properly, and the end result was usually "No samples found"
by opreport (or zero event counts by ocount). This patch
fixes this problem.
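One way to picture the missing case is a parser that accepts either a number or a name. The sketch below is an assumption-laden illustration, not the fix that was committed: the function `parse_default_unit_mask` is hypothetical, and the exact output format of 'ophelp --unit-mask' is assumed to be a single token that is either decimal, 0x-prefixed hex, or a unit mask name.

```python
def parse_default_unit_mask(text):
    """Split a default unit mask token into (numeric_value, name).

    Exactly one element of the returned pair is not None.
    """
    token = text.strip()
    if not token:
        raise ValueError("empty unit mask")
    if token.lower().startswith("0x"):
        return int(token, 16), None
    if token.isdigit():
        return int(token), None
    # Anything else is taken to be a named unit mask such as "any".
    return None, token
```

A caller that previously assumed a number would then have to look up the named unit mask separately when the second element is set.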

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-24 17:20:03
[e38e40] by Maynard Johnson

Fix ocount to work on POWER8

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-23 19:25:22
[55cc78] by Maynard Johnson

ophelp does not always detect duplicate numerical unit mask values

Some named unit masks have hex values that are duplicates of others in the
same UM entry, thus requiring the use of the name to disambiguate them.
Specifying one of the hex values that are duplicated should result in an
error -- but there are cases where it doesn't.
Here's a properly working example on Sandybridge:
[mpjohn@oc1757000783 test-stuff]$ ophelp --check-events int_misc:2000000:0x3
Unit mask (0x3) is non unique.
Please specify the unit mask using the first word of the description

and here's a failing example:
[mpjohn@oc1757000783 test-stuff]$ ophelp --check-events l1d_pend_miss:2000000:0x1
2

This patch fixes the problem.
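The check being described can be sketched as a count of how many named unit masks share a given value. This is an illustration only: the function `check_unit_mask` and the sample table used below are hypothetical, and the sketch ignores unit masks that are combined as bit fields.

```python
from collections import Counter

def check_unit_mask(value, masks):
    """Classify a numeric unit mask against a list of (name, value) pairs.

    Returns "non-unique" if several names share the value, "ok" if
    exactly one does, and "unknown" if none does.
    """
    counts = Counter(v for _, v in masks)
    if counts[value] > 1:
        return "non-unique"
    if counts[value] == 1:
        return "ok"
    return "unknown"
```

For a made-up table `[("a", 0x1), ("b", 0x1), ("c", 0x2)]`, the value 0x1 is classified as "non-unique", so the user would have to give the name instead.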

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-23 11:41:26
[305a63] by William Cohen, pushed by Maynard Johnson

Remove unused variable dirstat in op_open_agent function

Coverity reported the variable dirstat as being unused in
op_open_agent(). There did not seem to be much of a point keeping it
around, so this patch removes it.

Signed-off-by: William Cohen <wcohen@redhat.com>

2013-07-23 11:36:25
[961b96] by Maynard Johnson

Fix compile error on precise_ip field in early versions of perf_event.h

Early "perf_events" kernels did not yet include the "precise_ip" field
in the perf_event_attr struct (defined in perf_event.h). This patch
makes configure check for the existence of that field, and it defines
a new macro, HAVE_PERF_PRECISE_IP, which will be set to '1' if the
field exists or '0' otherwise. The operf_counter.cpp code that was
failing to compile will now use the precise_ip field conditionally,
based on the HAVE_PERF_PRECISE_IP macro.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
Acked-by: Andi Kleen <andi@firstfloor.org>

2013-07-22 15:27:42
[23c82e] by Maynard Johnson

ocount misinterpreted UM value for named UM with EXTRA_NONE

Specifying a named unit mask associated with a dummy extra field
appears to *always* result in zero event counts. This patch
applies the same fix that was committed to operf on Jun 21, 2013.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-19 14:21:49
[f2af00] by Maynard Johnson

Add -lrt flag for clock_gettime call

A last minute change in ocount to use clock_gettime
required linking with '-lrt'. I made the change locally
in testing, but neglected to add that change to my
post-review fixups patch. This patch corrects that.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-18 18:51:43
[175ed4] by Maynard Johnson

Post-review fixups for new ocount feature

This patch fixes the issues raised during the review of the
'ocount' tool. Some of the issues were raised on the
oprofile-list, and some were raised internally by users
within my company. The issues raised internally were:
- Bug: Compile failure with recent gcc
- Request to display, in long output, the total time during which events
  were being counted
- Bug: Counting multiple events in a run mode other than 'command [args]'
  can result in incorrect output. For example:
      ocount -s -e CPU_CLK_UNHALTED,UNHALTED_REFERENCE_CYCLES
      Event counts (scaled) for the whole system:
          Event               Count          % time enabled
          CPU_CLK_UNHALTED    291,912,262    100.00
          CPU_CLK_UNHALTED    27,431,626     100.00
- Bug: On ppc64 systems, event spec returned by _handle_powerpc_event_spec
  may contain extra garbage after the event name.
- Request to change the --time-interval option to show counts just for
  the interval, not cumulative counts
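The last request, per-interval rather than cumulative counts, comes down to differencing successive readings. The sketch below is a hypothetical illustration and not the ocount implementation; the function name `interval_counts` and the sample readings are made up.

```python
def interval_counts(cumulative):
    """Turn a list of cumulative counter readings into per-interval counts."""
    deltas = []
    previous = 0
    for total in cumulative:
        # Each interval's count is the growth since the previous reading.
        deltas.append(total - previous)
        previous = total
    return deltas
```

For readings of 100, 250 and 400, the per-interval counts are 100, 150 and 150.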

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-18 18:45:02