oprofile Log

Commit Date  
[961b96] by Maynard Johnson

Fix compile error on precise_ip field in early versions of perf_event.h

Early "perf_events" kernels did not yet include the "precise_ip" field
in the perf_event_attr struct (defined in perf_event.h). This patch
makes configure check for the existence of that field, and it defines
a new macro, HAVE_PERF_PRECISE_IP, which will be set to '1' if the
field exists or '0' otherwise. The operf_counter.cpp code that was
failing to compile will now use the precise_ip field conditionally,
based on the HAVE_PERF_PRECISE_IP macro.
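
As a rough illustration of the conditional use described above, the sketch below guards every reference to the field with the HAVE_PERF_PRECISE_IP macro. The struct demo_event_attr and the function request_precise_ip are hypothetical stand-ins invented for this example; the real code operates on the kernel's perf_event_attr, and the macro would normally come from configure rather than the fallback definition shown here.

```cpp
#include <cstdint>

// Assumption: configure normally defines this macro as 1 or 0.
// The fallback below only exists so the sketch compiles on its own.
#ifndef HAVE_PERF_PRECISE_IP
#define HAVE_PERF_PRECISE_IP 0
#endif

// Hypothetical stand-in for the kernel's perf_event_attr struct.
struct demo_event_attr {
    std::uint64_t config = 0;
#if HAVE_PERF_PRECISE_IP
    std::uint64_t precise_ip = 0;  // only present on newer headers
#endif
};

// Records the requested precision level when the field exists.
// Returns true if the request was stored, false if it was ignored.
bool request_precise_ip(demo_event_attr& attr, unsigned level) {
#if HAVE_PERF_PRECISE_IP
    attr.precise_ip = level;
    return true;
#else
    (void)attr;   // field not available on this header version
    (void)level;
    return false;
#endif
}
```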

Signed-off-by: Maynard Johnson <>
Acked by: Andi Kleen

2013-07-22 15:27:42
[2dcd13] by William Cohen, pushed by Maynard Johnson

Avoid changing the number formatting for cout and cerr streams

Coverity static tool found a number of places where number formatting
was changed and might lead to some oddly formatted output for later
stream output. This patch ensures that the number formatting only
applies to that particular message.
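
One common way to keep formatting changes local to a single message is to build the formatted text in a temporary stream and then write the finished string to the real stream. The sketch below shows that idea; it is not taken from the patch, and the names hex_address and report_address are made up for this example.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Formats the number in a temporary stream, so the hex/fill settings
// never touch the caller's stream.
std::string hex_address(unsigned long addr) {
    std::ostringstream tmp;
    tmp << "0x" << std::hex << std::setfill('0') << std::setw(8) << addr;
    return tmp.str();
}

// Writes one message; the formatting state of 'out' is left unchanged.
void report_address(std::ostream& out, unsigned long addr) {
    out << "address " << hex_address(addr) << '\n';
}
```

After a call such as report_address(std::cerr, 255), a later "std::cerr << 255" still prints in decimal, because std::hex was only ever applied to the temporary stream.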

Signed-off-by: William Cohen <>

2013-06-05 18:14:43
[6ee980] by Maynard Johnson

Fix Coverity errors found on May 20, 2013 git snapshot

Coverity identified the following errors on scans run from May 7 through
May 20, 2013:

Wrapper object use after free,Memory - illegal accesses,/agents/jvmpi/jvmpi_oprofile.cpp,compiled_method_load(JVMPI_Event *)
Unchecked return value,Error handling issues,/daemon/opd_mangling.c,opd_open_sample_file
Dereference after null check,Null pointer dereferences,/daemon/opd_sfile.c,sfile_hash
Uninitialized scalar field,Uninitialized members,/gui/oprof_start_config.cpp,config_setting::config_setting()
Division or modulo by zero,Integer handling issues,/libdb/db_stat.c,odb_hash_stat
Resource leak,Resource leaks,/libop/op_cpu_type.c,_auxv_fetch
Resource leak,Resource leaks,/libop/op_cpu_type.c,fetch_at_hw_platform
Negative array index read,Memory - illegal accesses,/libop/op_events.c,_is_um_valid_bitmask
Write to pointer after free,Memory - corruptions,/libop/op_events.c,read_events
Read from pointer after free,Memory - illegal accesses,/libop/op_events.c,_is_um_valid_bitmask
Dereference after null check,Null pointer dereferences,/libop/op_mangle.c,op_mangle_filename
Dereference after null check,Null pointer dereferences,/libop/op_mangle.c,op_mangle_filename
Time of check time of use,Security best practices violations,/libopagent/opagent.c,op_open_agent
Improper use of negative value,Integer handling issues,/libperf_events/operf_counter.cpp,operf_record::setup()
Double free,Memory - corruptions,/libperf_events/operf_counter.cpp,operf_record::setup()
Uninitialized pointer read,Memory - illegal accesses,/libperf_events/operf_counter.cpp,<unnamed>::_get_perf_event_from_file(mmap_info &)
Unchecked return value,Error handling issues,/libperf_events/operf_mangling.cpp,"operf_open_sample_file(odb_t *, operf_sfile *, operf_sfile *, int, int)"
Using invalid iterator,API usage errors,/libperf_events/operf_process_info.cpp,operf_process_info::try_disassociate_from_parent(char *)
Non-array delete for scalars,Memory - illegal accesses,/libregex/op_regex.cpp,"<unnamed>::op_regerror(int, const re_pattern_buffer &)"
Resource leak,Resource leaks,/libutil++/op_bfd.cpp,"op_bfd::op_bfd(const std::basic_string<char, std::char_traits<char>, std::allocator<char>>&, const string_filter &, const extra_images &, bool &)"
Explicit null dereferenced,Null pointer dereferences,/opjitconv/create_bfd.c,fill_symtab
Resource leak,Resource leaks,/opjitconv/opjitconv.c,_cleanup_jitdumps
Use of untrusted string value,Insecure data handling,/opjitconv/opjitconv.c,main
Resource leak,Resource leaks,/pe_profiling/operf.cpp,_get_cpu_for_perf_events_cap()
Dereference null return value,Null pointer dereferences,/pe_profiling/operf.cpp,_process_session_dir()
Incorrect deallocator used,API usage errors,/pe_profiling/operf.cpp,_process_events_list()


This patch fixes those errors.

Signed-off-by: Maynard Johnson <>

2013-05-28 13:19:25
[74abfb] by Maynard Johnson

Fix Coverity issues identified against oprofile 0.9.8 release

Signed-off-by: Maynard Johnson <>

2013-05-15 18:14:43
[c82b96] by Maynard Johnson

Use PMC5/PMC6 on ppc64 arch for run cycles/run instructions

The IBM Power processor architecture (ppc64) counts instructions
and cycles on PMC5 and PMC6 (respectively) when the run latch is
set (i.e., when not in idle state). On POWER6, these counters
were not capable of generating interrupts, so they could not be
used for profiling purposes; therefore, oprofile counted those
events (PM_RUN_INST_CMPL and PM_RUN_CYC) using other counters.
But with the newer POWER7 processor, PMC5 and PMC6 can generate
interrupts, so it makes sense to leverage those two counters
instead of using the other 4 (programmable) counters. Doing
so could, theoretically, allow us to count up to 6 events
simultaneously without the kernel having to do multiplexing.

This patch will force PM_RUN_INST_CMPL and PM_RUN_CYC to be
counted on PMC5 and PMC6 (respectively) when running on an
IBM POWER7 system.

Signed-off-by: Maynard Johnson <>

2013-04-26 19:05:28
[a39d41] by Maynard Johnson

Fix holes in operf system-wide profiling of forked processes

Using operf to do system-wide profiling of the specjbb benchmark
exposed some holes in how operf was processing the perf_events
data coming from the kernel. Some of the events we can get from
the kernel are:

The "COMM" event is to notify us of the start of an executable
application. The "FORK" event tells us when a process forks
another process. The "MMAP" event informs us when a shared library
(or executable anonymous memory, or the executable file itself, etc.)
has been mmap'ed into a process's address space. A "SAMPLE"
event occurs each time the kernel takes a sample for a process.

There is no guarantee in what order these events may arrive from
the kernel, and when a large system (say, 64 CPUs) is running
the specjbb benchmark full bore, with all processors pegged to
100%, you can get some very strange out-of-order looking
sequence of events. Things get even stranger when using Java7
versus Java6 since Java7 spawns many more threads.

The operf code had several issues where such out-of-order
events were not handled properly, so some major changes were
required in the code.
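
To make the ordering problem concrete, the sketch below parks samples that arrive before the COMM event for their process and attributes them once the COMM event shows up. This is only an illustration of one way to tolerate out-of-order events; the class event_sorter and its members are hypothetical and do not reflect the actual operf data structures.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified sample record.
struct sample {
    int pid;
    std::uint64_t ip;
};

class event_sorter {
public:
    // A COMM event names the process; release any samples parked for it.
    void on_comm(int pid, const std::string& name) {
        names_[pid] = name;
        auto it = pending_.find(pid);
        if (it == pending_.end())
            return;
        for (const sample& s : it->second)
            attributed_.push_back({name, s.ip});
        pending_.erase(it);
    }

    // A SAMPLE event is attributed at once if the process is known,
    // otherwise it is parked until the matching COMM event arrives.
    void on_sample(const sample& s) {
        auto it = names_.find(s.pid);
        if (it == names_.end())
            pending_[s.pid].push_back(s);
        else
            attributed_.push_back({it->second, s.ip});
    }

    std::size_t attributed_count() const { return attributed_.size(); }

    std::size_t pending_count() const {
        std::size_t n = 0;
        for (const auto& entry : pending_)
            n += entry.second.size();
        return n;
    }

private:
    std::map<int, std::string> names_;
    std::map<int, std::vector<sample>> pending_;
    std::vector<std::pair<std::string, std::uint64_t>> attributed_;
};
```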

Signed-off-by: Maynard Johnson <>

2013-04-25 15:53:37
[3aa2fe] by Maynard Johnson

oprofile pp tools should print messages about lost samples

When operf completes running, it collects statistics about
lost samples, records them in the operf.log, and prints
a warning message if the number of lost samples exceeds
a pre-defined percentage (.01%) of the total number of
samples. However, when opreport or any of the other oprofile
post-processing tools are run, the statistics are not
readily available (only in the operf.log), so there is no
warning about lost samples. This patch persists those
statistics to files in the <session-dir>/samples/current/stats
dir, allowing the pp tools to access them later. These
stats files are also copied by oparchive, so even archived
profile data will have the statistics available.
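
The threshold check described above amounts to a small percentage calculation, sketched below. The names are hypothetical, and two details are assumptions made for this example: that a session with zero total samples produces no warning, and that "exceeds" means strictly greater than .01%.

```cpp
#include <cstdint>

// Threshold from the commit message: .01 percent of the total samples.
constexpr double kLostSampleWarnPercent = 0.01;

// Returns true when the lost samples exceed the warning threshold.
bool should_warn_lost_samples(std::uint64_t lost, std::uint64_t total) {
    if (total == 0)
        return false;  // assumption: nothing to report for an empty session
    const double percent =
        100.0 * static_cast<double>(lost) / static_cast<double>(total);
    return percent > kLostSampleWarnPercent;
}
```

For example, 2 lost samples out of 10,000 is .02% and would trigger the warning, while 1 lost sample out of 1,000,000 would not.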

Signed-off-by: Maynard Johnson <>

2013-04-19 12:26:32
[18c4a6] by Maynard Johnson

Performance improvement for operf's perf_event-to-oprofile format conversion

This patch decreases the time needed for converting sample records from
perf_events format to oprofile sample file format by about 1/3. The
performance improvement is most notable when doing system-wide profiling
of a busy system and specifying '--lazy-conversion', where the conversion
process runs after a (long) profiling session has been ended.

Signed-off-by: Maynard Johnson <>

2013-03-19 17:57:08
[7cf28f] by Carl Love, pushed by Maynard Johnson

operf, remove support to report multiplexing.

The multiplexing reporting doesn't work correctly when profiling
multi-threaded apps or apps that do fork/exec. The detection of
multiplexing doesn't work when processes migrate between CPUs.
The event is enabled on all CPUs. The running time stops when the
event migrates to another CPU; however, the enabled time does not stop, as it
is enabled on each CPU. The issue is that the running time across CPUs
doesn't add up to the enabled time because the running time is not
increasing while the process is being migrated. This results in the running
time being less than the enabled time. There is no way to detect whether the
reason the running time is less than the enabled time was due to migration
or due to multiplexing.

The support is being removed so that the operf tool is not incorrectly
flagging events for multiplexing.

Signed-off-by: Carl Love <>

2013-03-12 15:52:15
[04ee55] by Maynard Johnson

Fix seg fault due to incorrect array size initialization

The operf_sfile.cpp:create_sfile function creates and initializes the
array of 'struct operf_sfile' objects used for writing sample data to
oprofile formatted sample files. This function creates an array of
these objects, but was incorrectly creating an array of size 'OP_MAX_COUNTER'.
Since operf can multiplex events, we aren't limited to OP_MAX_COUNTER
events to profile simultaneously, so if the user specifies more than
OP_MAX_COUNTER events, the code that accesses this array was going
off the end and sometimes seg faulting.

This patch fixes the problem by defining OP_MAX_EVENTS to be '24' and
using that as the array size. Furthermore, if the user tries to specify
more than 24 events to profile, an error message is displayed:
Number of events specified is greater than allowed maximum of <n>
and operf aborts.
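
A minimal sketch of the limit check follows. Only the OP_MAX_EVENTS value and the message text come from the description above; the function check_event_count and its return convention are hypothetical, and the real code aborts rather than returning a string.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Value taken from the commit message.
constexpr std::size_t OP_MAX_EVENTS = 24;

// Hypothetical helper: returns an empty string when the event count is
// acceptable, otherwise the error message to display before aborting.
std::string check_event_count(const std::vector<std::string>& events) {
    if (events.size() <= OP_MAX_EVENTS)
        return std::string();
    std::ostringstream msg;
    msg << "Number of events specified is greater than allowed maximum of "
        << OP_MAX_EVENTS;
    return msg.str();
}
```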

Signed-off-by: Maynard Johnson <>

2013-03-07 15:04:46
[a1f2b6] by Maynard Johnson

Make convertPerfData procedure more robust

The operf_read::convertPerfData function reads sample data
in perf_events format from either the temporary
file or a pipe, depending on whether or not operf is run
with the --lazy-conversion option. This patch makes the
reading/conversion process more robust so that if bad
data is found in the file or pipe, the process will display
helpful messages and end gracefully.

This patch also makes some other minor cleanups, correcting
some misspellings, etc.

Signed-off-by: Maynard Johnson <>

2013-03-05 18:05:59
[79a183] by Maynard Johnson

The configure check to determine whether we should use libpfm or not
is intended only for the ppc64 architecture, but was incorrectly
hitting on the ppc32 architecture, too. Not only that, but it was using
'uname' which is not a good idea in cross-compile situations.

Then, aside from that, we had several instances in the source code
of the following:
#if (defined(__powerpc__) || defined(__powerpc64__))
which incorrectly included the ppc32 architecture also, when it was intended
only for the ppc64 architecture.

This patch fixes both errors.

Signed-off-by: Maynard Johnson <>

2013-02-27 21:41:14
[7e5e18] by Maynard Johnson

operf does not properly sample child threads for already-running app

Example: When passing the 'java' command directly to operf, samples are
collected for all of the threads created by the JVM. However, if the
Java app is already running when the user starts operf with either
'--pid' or '--system-wide' option, zero samples are collected on the
child threads of the JVM. Note: The user program that is JITed by the
JVM is executed by a child thread.

This patch addresses the problem by:
- Keeping a list of child processes
- Synthesizing PERF_RECORD_COMM events for the main JVM process and all
the child processes
- Calling perf_event_open for the main JVM process and all child processes

These changes entailed some fairly major restructuring of some functions
and data structures of the operf_record class.
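
The commit message does not say how the list of children is built. One possible approach on Linux, shown here purely as an assumption, is to read the numeric entries under /proc/<pid>/task, each of which names a thread of the process. The function list_task_ids is hypothetical.

```cpp
#include <dirent.h>

#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// Returns the numeric directory entries found in task_dir, for example
// "/proc/1234/task". Returns an empty list if the directory cannot be
// opened (process gone, no permission, or not a Linux system).
std::vector<int> list_task_ids(const std::string& task_dir) {
    std::vector<int> ids;
    DIR* dir = opendir(task_dir.c_str());
    if (!dir)
        return ids;
    while (const dirent* entry = readdir(dir)) {
        const std::string name = entry->d_name;
        bool numeric = !name.empty();
        for (unsigned char c : name) {
            if (!std::isdigit(c))
                numeric = false;  // skips "." and ".."
        }
        if (numeric)
            ids.push_back(std::atoi(name.c_str()));
    }
    closedir(dir);
    return ids;
}
```

Each returned ID could then be passed to perf_event_open, in line with the third step listed above.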

Signed-off-by: Maynard Johnson <>

2013-02-21 17:09:10
[81ccb4] by Maynard Johnson

operf does not run opjitconv if --pid or --system-wide used

To stop operf when either '--pid' or '--system-wide' option is used, the user
must do a ctrl-C (or 'kill -SIGINT <operf_pid>'). If the user has not passed
'--lazy-conversion', the operf.cpp:convert_sample_data function is run as a
child process that does not have a SIGINT handler set up for it at the time
it's reading sample data from the pipe (which is being written to by the
operf-record process). The end result is that the operf-read process is
interrupted and stopped by the unhandled ctrl-C before it gets a chance to
run opjitconv.

Note: Another (minor) side effect of issue #1 above is that there may be
sample data left un-read in the pipe, and the type of app being profiled
(Java or not) is irrelevant.

This patch addresses the problem by cleaning up operf signal handler
procedures, making it clear which handlers are used by parent and which
by children, and then making sure those handlers are set up at the correct
time. I also found an extraneous unused signal handler defined in
operf_utils.cpp that I removed.
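
For readers unfamiliar with the mechanism, the sketch below shows the general shape of installing a SIGINT handler with sigaction so that a process can note the interrupt and finish its work instead of being killed. It is a generic POSIX example, not the operf handler code; all names in it are hypothetical.

```cpp
#include <signal.h>

namespace {
// Set from the handler; sig_atomic_t is safe to write in a signal handler.
volatile sig_atomic_t stop_requested = 0;
}

// Handler: only records that ctrl-C was received.
extern "C" void on_sigint(int) {
    stop_requested = 1;
}

// Installs the handler. This must happen before the process starts the
// work that should survive ctrl-C (for example, reading from a pipe).
bool install_sigint_handler() {
    struct sigaction sa = {};
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, nullptr) == 0;
}

// Polled by the main loop to decide when to stop reading and clean up.
bool stop_was_requested() {
    return stop_requested != 0;
}
```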

Signed-off-by: Maynard Johnson <>

2013-02-11 23:14:34
[cd5c7d] by Maynard Johnson

operf: Fix 'Permission denied' error on early perf_events kernels

The new operf tool available with OProfile 0.9.8 uses the perf_event_open
syscall to obtain access to the performance monitor counters and registers.
This syscall is implemented by the Linux Kernel Performance Events Subsystem
(aka "perf_events"). This perf_events subsystem was introduced in kernel
version 2.6.31, and it underwent a lot of changes in the first several versions
thereafter. Apparently, the operf tool, as currently written and operating today,
relies on certain kernel functionality that was introduced later than some
kernels provided with some Linux distributions that supported perf_events in the
very early stages (e.g., SLES 11 SP1). When attempting to profile with operf
(e.g., 'operf ls'), it fails with the message:

Unexpected error running operf: Permission denied
Please use the opcontrol command instead of operf.

The fix for this problem is to pass '-1' for the cpu arg on the
perf_event_open syscall when running on an early perf_events kernel.
Passing '-1' for the cpu arg was a requirement (in most circumstances)
on early perf_events kernels. Later kernels removed this requirement
so perf_event_open could be called for each cpu, even for single-app
profiling by non-root users. This is the standard usage model employed
by operf, which allows us to mmap kernel data space for each cpu, thus
giving a lot more memory for the kernel to record sample data.
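
The choice of cpu argument can be pictured as follows. This is a sketch only: the function cpu_args_for_open and the early_kernel flag are hypothetical, and how operf actually detects an early kernel is not described here. A cpu value of -1 asks perf_event_open to follow the process on whatever CPU it runs.

```cpp
#include <vector>

// Returns the list of cpu arguments to use, one perf_event_open call per
// entry. On an early perf_events kernel a single call with cpu == -1 is
// made; otherwise one call is made for each CPU.
std::vector<int> cpu_args_for_open(bool early_kernel, int num_cpus) {
    if (early_kernel)
        return std::vector<int>(1, -1);
    std::vector<int> cpus;
    for (int cpu = 0; cpu < num_cpus; ++cpu)
        cpus.push_back(cpu);
    return cpus;
}
```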

Signed-off-by: Maynard Johnson <>

2013-02-05 16:12:04
[646eeb] by Maynard Johnson

Fix 32-bit compilation error

Added a 'ULL' suffix to a u64 variable definition so that -m32
build would not fail.
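
The kind of definition affected looks roughly like the sketch below. The constant and helper are hypothetical, since the commit message does not name the variable. With older C++ dialects on a 32-bit target, a literal too large for 'long' may be rejected or warned about unless it carries the ULL suffix.

```cpp
#include <cstdint>

// Hypothetical 64-bit constant. The ULL suffix makes the literal
// unsigned long long regardless of the width of 'long' on the target.
const std::uint64_t kHighWordMask = 0xFFFFFFFF00000000ULL;

// Extracts the upper 32 bits of a 64-bit value.
std::uint64_t high_word(std::uint64_t value) {
    return (value & kHighWordMask) >> 32;
}
```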

Signed-off-by: Maynard Johnson <>

2013-01-24 21:09:29
[b2c445] by Carl Love, pushed by Maynard Johnson

operf, add throttling and multiplexing stats

This patch checks to see if the event was throttled or multiplexed. The
events are recorded by creating a file with the name of the event in the
stats sub directory throttled or multiplexed respectively.

Functions are added to the post processing to print messages if multiplexing
and/or throttling occurred during the data collection.

The patch has been tested on an Intel Core(TM)2 Duo CPU T9400 2.53GHz
The following are excerpts from the script used to do the testing.

The events tested are as follows:

Each of the tests below were run with each of the following frequencies
to test with and without event throttling.



$path/operf --events $event1:$freq:0:1:1 --system-wide

$path/operf -l --events $event1:$freq:0:1:1 --system-wide

$path/operf --events $event1:$freq:0:1:1 --events $event2:$freq:0:1:1 --events
$event4:$freq:0:1:1 --events $event5:$freq:0:1:1 --events $event6:$freq:0:1:1
--events $event7:$freq:0:1:1 --system-wide

$path/operf -l --events $event1:$freq:0:1:1 --events $event2:$freq:0:1:1
--events $event4:$freq:0:1:1 --events $event5:$freq:0:1:1 --events $event6:$freq:0:1:1
--events $event7:$freq:0:1:1 --system-wide

$path/operf --events $event1:$freq:0:1:1 dd bs=16 if=/dev/urandom of=/dev/null
count=500000

$path/operf -l --events $event1:$freq:0:1:1 dd bs=16 if=/dev/urandom of=/dev/null
count=500000

$path/operf --events $event1:$freq:0:1:1 --events $event2:$freq:0:1:1 --events
$event4:$freq:0:1:1 --events $event5:$freq:0:1:1 --events $event6:$freq:0:1:1
--events $event7:$freq:0:1:1 dd bs=16 if=/dev/urandom of=/dev/null count=5000

$path/operf -l --events $event1:$freq:0:1:1 --events $event2:$freq:0:1:1
--events $event4:$freq:0:1:1 --events $event5:$freq:0:1:1 --events $event6:$freq:0:1:1
--events $event7:$freq:0:1:1 dd bs=16 if=/dev/urandom of=/dev/null count

The tests described above were also performed on an IBM POWER7
3000.000000MHz revision : 2.1 with the following events.


And the two sampling frequencies:


Signed-off-by: Carl Love <>

2013-01-23 15:19:47
[b655cc] by Marcin Juszkiewicz, pushed by Maynard Johnson

Add rmb() definition for AArch64 architecture

Signed-off-by: Marcin Juszkiewicz <>

2013-01-16 15:30:33
[e22684] by Carl Love, pushed by Maynard Johnson

Oprofile operf: Fix the code to strip the _GRP## from the event name

The current code uses the call "strstr(, "_GRP")" to find the
substring for the group number at the end of the POWER events. The
strstr() function finds the first occurrence of the substring processing
from left to right. This will find the string "_GRP_" in the name of
the event rather than the intended _GRP## at the end of the string. For
example the event name "PM_GRP_CMPL_GRP174" is currently changed to "PM"
instead of "PM_GRP_CMPL". This patch makes a change in the calculation
for the strncpy() call to use the function rfind("_GRP") to return the index
of where the last instance of the substring is found. Basically the call
finds the first occurrence of the substring by searching from right to
left.
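
The difference between the two searches can be seen in a small sketch. The function strip_group_suffix is hypothetical and uses std::string::substr instead of the strncpy() call mentioned above. Note that this simplified version does not verify that digits follow "_GRP", so a name with no group suffix that still contains "_GRP" would be shortened as well.

```cpp
#include <string>

// Removes a trailing "_GRP<number>" style suffix by locating the LAST
// occurrence of "_GRP". Names without "_GRP" are returned unchanged.
std::string strip_group_suffix(const std::string& event_name) {
    const std::string::size_type pos = event_name.rfind("_GRP");
    if (pos == std::string::npos)
        return event_name;
    return event_name.substr(0, pos);
}
```

With this approach, "PM_GRP_CMPL_GRP174" becomes "PM_GRP_CMPL", whereas a left-to-right search would stop at the first "_GRP" and yield "PM".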

Signed-off-by: Carl Love <>

2012-12-06 15:23:25
[a8d9ee] by Ross Lagerwall, pushed by Maynard Johnson

Don't close uninitialized file descriptor

Found with Valgrind.

Signed-off-by: Ross Lagerwall <>

2012-11-30 14:59:08
[88989b] by Vineet Gupta, pushed by Maynard Johnson

Add support for ARC architecture to operf

Signed-off-by: Vineet Gupta <>

2012-11-26 21:13:49
[dbe24f] by Maynard Johnson

Handle early perf_events kernel without PERF_RECORD_MISC_GUEST* macros

In very early versions of the perf_events kernel subsystem, the
PERF_RECORD_MISC_GUEST* macros (in perf_event.h) were not yet defined. This patch adds
a configure check to determine when it's OK for source code to refer
to those macros.

This patch also does some minor cleanup of the configure script
help and warning messages relating to the --with-kernel option.

Signed-off-by: Maynard Johnson <>

2012-11-19 21:16:37
[866abb] by Andi Kleen, pushed by Maynard Johnson

Add the Haswell client event lists and model numbers

Also added simple support for PEBS events with perf_events
(ignored with the old driver) and include the Haswell PEBS events in the list.
And fixed "any" support.

v2: Regenerate events table with some improvements.
Address review feedback.

Signed-off-by: Andi Kleen <>

2012-11-08 01:14:47
[7e788a] by Maynard Johnson

Revert "Add the Haswell client event lists and model numbers"

This reverts commit 6d48ffa1e51e49ae3d3a5757baa7e2ed0d87d128.

Revert this commit since author info was wrong.

Signed-off-by: Maynard Johnson <>

2012-11-08 01:10:44
[6d48ff] by Maynard Johnson

Add the Haswell client event lists and model numbers
I also added simple support for PEBS events with perf_events
(ignored with the old driver) and include the Haswell PEBS events in the list.
And fixed "any" support.

v2: Regenerate events table with some improvements.
Address review feedback.

Signed-off-by: Andi Kleen <>

2012-11-08 00:57:34