oprofile Log

Commit Date  
[4d808a] by Maynard Johnson

Defaulted named unit mask does not work

A fix to allow named unit masks to be used as the default was recently
pushed upstream (Jun 24: Add support for named default unit masks), but
unfortunately, we all missed the fact that using a named default
unit mask didn't actually work insofar as counting events.
Here's an example on Sandybridge that should use the default named
unit mask "any":

operf -e uops_issued:2000000 ./my_test
and opreport shows "opreport error: No sample file found".

When the user does not specify a unit mask, the profiling tools
(as well as ocount) will use 'ophelp --unit-mask' to determine
what the default unit mask should be. All of the oprofile
tools -- operf, opcontrol, ocount -- expect a numerical value
to be returned. So, in the case of a named default unit mask,
the unit mask name returned by ophelp was not being handled
properly, and the end result was usually "No samples found"
by opreport (or zero event counts by ocount). This patch
fixes this problem.
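
As a rough illustration of the distinction described above, the sketch below shows one way a caller could tell a numeric default unit mask from a named one. The helper name parse_default_unit_mask and the sample strings are hypothetical; this is not the code from the patch.

```python
def parse_default_unit_mask(text):
    """Classify a default unit mask string as numeric or named.

    Hypothetical helper: 'text' stands in for whatever
    'ophelp --unit-mask' printed for the event.
    """
    value = text.strip()
    try:
        # int(..., 0) accepts decimal ("65") and hex ("0x41") forms.
        return ("numeric", int(value, 0))
    except ValueError:
        # Not a number, so treat it as a unit mask name such as "any".
        return ("named", value)


print(parse_default_unit_mask("0x41\n"))  # ('numeric', 65)
print(parse_default_unit_mask("any\n"))   # ('named', 'any')
```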

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-07-24 17:20:03
[2e5055] by Maynard Johnson

Fix unit mask value for EXTRA_NONE

For named unit masks that do not have real extra values
associated with them, the 'extra' field of the unit mask
description structure is set to EXTRA_NONE, which is not
really a valid value as far as the performance monitoring
hardware goes. In such cases, 'ophelp --extra-mask'
was returning EXTRA_NONE to the operf and opcontrol
profilers, which results in no samples being collected,
since it's not a valid mask value. In such cases, ophelp
should return the simple unit mask value. The operf
and opcontrol tools that use 'ophelp --extra-mask' must
be able to differentiate between a simple mask value
and an "extra" value. Anything greater than or equal to
0x40000 is interpreted as a valid "extra" value; otherwise
it's a simple mask value.
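
The threshold rule above can be sketched as follows. The constant and function names are made up for illustration; only the 0x40000 boundary comes from the commit message.

```python
EXTRA_THRESHOLD = 0x40000  # boundary stated in the commit message


def classify_extra_mask(value):
    """Return 'extra' for values at or above the threshold, else 'simple'."""
    return "extra" if value >= EXTRA_THRESHOLD else "simple"


print(classify_extra_mask(0x41))     # simple
print(classify_extra_mask(0x40000))  # extra
```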

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-06-24 14:44:07
[6ee980] by Maynard Johnson

Fix Coverity errors found on May 20, 2013 git snapshot

Coverity identified the following errors on scans run from May 7 through
May 20, 2013:

Wrapper object use after free,Memory - illegal accesses,/agents/jvmpi/jvmpi_oprofile.cpp,compiled_method_load(JVMPI_Event *)
Unchecked return value,Error handling issues,/daemon/opd_mangling.c,opd_open_sample_file
Dereference after null check,Null pointer dereferences,/daemon/opd_sfile.c,sfile_hash
Uninitialized scalar field,Uninitialized members,/gui/oprof_start_config.cpp,config_setting::config_setting()
Division or modulo by zero,Integer handling issues,/libdb/db_stat.c,odb_hash_stat
Resource leak,Resource leaks,/libop/op_cpu_type.c,_auxv_fetch
Resource leak,Resource leaks,/libop/op_cpu_type.c,fetch_at_hw_platform
Negative array index read,Memory - illegal accesses,/libop/op_events.c,_is_um_valid_bitmask
Write to pointer after free,Memory - corruptions,/libop/op_events.c,read_events
Read from pointer after free,Memory - illegal accesses,/libop/op_events.c,_is_um_valid_bitmask
Dereference after null check,Null pointer dereferences,/libop/op_mangle.c,op_mangle_filename
Dereference after null check,Null pointer dereferences,/libop/op_mangle.c,op_mangle_filename
Time of check time of use,Security best practices violations,/libopagent/opagent.c,op_open_agent
Improper use of negative value,Integer handling issues,/libperf_events/operf_counter.cpp,operf_record::setup()
Double free,Memory - corruptions,/libperf_events/operf_counter.cpp,operf_record::setup()
Uninitialized pointer read,Memory - illegal accesses,/libperf_events/operf_counter.cpp,<unnamed>::_get_perf_event_from_file(mmap_info &)
Unchecked return value,Error handling issues,/libperf_events/operf_mangling.cpp,"operf_open_sample_file(odb_t *, operf_sfile *, operf_sfile *, int, int)"
Using invalid iterator,API usage errors,/libperf_events/operf_process_info.cpp,operf_process_info::try_disassociate_from_parent(char *)
Non-array delete for scalars,Memory - illegal accesses,/libregex/op_regex.cpp,"<unnamed>::op_regerror(int, const re_pattern_buffer &)"
Resource leak,Resource leaks,/libutil++/op_bfd.cpp,"op_bfd::op_bfd(const std::basic_string<char, std::char_traits<char>, std::allocator<char>>&, const string_filter &, const extra_images &, bool &)"
Explicit null dereferenced,Null pointer dereferences,/opjitconv/create_bfd.c,fill_symtab
Resource leak,Resource leaks,/opjitconv/opjitconv.c,_cleanup_jitdumps
Use of untrusted string value,Insecure data handling,/opjitconv/opjitconv.c,main
Resource leak,Resource leaks,/pe_profiling/operf.cpp,_get_cpu_for_perf_events_cap()
Dereference null return value,Null pointer dereferences,/pe_profiling/operf.cpp,_process_session_dir()
Incorrect deallocator used,API usage errors,/pe_profiling/operf.cpp,_process_events_list()


This patch fixes those errors.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-05-28 13:19:25
[068c03] by Maynard Johnson

Add support for Intel Netburst (e.g., Pentium P4) to operf

The "legacy" oprofile kernel driver has special "p4" handling. There's
a map of event codes to ESCR/CCCR values. Unfortunately, the P4 event
codes (stored in events/i386/p4/events) that are used by the oprofile
kernel driver don't match what the perf_events kernel code expects. This
patch adds some p4-specific event code handling to operf in order to
generate the correct encoding to pass to the perf_events kernel subsystem.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-05-22 13:27:15
[6dc5d3] by Maynard Johnson

Add support for architected events for IBM ppc64 architecture

The Power ISA 2.07 was published at http://power.org/. This ISA
formally defines base performance monitoring facilities which
must be provided by any processor implementation of the ISA.
Specific implementations may provide additional features, but
must include the standard architected features.

This patch creates a generic ppc64 cpu type called
"ppc64/architected_events_v1" that has a list of events which
are defined in the ISA 2.07 performance monitoring unit
architecture section. This new generic type will only be
supported by operf. It will *not* be supported by the legacy
oprofile kernel driver and opcontrol-based profiler. This
new cpu type can be used in situations where oprofile is running
on a kernel that does not have full native support for an
ISA 2.07-based ppc64 processor, but does have the base level
architected support. OProfile userspace code detects such a
situation by inspecting the auxiliary vector of the operf program
and comparing AT_PLATFORM and AT_BASE_PLATFORM values (defined
in elf.h).

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-05-17 13:45:53
[74abfb] by Maynard Johnson

Fix Coverity issues identified against oprofile 0.9.8 release

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-05-15 18:14:43
[3aa2fe] by Maynard Johnson

oprofile pp tools should print messages about lost samples

When operf completes running, it collects statistics about
lost samples, records them in the operf.log, and prints
a warning message if the number of lost samples exceeds
a pre-defined percentage (.01%) of the total number of
samples. However, when opreport or any of the other oprofile
post-processing tools are run, the statistics are not
readily available (only in the operf.log), so there is no
warning about lost samples. This patch persists those
statistics to files in the <session-dir>/samples/current/stats
dir, allowing the pp tools to access them later. These
stats files are also copied by oparchive, so even archived
profile data will have the statistics available.
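
The warning condition described above might look roughly like this. It is a sketch only: the function name is hypothetical, and whether the total includes the lost samples is an assumption the commit message does not settle.

```python
LOST_SAMPLE_WARN_PERCENT = 0.01  # the ".01%" threshold from the commit message


def should_warn_lost_samples(lost, total):
    """Return True when lost samples exceed the threshold percentage of total."""
    if total <= 0:
        return False  # assumption: with no samples there is nothing to compare
    return (lost / total) * 100.0 > LOST_SAMPLE_WARN_PERCENT


print(should_warn_lost_samples(5, 1_000_000))    # False (0.0005%)
print(should_warn_lost_samples(500, 1_000_000))  # True  (0.05%)
```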

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-04-19 12:26:32
[7cf28f] by Carl Love, pushed by Maynard Johnson

operf, remove support to report multiplexing.

The multiplexing reporting doesn't work correctly when profiling
multi-threaded apps or apps that do fork/exec. The detection of
multiplexing doesn't work when processes migrate between CPUs.
The event is enabled on all CPUs. The running time stops when the
event migrates to another CPU; however, the enabled time does not stop, as it
is enabled on each CPU. The issue is that the running time across CPUs
doesn't add up to the enabled time because the running time is not
increasing while the process is being migrated. This results in the running
time being less than the enabled time. There is no way to detect whether the
reason the running time is less than the enabled time was due to migration
or due to multiplexing.

The support is being removed so that the operf tool is not incorrectly
flagging events for multiplexing.
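
To illustrate the ambiguity, the sketch below compares running time to enabled time for two made-up readings. The names time_enabled and time_running follow the perf_events read-format fields; the numbers are hypothetical and the code is not taken from operf.

```python
def running_ratio(time_enabled, time_running):
    """Fraction of the enabled time during which the event was counting."""
    if time_enabled == 0:
        return 0.0
    return time_running / time_enabled


# Hypothetical readings (arbitrary time units) for two different causes.
multiplexed = running_ratio(time_enabled=1000, time_running=500)
migrated = running_ratio(time_enabled=1000, time_running=500)

# Both ratios are below 1.0 and identical, so the ratio alone cannot
# say whether the gap came from multiplexing or from CPU migration.
print(multiplexed < 1.0, migrated < 1.0, multiplexed == migrated)
```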

Signed-off-by: Carl Love <cel@us.ibm.com>

2013-03-12 15:52:15
[04ee55] by Maynard Johnson

Fix seg fault due to incorrect array size initialization

The operf_sfile.cpp:create_sfile function creates and initializes the
array of 'struct operf_sfile' objects used for writing sample data to
oprofile formatted sample files. This function creates an array of
these objects, but was incorrectly creating an array of size 'OP_MAX_COUNTER'.
Since operf can multiplex events, we aren't limited to OP_MAX_COUNTER
events to profile simultaneously, so if the user specifies more than
OP_MAX_COUNTER events, the code that accesses this array was going
off the end and sometimes seg faulting.

This patch fixes the problem by defining OP_MAX_EVENTS to be '24' and
using that as the array size. Furthermore, if the user tries to specify
more than 24 events to profile, an error message is displayed:
Number of events specified is greater than allowed maximum of <n>
and operf aborts.
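
A minimal sketch of the bounds check described above, using the limit and error text from the commit message. The function name and the event names are hypothetical, and raising an exception stands in for operf aborting.

```python
OP_MAX_EVENTS = 24  # value given in the commit message


def check_event_count(events):
    """Raise ValueError if more events are requested than OP_MAX_EVENTS."""
    if len(events) > OP_MAX_EVENTS:
        raise ValueError(
            "Number of events specified is greater than allowed maximum of %d"
            % OP_MAX_EVENTS
        )
    return events


try:
    check_event_count(["EVENT_%d" % i for i in range(25)])  # hypothetical names
except ValueError as err:
    print(err)
```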

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-03-07 15:04:46
[a1f2b6] by Maynard Johnson

Make convertPerfData procedure more robust

The operf_read::convertPerfData function reads sample data
in perf_events format from either the temporary operf.data
file or a pipe, depending on whether or not operf is run
with the --lazy-conversion option. This patch makes the
reading/conversion process more robust so that if bad
data is found in the file or pipe, the process will display
helpful messages and end gracefully.

This patch also makes some other minor cleanups, correcting
some misspellings, etc.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-03-05 18:05:59
[79a183] by Maynard Johnson

The configure check to determine whether we should use libpfm or not
is intended only for the ppc64 architecture, but was incorrectly
hitting on the ppc32 architecture, too. Not only that, but it was using
'uname', which is not a good idea in cross-compile situations.

Then, aside from that, we had several instances in the source code
of the following:
#if (defined(__powerpc__) || defined(__powerpc64__))
which incorrectly included the ppc32 architecture as well, when it was intended
only for the ppc64 architecture.

This patch fixes both errors.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-02-27 21:41:14
[7e5e18] by Maynard Johnson

operf does not properly sample child threads for already-running app

Example: When passing the 'java' command directly to operf, samples are
collected for all of the threads created by the JVM. However, if the
Java app is already running when the user starts operf with either
'--pid' or '--system-wide' option, zero samples are collected on the
child threads of the JVM. Note: The user program that is JITed by the
JVM is executed by a child thread.

This patch addresses the problem by:
- Keeping a list of child processes
- Synthesizing PERF_RECORD_COMM events for the main JVM process and all
the child processes
- Calling perf_event_open for the main JVM process and all child processes

These changes entailed some fairly major restructuring of some functions
and data structures of the operf_record class.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-02-21 17:09:10
[81ccb4] by Maynard Johnson

operf does not run opjitconv if --pid or --system-wide used

To stop operf when either '--pid' or '--system-wide' option is used, the user
must do a ctrl-C (or 'kill -SIGINT <operf_pid>'). If the user has not passed
'--lazy-conversion', the operf.cpp:convert_sample_data function is run as a
child process that does not have a SIGINT handler set up for it at the time
it's reading sample data from the pipe (which is being written to by the
operf-record process). The end result is that the operf-read process is
interrupted and stopped by the unhandled ctrl-C before it gets a chance to
run opjitconv.

Note: Another (minor) side effect of issue #1 above is that there may be
sample data left un-read in the pipe, and the type of app being profiled
(Java or not) is irrelevant.

This patch addresses the problem by cleaning up operf signal handler
procedures, making it clear which handlers are used by parent and which
by children, and then making sure those handlers are set up at the correct
time. I also found an extraneous unused signal handler defined in
operf_utils.cpp that I removed.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-02-11 23:14:34
[366ca2] by Suravee Suthikulpanit

Fix build issue with gcc-4.7.2 due to fgets

gcc complains about ignoring the return value of fgets.
Because oprofile is built with -Werror, the build failed.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

2013-02-08 16:23:06
[cd5c7d] by Maynard Johnson

operf: Fix 'Permission denied' error on early perf_events kernels

The new operf tool available with OProfile 0.9.8 uses the perf_event_open
syscall to obtain access to the performance monitor counters and registers.
This syscall is implemented by the Linux Kernel Performance Events Subsystem
(aka "perf_events"). This perf_events subsystem was introduced in kernel
version 2.6.31, and it underwent a lot of changes in the first several versions
thereafter. Apparently, the operf tool, as currently written and operating today,
relies on certain kernel functionality that was introduced later than some
kernels provided with some Linux distributions that supported perf_events in the
very early stages (e.g., SLES 11 SP1). When attempting to profile with operf
(e.g., 'operf ls'), it fails with the message:

Unexpected error running operf: Permission denied
Please use the opcontrol command instead of operf.

The fix for this problem is to pass '-1' for the cpu arg on the
perf_event_open syscall when running on an early perf_events kernel.
Passing '-1' for the cpu arg was a requirement (in most circumstances)
on early perf_events kernels. Later kernels removed this requirement
so perf_event_open could be called for each cpu, even for single-app
profiling by non-root users. This is the standard usage model employed
by operf, which allows us to mmap kernel data space for each cpu, thus
giving a lot more memory for the kernel to record sample data.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-02-05 16:12:04
[cd8be5] by Maynard Johnson

Fix opreport header info on unit mask when operf is run without a UM specified

When a user runs operf and profiles with an event that needs a unit mask value,
the default unit mask value will be used if no UM value is specified. When
opreport prints its header information, you get something like the following:

CPU: Intel Sandy Bridge microarchitecture, speed 2.401e+06 MHz (estimated)
Counted int_misc events (Instruction decoder events) with a unit mask of 0x00
(rat_stall_cycles Cycles Resource Allocation Table (RAT) external stall is
sent to Instruction Decode Queue (IDQ) for this thread.) count 2000000

Notice that the unit mask value '0x00' is shown, even though the code actually
selects the default unit value of 0x40 for the int_misc event.

This patch fixes this issue. It also partially addresses the issue with
named unit mask showing up as '0x00' in opreport, too (see oprofile bug
It's not a very good solution to the named unit mask issue, but it's better
than nothing until we can come up with a final (better) solution.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-01-28 22:20:46
[b2c445] by Carl Love, pushed by Maynard Johnson

operf, add throttling and multiplexing stats

This patch checks to see if the event was throttled or multiplexed. The
events are recorded by creating a file with the name of the event in the
stats sub directory throttled or multiplexed respectively.

Functions are added to the post processing to print messages if multiplexing
and/or throttling occurred during the data collection.

The patch has been tested on an Intel Core(TM)2 Duo CPU T9400 2.53GHz
The following are excerpts from the script used to do the testing.

The events tested are as follows:

Each of the tests below was run with each of the following frequencies
to test with and without event throttling.



$path/operf --events $event1:$freq:0:1:1 --system-wide

$path/operf -l --events $event1:$freq:0:1:1 --system-wide

$path/operf --events $event1:$freq:0:1:1 --events $event2:$freq:0:1:1 \
    --events $event4:$freq:0:1:1 --events $event5:$freq:0:1:1 \
    --events $event6:$freq:0:1:1 --events $event7:$freq:0:1:1 --system-wide

$path/operf -l --events $event1:$freq:0:1:1 --events $event2:$freq:0:1:1 \
    --events $event4:$freq:0:1:1 --events $event5:$freq:0:1:1 \
    --events $event6:$freq:0:1:1 --events $event7:$freq:0:1:1 --system-wide

$path/operf --events $event1:$freq:0:1:1 dd bs=16 if=/dev/urandom of=/dev/null count=500000

$path/operf -l --events $event1:$freq:0:1:1 dd bs=16 if=/dev/urandom of=/dev/null count=500000

$path/operf --events $event1:$freq:0:1:1 --events $event2:$freq:0:1:1 \
    --events $event4:$freq:0:1:1 --events $event5:$freq:0:1:1 \
    --events $event6:$freq:0:1:1 --events $event7:$freq:0:1:1 \
    dd bs=16 if=/dev/urandom of=/dev/null count=5000

$path/operf -l --events $event1:$freq:0:1:1 --events $event2:$freq:0:1:1 \
    --events $event4:$freq:0:1:1 --events $event5:$freq:0:1:1 \
    --events $event6:$freq:0:1:1 --events $event7:$freq:0:1:1 \
    dd bs=16 if=/dev/urandom of=/dev/null count

The tests described above were also performed on an IBM POWER7
3000.000000MHz revision : 2.1 with the following events.


And the two sampling frequencies:


Signed-off-by: Carl Love <cel@us.ibm.com>

2013-01-23 15:19:47
[d3a2c6] by Maynard Johnson

Fix compile warnings/errors with gcc 4.7.3

On some distros, the struct poptOption in /usr/include/popt.h
has the argInfo field defined as int, but on other distros,
that field is defined as unsigned int. In libopt++/popt_options.cpp,
the option_base::option_base constructor passes an unsigned int
popt_flags argument that's intended to be assigned to the
argInfo field. With gcc 4.7.1, the following warning(error) occurs
on systems where the argInfo field is defined as an int:

popt_options.cpp: In constructor `popt::option_base::option_base
(const char*, char, const char*, const char*, void*, unsigned int)':
popt_options.cpp:255:51: error: narrowing conversion of `popt_flags'
from `unsigned int' to `int' inside { } is ill-formed in C++11 [-Werror=narrowing]
cc1plus: all warnings being treated as errors

The fix for this problem is to cast the popt_flags to the appropriate
type using 'typeof(opt.argInfo)'.

The second compile error (in pe_profiling/operf.cpp) is happening
because the variable 'value' is assigned, but not used after that.
This is dead code that should be removed.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-01-21 16:38:14
[360280] by Maynard Johnson

Allow ppc64 events to be specified with or without _GRP<n> suffix

All events for IBM PowerPC server processors (except CYCLES) have
a _GRP<n> suffix. This is because the legacy opcontrol profiler
can only profile events in the same group (i.e., having the same
_GRP<n> suffix). But operf has no such restriction because it
can multiplex events; thus, we should allow the user to pass
event names without the _GRP<n> suffix.
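
One way to accept either spelling is to compare the user's name against the table name both with and without the suffix. This is a sketch under that assumption; the helper names and the sample event name are invented for illustration and do not come from the patch.

```python
import re

# Matches a trailing _GRP<n> suffix, where <n> is one or more digits.
_GRP_SUFFIX = re.compile(r"_GRP\d+$")


def base_event_name(name):
    """Strip a trailing _GRP<n> suffix, if present."""
    return _GRP_SUFFIX.sub("", name)


def event_matches(user_name, table_name):
    """True if user_name equals table_name with or without its _GRP<n> suffix."""
    return user_name == table_name or user_name == base_event_name(table_name)


# "PM_EXAMPLE_EVENT_GRP12" is a made-up event name.
print(event_matches("PM_EXAMPLE_EVENT", "PM_EXAMPLE_EVENT_GRP12"))        # True
print(event_matches("PM_EXAMPLE_EVENT_GRP12", "PM_EXAMPLE_EVENT_GRP12"))  # True
print(event_matches("CYCLES", "CYCLES"))                                  # True
```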

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-01-11 19:29:57
[55d11f] by Maynard Johnson

Fix unused variable compile error on non-x86 type architectures

Commit e1ed25f091af2128497f8d8f78e27e0330155094 that was made on
Jan 2 causes a compile error on non-x86 architectures. This
patch fixes that error.

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-01-09 18:19:33
[e1ed25] by Maynard Johnson

Fix operf default unit mask handling

This patch addresses the problem reported to the oprofile-list having
subject heading of "other events than CPU_CLK_UNHALTED not working"

The operf tool mis-handles event specifications where the
unit mask is not specified, usually resulting in some bogus
config value that's passed to the perf_event_open call.
The end result is usually that opreport finds no samples.
In some cases, samples may be recorded, but they would
not be for the correct unit mask.

In lieu of applying this patch, the workaround for this bug is
to specify the default unit mask, e.g.,
operf -e LLC_MISSES:6000:0x41 <my-app>
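
To make the two cases concrete, the sketch below splits an event specification of the form name:count[:unitmask] and reports whether a unit mask was given. The parser is hypothetical and much simpler than operf's own; it only covers the three fields used in the workaround above.

```python
def parse_event_spec(spec):
    """Split 'name:count[:unitmask]' into its parts.

    Returns (name, count, unit_mask); unit_mask is None when omitted,
    which is the case the commit message says operf mis-handled.
    """
    fields = spec.split(":")
    if len(fields) < 2:
        raise ValueError("expected at least name:count, got %r" % spec)
    name, count = fields[0], int(fields[1])
    # Keep the unit mask as text, since it may be a number or a name.
    unit_mask = fields[2] if len(fields) > 2 and fields[2] else None
    return name, count, unit_mask


print(parse_event_spec("LLC_MISSES:6000:0x41"))  # ('LLC_MISSES', 6000, '0x41')
print(parse_event_spec("LLC_MISSES:6000"))       # ('LLC_MISSES', 6000, None)
```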

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2013-01-02 17:32:00
[883c7c] by Maynard Johnson

Fix operf handling of <cur-dir>/app when "." is in PATH

For a description of this problem, see oprofile bug #3566769:

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2012-12-21 17:03:20
[e2e1e1] by Maynard Johnson

Fix bug in finding command in PATH

If operf has to search PATH to obtain the full pathname of the
passed app (or command) and a segment in PATH does not exist,
the following error will occur:

<cmd> cannot be found in your PATH.

The workaround is to specify the full pathname of the command
when passing it to operf or remove the non-existent segment in
the PATH environment variable.

This patch fixes this problem in operf.
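
A PATH search that tolerates missing segments could be sketched as below. The function name is hypothetical, and empty segments are simply skipped here for brevity; this is not operf's implementation.

```python
import os


def find_in_path(cmd, path):
    """Return the first executable match for cmd in the PATH string, or None.

    Segments that do not exist are skipped rather than treated as fatal.
    """
    for segment in path.split(os.pathsep):
        if not segment or not os.path.isdir(segment):
            continue  # missing or empty segment: keep searching
        candidate = os.path.join(segment, cmd)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# The first segment is assumed not to exist; the search continues past it.
print(find_in_path("sh", os.pathsep.join(["/no/such/dir", "/bin"])))
```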

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2012-11-21 19:47:09
[69656e] by Ross Lagerwall, pushed by Maynard Johnson

Simplify argument handling by not converting to string and back again.
Instead, create a new array of strings referencing each argument.
This makes arguments with spaces in them get passed to the command
properly rather than being split up.
E.g. operf echo a b "c d"
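
The difference between the two approaches can be shown with a small sketch. The function names are made up, and the exact way the old code split the string is an assumption; whitespace splitting is used here only to show the effect.

```python
def rebuild_by_joining(args):
    """Old-style approach: join into one string, then split on whitespace."""
    return " ".join(args).split()


def rebuild_by_reference(args):
    """New-style approach: keep one list entry per original argument."""
    return list(args)


argv = ["echo", "a", "b", "c d"]
print(rebuild_by_joining(argv))    # ['echo', 'a', 'b', 'c', 'd']
print(rebuild_by_reference(argv))  # ['echo', 'a', 'b', 'c d']
```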

Signed-off-by: Ross Lagerwall <rosslagerwall@gmail.com>

2012-11-19 22:31:05
[ad3e04] by Maynard Johnson

Fix up problems found by another run of coverity

Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>

2012-08-08 19:46:05