Draft version 4
General FIXME: call graphs - for a later spec ?

1. Basic principles

There are five basic things a user needs to specify to the post-profile tools :

a) common options such as --help, --version
b) specifications of which sample files to operate upon
c) processing options (what gets included in the workings) 
d) output and formatting options (what gets output, and how)
e) "special" options 

Of these, obviously a), and hopefully b) and some of c) can be universally
shared between the interfaces.

2. Common basic options

2.1 --help (-?)

	As generated by popt.

2.2 --usage

	As generated by popt.

2.3 --version (-v) 

	As currently, "opsummary: oprofile 0.3cvs compiled on May 15 2002 16:43:40"

2.4 --verbose (-V)

	Maybe we can have more fine-grained --debug later, but for now this will do

2.5 --no-verbose

	Default, for symmetry only 

2.6 --no-demangle (n/a)

	We demangle by default. This is for avoiding demangler bugs and the like.

2.7 --demangle

	For symmetry only 

2.8 FIXME: --smart-demangle - must make a decision

3. Profile specifications

Each tool needs at least one sample file + associated binary image to operate on,
and some tools (opdiff, opsummary) can have more than one. It would
be very useful for the user to be able to conveniently specify a subset of profiles,
and we can use the same method for every tool.

3.1 Each profile is a tuple of :

a) Session name
b) Binary
c) App binary (for shared libraries, e.g. /bin/ls using /lib/libc.so)
d) Event name
e) Event count
f) Group ID (tgid)
g) Process ID (pid)
h) CPU nr.
i) the actual binary to bfd_open
j) the actual samples file to use 

3.2 Some of these may be "null" parameters, namely c),f),g),h),i),j) (and possibly e) depending
on what we do about multiplexing).

i) is a special case since it is not part of the profile as such. By default
it is derived from b), but allowing its specification is needed for opdiff
of two differently compiled binary + profile pairs. The same goes for j).

So if we provide a way to specify some set of profiles via the command line,
we need to support each of these. My first idea is something like this :

opreport session:run1 image:/usr/bin/oprofiled

Here's a complete list of tags :

3.3	sample-file: <filename>
		An actual sample file. If this is given, no other tags are allowed,
		with the exception of binary:

3.4	binary: <filename>
		The binary to use. This is incompatible with the image: parameter,
		and may only be specified as part of a single-profile specification.
		It may be used ONLY with sample-file:

3.5	session: <sessionlist>
		A comma-separated list of session names to resolve in. Absence of this
		tag, unlike all others, means "the current session", equivalent to
		specifying "session: current"

3.6	session-exclude: <sessionlist>
		A comma list of sessions to exclude ...

3.7	image: <imagelist>
		Comma list of image names to resolve. Each entry may be a relative
		path, a wordexp()-style name, or a full path, e.g.

		"image:" is the default tag, allowing the old-style "opreport /usr/bin/oprofiled"

3.8	image-exclude: <imagelist> 
		Comma list of images to exclude from the final list

3.9	lib-image: <imagelist>
		Comma list of library images to consider. Thus when looking for glibc samples
		due to oprofiled, opreport /usr/bin/oprofiled 'lib-image:*libc*' will show only
		those. If not specified, these samples are NOT included by default, EXCEPT
		when automerging applies because the library itself is given as the image,
		e.g. image:*libc*.

3.10	lib-image-exclude: <imagelist>
		Similar to image-exclude ...

3.11	event: event
		Specify a particular event type e.g. CPU_CLK_UNHALTED

3.12	count: count
		Specify the count for that event e.g. 30000

3.13	unit-mask: mask
		Specify the unit mask for that event. Encoded as numerical value.

3.14	cpu: <cpulist>
		Comma-separated list of cpu numbers to consider

3.15	pid: <pidlist>
		Comma list of pids to consider. Each pid defines a thread group
		including all child threads (task->tgid)

3.16	tid: <tidlist>
		A specific thread (task->pid)

3.17 If a tag is not specified, then it will match all values (with the exception of
     session: as noted).
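The tag rules above can be sketched in Python. This is only an illustration of the rules in 3.3 to 3.17, not the real tools: the function name parse_profile_spec, the tag-to-value dict, and the rejection of a tag given twice are all assumptions made for the example.

```python
# Tags listed in 3.3 - 3.16.
KNOWN_TAGS = {
    "sample-file", "binary", "session", "session-exclude",
    "image", "image-exclude", "lib-image", "lib-image-exclude",
    "event", "count", "unit-mask", "cpu", "pid", "tid",
}


def parse_profile_spec(args):
    """Turn e.g. ['session:run1', '/usr/bin/oprofiled'] into a tag -> value dict.

    Hypothetical helper: list values are kept as raw strings here.
    """
    spec = {}
    for arg in args:
        tag, sep, value = arg.partition(":")
        if not sep or tag not in KNOWN_TAGS:
            # No recognised tag, so fall back to the default "image:" tag.
            tag, value = "image", arg
        if tag in spec:
            # Assumption: the draft does not say what a repeated tag means.
            raise ValueError(f"tag given more than once: {tag}")
        spec[tag] = value

    if "sample-file" in spec:
        # 3.3: no other tags are allowed, with the exception of binary:
        if set(spec) - {"sample-file", "binary"}:
            raise ValueError("sample-file: may only be combined with binary:")
    elif "binary" in spec:
        # 3.4: binary: may be used ONLY with sample-file:
        raise ValueError("binary: may be used only with sample-file:")
    else:
        # 3.5: absence of session: means the current session.
        spec.setdefault("session", "current")
    return spec
```

With this sketch, the old-style "opreport /usr/bin/oprofiled" resolves to image:/usr/bin/oprofiled in the current session, and every tag left out of the dict matches all values.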
Note: we can use a common method for comma-separated lists :

list ::= listitem ',' list | listitem
listitem ::= listchar listchar*
listchar ::= '\,' | '*' | '?' | alphanum 

i.e. we allow an escape for entries with commas in them, and we allow
the wildcards '*' and '?'.

We can use cpu,pid,event,count,unit-mask formats as parameters
to the daemon too. But we must allow multiple event,count pairs
so the syntax must be different.
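A minimal Python sketch of the common list method follows. The names split_list and included are hypothetical (included mirrors the helper used in the resolution pseudocode), and matching is delegated to the standard fnmatch module, which also understands '[...]' classes that the grammar above does not mention.

```python
from fnmatch import fnmatchcase


def split_list(text):
    """Split a list on unescaped commas; a backslash-comma is a literal comma."""
    items, current, i = [], [], 0
    while i < len(text):
        if text.startswith("\\,", i):
            current.append(",")          # escaped comma stays in the item
            i += 2
        elif text[i] == ",":
            items.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    items.append("".join(current))
    if any(not item for item in items):
        # The grammar requires at least one listchar per listitem.
        raise ValueError("empty list item")
    return items


def included(value, patterns):
    """True if value matches any pattern, honouring '*' and '?'."""
    return any(fnmatchcase(value, pattern) for pattern in patterns)
```

For example, split_list("0,1,2") gives the three cpu numbers as strings, and included("/lib/libc.so", split_list("*libc*")) is true.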

3.18 FIXME: how to find 2.5 modules ? Also, -p/-P equivalents

Sample file mangling

We need to encode a number of the above things in the filename.

3.19 The proposed format is


The latter is to be used when using --separate=lib|kernel

(in decimal where relevant) 
For example,


3.20 All symbolic links in the paths must be fully resolved.

3.21 The parts tgid,pid,cpu may, instead of a number, have the
value "all", for when the profiling method does not split samples on that part

3.22 The current session is defined as "current", all other sessions are free-form

Resolution algorithm

session=as given or default to "current"
all others set to "*" (match on anything) by default

3.33 for each (file in session(s)) {
	next if sample-file && sample-file != file
	next if image && !included(file.image, imagelist)
	next if image-exclude && included(file.image, image-exclude)
	next if lib-image && !included(file.lib-image, lib-imagelist)
	next if lib-image-exclude && included(file.lib-image, lib-image-exclude)
	next if event && file.event != event
	next if count && file.count != count
	next if unit-mask && file.unit_mask != unit-mask
	next if cpu && !included(file.cpu, cpulist)
	next if pid && !included(file.pid, pidlist)
	next if tid && !included(file.tid, tidlist)
	add file to the resolved list
}

3.23 if (one_profile_allowed_only) { 
	// merge on tgid,pid,cpu
	// and also lib-image, iff the lib-image
	// is specified as imagename
	if (nr. profiles > 1)
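The filtering loop of the resolution algorithm can be sketched in Python as below. The SampleFile record, the INCLUDE/EXCLUDE tables and the function names are assumptions made for the example; all fields are kept as strings so that "all" is a legal value, sample-file: is left out, and the merge step is not shown.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class SampleFile:
    # Hypothetical record of what a mangled sample file name encodes.
    session: str
    image: str
    lib_image: str
    event: str
    count: str
    unit_mask: str
    cpu: str
    pid: str
    tid: str


# tag -> SampleFile attribute the tag filters on
INCLUDE = {"image": "image", "lib-image": "lib_image", "event": "event",
           "count": "count", "unit-mask": "unit_mask",
           "cpu": "cpu", "pid": "pid", "tid": "tid"}
EXCLUDE = {"image-exclude": "image", "lib-image-exclude": "lib_image"}


def included(value, patterns):
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def resolve(files, spec):
    """spec maps tag -> list of patterns; a missing tag matches anything."""
    sessions = spec.get("session", ["current"])
    kept = []
    for f in files:
        if not included(f.session, sessions):
            continue
        if any(tag in spec and not included(getattr(f, attr), spec[tag])
               for tag, attr in INCLUDE.items()):
            continue
        if any(tag in spec and included(getattr(f, attr), spec[tag])
               for tag, attr in EXCLUDE.items()):
            continue
        kept.append(f)
    return kept
```

Note that event, count and unit-mask go through the same pattern match as the lists here, which is looser than the plain comparison in the pseudocode whenever a wildcard is given.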

4. op_alias

The above is rather tedious syntax, and is likely to stay that way
even with improvements. So we can have a command line tool / config file, e.g. :

4.1 # cat .oprofile/aliases
run1-oprofiled session:run1 image:/usr/bin/oprofiled

4.2 # opreport run1-oprofiled
for convenience. This is essentially a stored query facility ...
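A small Python sketch of how such a stored query could be expanded before the profile spec is parsed. The function names, the one-alias-per-line layout taken from the example above, and the skipping of '#' comment lines are assumptions, not part of this draft.

```python
def load_aliases(text):
    """Parse alias lines of the form '<name> <profile spec words...>'."""
    aliases = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and comments are skipped
        name, *spec = line.split()
        aliases[name] = spec
    return aliases


def expand_aliases(args, aliases):
    """Replace each argument that names an alias by its stored spec."""
    expanded = []
    for arg in args:
        expanded.extend(aliases.get(arg, [arg]))
    return expanded
```

Under these assumptions "opreport run1-oprofiled" is handled exactly like "opreport session:run1 image:/usr/bin/oprofiled".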

5. opreport

née oprofpp.

This tool is used for providing symbol-based summaries of a particular image.
The profile spec must resolve to exactly one profile, unless it is possible to
auto-merge the resolved list from the profile spec (see below). 

5.1 opreport <flags> <profile-spec>
	--symbols (-l)
	--debug (-b)
	--details (-a)
	--output <output spec> (-o) 
	--threshold <threshold> (-t)
	--include-dependent (-n)
	--sort <sort spec> (-s)
	--ignore-symbols <symbollist> (-i)
	--exclude-symbols <symbollist> (-e)
	<profile spec>

5.2 --include-dependent

	Include libs, kernel, modules, etc.

5.3 --symbols <symbol-list>

	This is the standard sorted summary produced by oprofpp, as specified
by -i, -e, -s, -o.

	5.3.1 The default for opreport will be "opreport --symbols \*"

	5.3.2 This must include dependent symbols when --include-dependent is set

5.4 --output <outputspec>

    What fields to output in what order, as follows :

	v       vma offset
	s       nr samples
	S       nr accumulated samples
	p       nr percent samples
	P       nr accumulated percent samples
	n       symbol name
	m	mangled symbol name
	l       source file name and line nr
	L       base name of source file and line nr
	i       image name (these two are useful for merged)
	I       base name of image name
	h       a leading header (not a field ...)
	d       detailed samples for each selected symbol
	x	reserved for XML output at some point

v,s,S,p,P,n,m,l,L,i,I :
0x8004434 433 34343 0.3% 4.5% blah(int) blah_Qv /home/steve.c:3 steve.c:3 /home/steve steve 

5.5 The 'h' output gives a leading free-form title describing the profile, and
    column headers as appropriate.

5.6 For 'd', the columns are the same, except that the "symbol" fields may show
    xmalloc+0x34, for example

5.7 'x' is reserved for now, and will be specified later.

5.8 The default format is hvspn. The following options modify this :

5.9	--debug

	Shows line number and file information (equivalent to 'hvspnL')

5.10	--details

	Shows line number and file information, and per-address values
	(equivalent to 'hvspnLd')

5.11	--include-dependent

	Default format becomes 'hvspni'

5.12 Specifying --output together with either --debug or --details is a fatal error.
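The interaction of these options can be sketched in Python. The function name and keyword arguments are hypothetical. The draft does not say what happens when --include-dependent is combined with --debug or --details, so appending 'i' before 'L' is purely an assumption.

```python
OUTPUT_FIELDS = set("vsSpPnmlLiIhdx")  # field letters from 5.4
DEFAULT_FORMAT = "hvspn"               # 5.8


def resolve_output_format(output=None, debug=False, details=False,
                          include_dependent=False):
    """Return the output spec string that the option set implies."""
    if output is not None:
        if debug or details:
            # 5.12: --output with --debug or --details is a fatal error.
            raise ValueError("--output cannot be combined with --debug or --details")
        unknown = set(output) - OUTPUT_FIELDS
        if unknown:
            raise ValueError("unknown output field(s): " + "".join(sorted(unknown)))
        return output

    fmt = DEFAULT_FORMAT
    if include_dependent:
        fmt += "i"      # 5.11
    if debug or details:
        fmt += "L"      # 5.9, and the first part of 5.10
    if details:
        fmt += "d"      # 5.10
    return fmt
```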

5.13 Note: --help must output this information and document the default.

5.14 --sort <sortspec>

	What field to sort by, as follows :
	v       vma offset
	s       nr samples
	n       symbol name
	m	mangled symbol name
	l       source file name and line nr
	L       base name of source file and line nr
	i       image name (these two are useful for merged)
	I       base name of image name
	r	sort whatever in reverse

5.15 --ignore-symbols <symbollist>

	comma-separated list of symbols to ignore. Ignored symbols are included
	in percentage calculations etc, but not output

5.16 --exclude-symbols <symbollist>

	comma-separated list of symbols to exclude. Excluded symbols are neither
	included in calculations nor output

5.17 We accept mangled or unmangled names, but not a mix of the two. First the search
     tries to find verbatim symbol names. If no matches are found, then
     the search is repeated, but each match is done against a demangled symbol name. Thus
     mixing mangled with unmangled names will not work, and this should be documented.
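The two-pass lookup can be sketched in Python as follows. The function name and the representation of the image's symbols as (mangled, demangled) pairs are assumptions for the example; no real demangler is called.

```python
def select_symbols(requested, symbols):
    """symbols is a list of (mangled, demangled) name pairs for one image.

    First pass: verbatim match on the mangled name. Only if that finds
    nothing at all is the search repeated against the demangled names.
    """
    wanted = set(requested)
    verbatim = [pair for pair in symbols if pair[0] in wanted]
    if verbatim:
        return verbatim
    return [pair for pair in symbols if pair[1] in wanted]
```

Given the pairs ("blah_Qv", "blah(int)") and ("main", "main"), asking for "main" and "blah(int)" together returns only main, because the first pass already found a match. This is the mixing problem noted above.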

5.18 --threshold <threshold>

	Threshold of minimum values before an entry is printed. Either a sample count
	or a percentage, e.g. "1", "0.1%"
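A Python sketch of parsing and applying such a threshold. The function names and the (kind, value) tuple are hypothetical, and treating the threshold as inclusive (an entry exactly at the threshold is printed) is an assumption.

```python
def parse_threshold(text):
    """Parse "1" as a sample count and "0.1%" as a percentage."""
    text = text.strip()
    if text.endswith("%"):
        return ("percent", float(text[:-1]))
    return ("samples", int(text))


def passes(samples, total, threshold):
    """True if an entry with this many samples should be printed."""
    kind, value = threshold
    if kind == "percent":
        return total > 0 and 100.0 * samples / total >= value
    return samples >= value
```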

6. opgprof

Split off the gprof dumping part of old oprofpp, since it didn't really

6.1 opgprof
	--gprof-file (-o) <file>
	<profile spec>

6.2 Defaults to gmon.out, making a backup if it already exists

7. opannotate

née op_to_source. Our basic source / asm annotation tool.

7.1 opannotate
	--source-dir <dir> (-d)
	--output-dir <dir> (-o)
	--base-dir <dir> (-b)
	--include <filelist> (-i)
	--exclude <filelist> (-e)
	--threshold <threshold> (-t)
	--source (-s)
	--assembly (-a)
	--mixed (-m)
	--objdump-params (-p)
	<profile spec>

7.2 Profile spec must produce exactly 1 profile after auto-merge etc.

7.3 --source

	specifies the source-based per-file annotation. --source-dir is
	used to find the source when debug info is relative. When --base-dir
	is specified, then only files in that dir or a subdirectory are generated.
	In the absence of --base-dir, this defaults to the value of --source-dir.
	The directory structure of the source is replicated under --output-dir,
	with annotated files of the same name as the source file.
	Specifying the same directory for --source-dir and --output-dir is an error. Samples
	in files outside of --source-dir are ignored with a message.

7.4 The source annotation works by prefixing all lines with a certain size comment, e.g.

/* 99 (3.2%)   */ char some_function(void)
/*             */ {
/*             */      int i;
/*             */
/* 73 (2.3%)   */      for (i = 0; i < 4; i++) {

7.5 This is to keep the line numbers the same. A postfix comment is also
    added at the end of the file, along the lines of :

/* Generated from /path/to/some_function.c
 * CPU_CLK_UNHALTED (cycles CPU is not halted) with a count of 30000 
 * blah blah
 */
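The fixed-size prefix of 7.4 can be sketched in Python. The function name and the per-line counts dict are hypothetical, and the field width of 11 is simply read off the example above rather than specified anywhere.

```python
FIELD_WIDTH = 11  # assumption: width taken from the example in 7.4


def annotate(lines, counts, total):
    """Prefix every source line with a same-size comment.

    counts maps 1-based line numbers to sample counts; total is the
    number of samples the percentages are relative to. One output line
    is produced per input line, so line numbers stay the same.
    """
    out = []
    for nr, line in enumerate(lines, start=1):
        n = counts.get(nr, 0)
        note = f"{n} ({100.0 * n / total:.1f}%)" if n else ""
        out.append(f"/* {note:<{FIELD_WIDTH}} */ {line}")
    return out
```

A count wider than the field would push the closing "*/" to the right in this sketch. A real tool would need to pick the width from the largest count first.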

7.6 --assembly

	specifies the objdump annotated asm output.  The output looks
	as above.

7.7 --mixed

	specifies the mixed src/asm objdump annotated output

7.8 --objdump-params

	Additional (?) params to objdump, e.g. for Intel syntax asm

8. opsummary

née op_time. This is used for either global symbol- or image-based summaries
inside of one session only.

8.1 opsummary 
	--symbol-summary (-l)
	--debug (-b)
	--details (-a)
	--output <output spec> (-o) 
	--sort <sort spec> (-s)
	--merge <merge spec>
	--ignore-symbols <symbollist> (-i)
	--exclude-symbols <symbollist> (-e)
	--threshold <threshold> (-t)
	<profile spec>

All as before.

8.2 --symbol-summary

Turns on symbol-based global summary. Make sure to have a 
useful default output spec in these cases. --debug and --details
are only available with this option.

8.3 --merge <merge spec>

Unlike the single-profile tools, it can be useful for us to not auto-merge
profile specs giving more than one profile result. For example, we might want
to keep two profiles of the same thing on CPU #0 and CPU #1 separate in the
summary. So we don't do it by default for opsummary, but allow the user to
specify merges as follows :


		Merge all cpu-specific profiles.


		Merge all process-id specific profiles.


		Merge all app-specific library profiles.

		Merge all compatible unit mask differences

8.4 FIXME: should we consider an "all-kernel" (kernel + modules) profile ?

9. opdiff

opdiff is a tool for comparing exactly two profile-executable
pairs, on a symbol by symbol level.

Comparing global summaries is left for a future
specification (and tool) :)

Of the general form :

9.1 opdiff
	--ignore-symbols <symbollist> (-i)
	--exclude-symbols <symbollist> (-e)
	--output <outputspec> (-o)
	--sort <sortspec> (-s)
	--threshold <threshold> (-t)
	<profile spec 1> : <profile spec 2>

where 1 is to be diffed against 2. If either resolves to more than one profile,
auto-merging should take place as before, if needed. We also
can use binary: in the profile spec.

9.2 We auto-merge as with opreport; the lhs and rhs of the ":"
must each specify exactly one profile.

9.3 Two profile specs are incompatible iff

1. event differs or
2. count differs

9.4 NOTE: we consider all unit masks to be compatible, it is up
to the user to do this sensibly.

9.5 --threshold

	Slightly different. By default the threshold is applied
	absolutely, so --threshold 1 does not hide symbols with -232
	samples. Prefixing a - or + affects this behaviour, so
	--threshold +0 hides all negative entries, and --threshold -0
	hides all positive ones.
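	One possible reading of these rules, sketched in Python. The function names are hypothetical, percentage thresholds are left out, and showing an entry whose difference is exactly zero under "+0" or "-0" is an assumption.

```python
def parse_diff_threshold(text):
    """Return (mode, value); mode is 'abs', '+' or '-'."""
    if text[0] in "+-":
        # The sign is read from the string so that "+0" and "-0" differ.
        return text[0], float(text[1:])
    return "abs", float(text)


def shown(diff, threshold):
    """True if a symbol with this sample difference is printed."""
    mode, value = threshold
    if mode == "+":
        return diff >= value       # hides entries below +value
    if mode == "-":
        return diff <= -value      # hides entries above -value
    return abs(diff) >= value      # default: applied absolutely
```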

9.6 --output <outputspec>

    What fields to output in what order, as follows :

	v       vma offset
	s       nr samples (difference)
	p       nr percent samples (difference)
	n       symbol name
	m	mangled symbol name
	l       source file name and line nr
	L       base name of source file and line nr
	i       image name (these two are useful for merged)
	I       base name of image name
	h       a leading header (not a field ...)
	d       detailed samples for each selected symbol
	x	reserved for XML output at some point

e.g. s is

343 (+343)

p is

0.2% (+0.1)

9.7 --sort <sortspec>

	What field to sort by, as follows :
	v       vma offset
	s	nr samples difference
	n       symbol name
	m	mangled symbol name
	l       source file name and line nr
	L       base name of source file and line nr
	i       image name (these two are useful for merged)
	I       base name of image name
	r	sort whatever in reverse

For example, to get the symbols that have lost the most samples at the bottom,
use "--sort sr"

10. auto merging

If a profile spec leaves e.g.

	/usr/bin/oprofiled .... cpu #0
	/usr/bin/oprofiled .... cpu #1

or some other "mergeable" set of profiles, and exactly 1 is required,
then we should automatically merge them into one profile for the purposes
required, along with a note to the user of what we've done.

10.1 Two profile specs are unmergeable iff 

1. event differs or
2. count differs or
3. image or lib-image differs
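The rule in 10.1 can be sketched in Python by grouping profiles on the parts that must not differ. Profiles are plain dicts here and the function names are hypothetical. The sketch applies 10.1 literally, so it does not cover merging one library across several applications when the library itself is named as the image.

```python
from collections import defaultdict


def merge_key(profile):
    """Profiles are mergeable only if all of these parts are equal."""
    return (profile["event"], profile["count"],
            profile["image"], profile["lib_image"])


def auto_merge(profiles):
    """Group profiles into mergeable sets; cpu, pid etc. may differ inside a set."""
    groups = defaultdict(list)
    for profile in profiles:
        groups[merge_key(profile)].append(profile)
    return list(groups.values())


def require_single(profiles):
    """For tools needing exactly one profile: merge, or fail if that is impossible."""
    groups = auto_merge(profiles)
    if len(groups) != 1:
        raise ValueError(f"profile spec resolves to {len(groups)} unmergeable profiles")
    return groups[0]
```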

10.2 Also consider :


We should expect something like :

# opreport /lib/libc.so
opreport: Merging the following profiles into this report :
	/lib/libc.so (/usr/bin/oprofiled)
	/lib/libc.so (/bin/ls)
	/lib/libc.so (/usr/bin/apache)

So the message would ideally indicate exactly the differing parts
that got merged. (it is a matter of documentation + user education
to tell them that if they want just the apache profile for libc,
they do opreport /usr/bin/apache lib-image:/lib/libc.so )

Sometimes we might not want them to merge though, e.g. opsummary to
show the differing process IDs separately for a number of images. So that's
why we have --merge where appropriate.
