Draft version 3
 
Beta.
 
1. Basic principles
-------------------

There are five basic things a user needs to specify to the post-profile tools :

a) common options such as --help, --version
b) specifications of which sample files to operate upon
c) processing options (what gets included in the workings) 
d) output and formatting options (what gets output, and how)
e) "special" options 

Of these, obviously a), and hopefully b) and some of c) can be universally
shared between the interfaces.

2. Common basic options
-----------------------

--help (-?)

	As generated by popt.

--usage

	As generated by popt.

--version (-v) 

	As currently, "op_summary: oprofile 0.3cvs compiled on May 15 2002 16:43:40"

--verbose (-V)

	Maybe we can have more fine-grained --debug later, but for now this will do

--no-verbose

	Default, for symmetry only 

--no-demangle (n/a)

	We demangle by default. This is for avoiding demangler bugs and the like.

--demangle

	For symmetry only 


3. Profile specifications
-------------------------

Each tool needs at least one sample file + associated binary image to operate on,
and some tools (op_diff, op_summary) can have more than one. It would
be very useful for the user to be able to conveniently specify a subset of profiles,
and we can use the same method for every tool.

Each profile is a tuple of :

a) Session name
b) Binary
c) App binary (for shared libraries, e.g. /bin/ls using /lib/libc.so)
d) Event name
e) Event count
f) parent process ID
g) Process ID
h) CPU nr.
i) the actual binary to bfd_open
j) the actual samples file to use 

Some of these may be "null" parameters, namely c),f),g),h),i),j) (and possibly e) depending
on what we do about multiplexing).

i) is a special case since it is not part of the profile as such. By default
it is derived from b), but allowing its specification is needed for op_diff
of two differently compiled binary + profile pairs. The same goes for j).

So if we provide a way to specify some set of profiles via the command line,
we need to support each of these. My first idea is something like this :

op_report session:run1 image:/usr/bin/oprofiled

Here's a complete list of tags :

	sample-file: <filename>
		An actual sample file. If this is given, no other tags are allowed,
		with the exception of binary:
 
	binary: <filename>
		The binary to use. This is incompatible with the image: parameter,
		and may only be specified as part of a single-profile specification.
		It may be used ONLY with sample-file:

	session: <sessionlist>
		A comma-separated list of session names to resolve in. Absence of this
		tag, unlike all others, means "the current session"

	session-exclude: <sessionlist>
		A comma list of sessions to exclude ...
	 
	image: <imagelist>
		Comma list of image names to resolve. Each entry may be relative
		path, or wordexp() style name, or full path, e.g.
	
		'image:/usr/bin/oprofiled,*op*,./oprofpp'

		"image:" is default tag, allowing old-style "op_report /usr/bin/oprofiled"

	image-exclude: <imagelist> 
		Comma list of images to exclude from the final list

	lib-image: <imagelist>
		Comma list of library images to consider. Thus when looking for glibc samples
		due to oprofiled, op_report /usr/bin/oprofiled 'lib-image:*libc*' will show only
		those. If not specified, these samples are NOT included by default, EXCEPT
		when automerging applies, for example if image:*libc* is given.

	lib-image-exclude: <imagelist>
		Similar to image-exclude ...
 
	event: event
		Specify a particular event type e.g. CPU_CLK_UNHALTED

	count: count
		Specify the count for that event e.g. 30000

	unit-mask: mask
		Specify the unit mask for that event. Encoded as numerical value.

	cpu: <cpulist>
		Comma-separated list of cpu numbers to consider

	pid: <pidspeclist>
		Comma list of pids to consider. A pidspec is either a literal
		process id (454), or a process id plus a '+', to indicate all
		child processes (recursive) as well as that pid (e.g. "454+")

If a tag is not specified, then it will match all values (with the exception of
session:).
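
To make the tag syntax concrete, here is a minimal Python sketch of splitting
command-line words into tags. The names parse_profile_spec and KNOWN_TAGS are
made up for this illustration, and joining a repeated tag with a comma is an
assumption; the draft does not say what a repeated tag should do.

```python
# Hypothetical sketch: split profile-spec words into {tag: raw value} pairs.
KNOWN_TAGS = {
    "sample-file", "binary", "session", "session-exclude",
    "image", "image-exclude", "lib-image", "lib-image-exclude",
    "event", "count", "unit-mask", "cpu", "pid",
}

def parse_profile_spec(words):
    """Return a dict mapping each tag to its raw (unsplit) value."""
    spec = {}
    for word in words:
        tag, sep, value = word.partition(":")
        if not sep or tag not in KNOWN_TAGS:
            # "image:" is the default tag, so a bare word is an image name.
            tag, value = "image", word
        if tag in spec:
            # Assumption: a repeated tag extends the comma list.
            spec[tag] = spec[tag] + "," + value
        else:
            spec[tag] = value
    # sample-file: allows no other tag except binary:
    if "sample-file" in spec and set(spec) - {"sample-file", "binary"}:
        raise ValueError("sample-file: may only be combined with binary:")
    # binary: is incompatible with image:
    if "binary" in spec and "image" in spec:
        raise ValueError("binary: is incompatible with image:")
    return spec
```

With this, both "op_report session:run1 image:/usr/bin/oprofiled" and the
old-style "op_report /usr/bin/oprofiled" end up with an image entry.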
 
Note: we can use a common method for comma-separated lists :

list ::= listitem ',' list | listitem
listitem ::= listchar listchar*
listchar ::= '\,' | '*' | '?' | alphanum 

i.e. we allow escape for entries with commas in it, and we allow
globbing.
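
A minimal sketch of the list handling, assuming Python's fnmatch module as a
stand-in for the globbing (it also accepts [seq] patterns, which the grammar
above does not mention). split_list and included are names chosen for this
example only, and empty list items are not rejected here.

```python
import fnmatch

def split_list(text):
    """Split on unescaped commas; the sequence '\\,' yields a literal comma."""
    items, current, i = [], [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] == ",":
            current.append(",")
            i += 2
        elif text[i] == ",":
            items.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    items.append("".join(current))
    return items

def included(value, patterns):
    """True if value matches any glob pattern in the list."""
    return any(fnmatch.fnmatchcase(value, p) for p in patterns)
```

For example, split_list on '/usr/bin/oprofiled,*op*,./oprofpp' gives three
patterns, and included("/usr/bin/oprofiled", ["*op*"]) is true.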
 
We can use cpu,pid,event,count,unit-mask formats as parameters
to the daemon too. But we must allow multiple event,count pairs
so the syntax must be different.


3a. sample file mangling
------------------------

We need to encode a number of the above things in the filename.
The proposed format is

$SAMPLES_DIR/sessionname/event.count.unitmask.ppid.pid.cpu.mangledprimaryname}mangledlibname

(in decimal where relevant) 
For example,

CPU_CLK_UNHALTED.30000.0.434.436.0.}bin}ls}}lib}libc.so

mangledlib/primaryname will be absolute and fully resolved pathnames

The parts ppid,pid,cpu may, instead of a number, have the
value "all", for when profiling method is not splitting on
(p)pid/cpu

mangledlibname may be ""
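
As a sanity check of the format, here is a rough Python sketch that takes a
sample file name apart again. It assumes that '/' is mangled to '}' (as the
example suggests), that the two mangled names are therefore separated by '}}',
and that path names themselves contain no '}'. SampleFile and
parse_sample_filename are hypothetical names.

```python
from collections import namedtuple

SampleFile = namedtuple(
    "SampleFile", "event count unit_mask ppid pid cpu image lib_image")

def parse_sample_filename(name):
    """Split event.count.unitmask.ppid.pid.cpu.primary}lib into fields."""
    # Only the first six dots are separators; later ones belong to paths.
    event, count, unit_mask, ppid, pid, cpu, mangled = name.split(".", 6)

    def number_or_all(field):
        # "all" means the profile was not split on this field.
        return None if field == "all" else int(field)

    primary, sep, lib = mangled.partition("}}")
    if sep:
        lib = "}" + lib          # give the library its leading '/' back
    elif primary.endswith("}"):
        primary = primary[:-1]   # empty mangledlibname: drop the separator
    return SampleFile(
        event, int(count), int(unit_mask),
        number_or_all(ppid), number_or_all(pid), number_or_all(cpu),
        primary.replace("}", "/"),
        lib.replace("}", "/") or None,
    )
```

Applied to the example name above, this yields image /bin/ls and lib image
/lib/libc.so, with ppid 434, pid 436 and cpu 0.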
 
3b. Resolution algorithm
------------------------

session=as given or default to currentsession 
all others set to "*" (match on anything) by default

for each (file in session(s)) {
	next if samplefile && samplefile!=file;
	next if image && !included(file.image, imagelist)
	next if image-exclude && included(file.image, image-exclude)
	next if lib-image && !included(file.lib-image, lib-imagelist)
	next if lib-image-exclude && included(file.lib-image, lib-image-exclude)
	next if event && file.event != event
	next if count && file.count != count
	next if unitmask && file.unit_mask != unitmask
	next if cpu && !included(file.cpu, cpulist)
	next if pid && !included(file.pid, pidlist)
	// if it's in the child tree of the given parent process id, accept it
	next if parentpid && !transitive_include(file.ppid, parentpid-graph)
 
	add_to_profile_list();
}

if (one_profile_allowed_only) { 
	// merge on ppid,pid,cpu
	// and also lib-image, iff the lib-image
	// is specified as imagename
	merge
	if (nr. profiles > 1)
		error
}
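
The filtering loop above could look roughly like this in Python. This is only
a sketch under assumptions: each sample file is a dict with the keys used
below, list-valued tags are already split into lists of strings, the "454+"
pidspec is folded into the pid check, and session handling and the final merge
step are left out.

```python
import fnmatch

def included(value, patterns):
    """True if value matches any glob pattern in the list."""
    return any(fnmatch.fnmatchcase(str(value), p) for p in patterns)

def in_child_tree(pid, root, parent_of):
    """True if pid is root or has root among its ancestors."""
    seen = set()
    while pid is not None and pid not in seen:
        if pid == root:
            return True
        seen.add(pid)
        pid = parent_of.get(pid)
    return False

def pid_matches(pid, pidspecs, parent_of):
    """A pidspec is "454" (that pid only) or "454+" (pid plus children)."""
    for pidspec in pidspecs:
        if pidspec.endswith("+"):
            if in_child_tree(pid, int(pidspec[:-1]), parent_of):
                return True
        elif pid == int(pidspec):
            return True
    return False

def resolve(files, spec):
    """Return the files selected by spec; a missing tag matches anything."""
    parent_of = {f["pid"]: f["ppid"] for f in files}
    selected = []
    for f in files:
        lib = f["lib_image"] or ""
        if "image" in spec and not included(f["image"], spec["image"]):
            continue
        if "image-exclude" in spec and included(f["image"], spec["image-exclude"]):
            continue
        if "lib-image" in spec and not included(lib, spec["lib-image"]):
            continue
        if "lib-image-exclude" in spec and included(lib, spec["lib-image-exclude"]):
            continue
        if "event" in spec and f["event"] != spec["event"]:
            continue
        if "count" in spec and f["count"] != spec["count"]:
            continue
        if "unit-mask" in spec and f["unit_mask"] != spec["unit-mask"]:
            continue
        if "cpu" in spec and not included(f["cpu"], spec["cpu"]):
            continue
        if "pid" in spec and not pid_matches(f["pid"], spec["pid"], parent_of):
            continue
        selected.append(f)
    return selected
```

Note that the parent graph here is built only from the ppid/pid pairs of the
sample files themselves, so a child whose parent has no sample file cannot be
traced further up.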

4. op_alias
-----------

The above is rather tedious syntax, and is likely to stay that way
even with improvements. So we can have a command line tool / config file, e.g. :

# cat .oprofile/aliases
run1-oprofiled session:run1 image:/usr/bin/oprofiled

# op_report run1-oprofiled
 
for convenience. This is essentially a stored query facility ...
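
A sketch of how such stored queries might be expanded, assuming one alias per
line with the alias name as the first word. load_aliases and expand_aliases are
hypothetical names, and treating lines starting with '#' as comments is an
assumption not made anywhere in this draft.

```python
def load_aliases(text):
    """Parse alias definitions: first word is the name, the rest the spec."""
    aliases = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' comments are skipped
        name, *words = line.split()
        aliases[name] = words
    return aliases

def expand_aliases(argv, aliases):
    """Replace each word that names an alias with its stored profile spec."""
    expanded = []
    for word in argv:
        expanded.extend(aliases.get(word, [word]))
    return expanded
```

So "op_report run1-oprofiled" would be handed to the profile spec parser as
"session:run1 image:/usr/bin/oprofiled".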
 
5. op_report
------------

née oprofpp.

This tool is used for providing symbol-based summaries of a particular image.
The profile spec must resolve to exactly one profile, unless it is possible to
auto-merge the resolved list from the profile spec (see below). 

op_report <flags> <profile-spec>
	--symbols <symbol-list> (-l)
	--output <output spec> (-o) 
	--threshold <threshold> (-t)
	--sort <sort spec> (-s)
	--ignore-symbols <symbollist> (-i)
	--exclude-symbols <symbollist> (-e)
	<profile spec>

--symbols <symbol-list>

	This is the standard sorted summary produced by oprofpp, as specified
	by -i, -e, -s, -o. The default for op_report will be "op_report --symbols \*"
	or something similar.
 
--output <outputspec>

    What fields to output in what order, as follows :

	v       vma offset
	s       nr samples
	S       nr accumulated samples
	p       nr percent samples
	P       nr accumulated percent samples
	n       symbol name
	m	mangled symbol name
	l       source file name and line nr
	L       base name of source file and line nr
	i       image name (these two are useful for merged)
	I       base name of image name
 
	h       a leading header (not a field ...)
	d       detailed samples for each selected symbol
	x	reserved for XML output at some point
 

v,s,S,p,P,n,m,l,L,i,I :
 
0x8004434 433 34343 0.3% 4.5% blah(int) blah_Qv /home/steve.c:3 steve.c:3 /home/steve steve 
 
The 'h' output gives a leading free-form title describing the profile, and
column headers as appropriate.
 
For 'd', the columns are the same, except that the "symbol" fields may show
xmalloc+0x34, for example

'x' is reserved for now, and will be specified later.
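
To illustrate how an output spec drives a row, here is a small Python sketch.
It assumes the spec is a string of field letters (commas are tolerated), that
a row is a dict keyed by the made-up names in FIELD_NAMES, and that h, d and x
do not contribute columns to a row.

```python
# Hypothetical mapping from output-spec letters to row keys.
FIELD_NAMES = {
    "v": "vma", "s": "samples", "S": "cum_samples",
    "p": "percent", "P": "cum_percent",
    "n": "symbol", "m": "mangled",
    "l": "source", "L": "source_base",
    "i": "image", "I": "image_base",
}
NON_FIELDS = set("hdx")  # header, details, reserved: not per-row columns

def format_row(outputspec, row):
    """Render one row with the requested fields in the requested order."""
    columns = []
    for letter in outputspec.replace(",", ""):
        if letter in NON_FIELDS:
            continue
        if letter not in FIELD_NAMES:
            raise ValueError("unknown output field: " + letter)
        columns.append(str(row[FIELD_NAMES[letter]]))
    return " ".join(columns)
```

Given a row holding the values from the example line above, the spec
"vsSpPnmlLiI" reproduces that line, and "ns" prints just the symbol name and
its sample count.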
 
--sort <sortspec>

	What field to sort by, as follows :
 
	v       vma offset
	s       nr samples
	n       symbol name
	m	mangled symbol name
	l       source file name and line nr
	L       base name of source file and line nr
	i       image name (these two are useful for merged)
	I       base name of image name
 
	r	sort whatever in reverse
 
--ignore-symbols <symbollist>

	comma-separated list of symbols to ignore. Ignored symbols are included
	in percentage calculations etc, but not output

--exclude-symbols <symbollist>

	comma-separated list of symbols to exclude. Excluded symbols are neither
	included in calculations nor output
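
The difference between the two options shows up in the percentages. A minimal
sketch, assuming samples come as a {symbol: count} dict; summarize is a
hypothetical name.

```python
def summarize(samples, ignore=(), exclude=()):
    """Return [(symbol, count, percent)] sorted by descending count."""
    # Excluded symbols drop out before anything is calculated.
    counted = {s: n for s, n in samples.items() if s not in exclude}
    total = sum(counted.values())
    rows = []
    for symbol, n in sorted(counted.items(), key=lambda kv: kv[1], reverse=True):
        if symbol in ignore:
            continue  # counted in the total above, but not printed
        rows.append((symbol, n, 100.0 * n / total if total else 0.0))
    return rows
```

With samples a=50, b=30, c=20, ignoring c leaves a at 50% and b at 30%, while
excluding c raises them to 62.5% and 37.5%.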

We accept mangled or unmangled names, but not a mix of the two. First the search
tries to find verbatim symbol names. If no matches are found, then
the search is repeated, but each match is done against the demangled symbol name. Thus
mixing mangled with unmangled names will not work, and this should be documented.
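
The two-pass lookup can be sketched as follows, assuming the symbol table is
available as (mangled, demangled) pairs; match_symbols is a hypothetical name.

```python
def match_symbols(requested, symbols):
    """Return the (mangled, demangled) pairs selected by the requested names."""
    wanted = set(requested)
    # First pass: verbatim (mangled) names.
    verbatim = [pair for pair in symbols if pair[0] in wanted]
    if verbatim:
        return verbatim
    # Second pass, only when nothing matched: demangled names.
    return [pair for pair in symbols if pair[1] in wanted]
```

A list such as "blah_Qv,other(int)" therefore only ever matches blah_Qv: the
first pass succeeds, so the demangled pass never runs.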
 
--threshold <threshold>

	Threshold of minimum values before an entry is printed. Either a sample count
	or a percentage, e.g. "1", "0.1%"
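
A sketch of parsing the two threshold forms. parse_threshold and passes are
hypothetical names, and treating the threshold as inclusive (an entry exactly
at the threshold is printed) is an assumption.

```python
def parse_threshold(text):
    """Return ("percent", float) for "0.1%" or ("samples", int) for "1"."""
    text = text.strip()
    if text.endswith("%"):
        return ("percent", float(text[:-1]))
    return ("samples", int(text))

def passes(threshold, samples, total):
    """True if an entry with this many samples should be printed."""
    kind, value = threshold
    if kind == "percent":
        return total > 0 and 100.0 * samples / total >= value
    return samples >= value
```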
 
6. op_gprof
-----------

Split off the gprof dumping part of old oprofpp, since it didn't really
fit.

op_gprof
	--gprof-file (-o) <file>
	<profile spec>

Default to gmon.out, making a backup if it already exists.
 
7. op_annotate
--------------
 
née op_to_source. Our basic source / asm annotation tool.

op_annotate
	--source-dir <dir> (-d)
	--output-dir <dir> (-o)
	--base-dir <dir> (-b)
	--include <filelist> (-i)
	--exclude <filelist> (-e)
	--threshold <threshold> (-t)
	--source (-s)
	--assembly (-a)
	--mixed (-m)
	<profile spec>

Profile spec must produce exactly 1 profile after auto-merge etc.

--source

	specifies the source-based per-file annotation. --source-dir is
	used to find the source when debug info is relative. When --base-dir
	is specified, then only files in that dir or a subdirectory are generated.
	In the absence of --base-dir, this defaults to the value of --source-dir.
	The directory structure of the source is replicated under --output-dir,
	with annotated files of the same name as the source file.
	Specifying the same directory for --source-dir and --output-dir is an
	error. Samples in files outside of --source-dir are ignored with a message.

The source annotation works by prefixing all lines with a certain size comment, e.g.

/* 99 (3.2%)   */ char some_function(void)
/*             */ {
/*             */      int i;
/*             */
/* 73 (2.3%)   */      for (i = 0; i < 4; i++) {
...
 
This is to keep the line numbers the same. A postfix comment is also
added at the end of the file, along the lines of :

/* Generated from /path/to/some_function.c
 * CPU_CLK_UNHALTED (cycles CPU is not halted) with a count of 30000 
 * blah blah
 */
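
The fixed-size prefix can be sketched like this. The 12-column note field is
read off the example above; annotate_lines and its parameters are hypothetical,
and a note longer than the field would push the closing "*/" to the right
rather than being truncated.

```python
def annotate_lines(lines, counts, total, width=12):
    """Prefix every source line with a fixed-size sample-count comment.

    lines:  source lines without trailing newlines
    counts: {line number (1-based): samples}
    total:  total samples, used for the percentage
    """
    annotated = []
    for number, line in enumerate(lines, start=1):
        samples = counts.get(number, 0)
        note = "%d (%.1f%%)" % (samples, 100.0 * samples / total) if samples else ""
        prefix = "/* " + note.ljust(width) + "*/"
        # Every input line yields exactly one output line, so line
        # numbers in the annotated file match the original source.
        annotated.append(prefix + " " + line if line else prefix)
    return annotated
```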
 
--assembly

	specifies the objdump annotated asm output

--mixed

	specifies the mixed src/asm objdump annotated output
 
8. op_summary
-------------

née op_time. This is used for either global symbol- or image-based summaries
inside of one session only.

op_summary 
	--symbol-summary (-l)
	--output <output spec> (-o) 
	--sort <sort spec> (-s)
	--merge <merge spec>
	--ignore-symbols <symbollist> (-i)
	--exclude-symbols <symbollist> (-e)
	--threshold <threshold> (-t)
	<profile spec>

All as before.

--symbol-summary

Turns on symbol-based global summary. Make sure to have a 
useful default output spec in these cases.
 
--merge <merge spec>

Unlike the single-profile tools, it can be useful for us to not auto-merge
profile specs giving more than one profile result. For example, we might want
to keep two profiles of the same thing on CPU #0 and CPU #1 separate in the
summary. So we don't auto-merge by default for op_summary, but allow the user
to specify merges as follows :

	cpu

		Merge all cpu-specific profiles.

	pid

		Merge all process-id specific profiles.

	lib

		Merge all app-specific library profiles.
 
	unitmask

		Merge all compatible unit mask differences
 
9. op_diff
----------

op_diff is a tool for comparing exactly two profile-executable
pairs, on a symbol by symbol level.

Comparing global summaries is left for a future
specification (and tool) :)
 
Of the general form :

op_diff
	--ignore-symbols <symbollist> (-i)
	--exclude-symbols <symbollist> (-e)
	--output <outputspec> (-o)
	--sort <sortspec> (-s)
	--threshold <threshold> (-t)
	<profile spec 1> : <profile spec 2>

where 1 is to be diffed against 2. If either side resolves to more than one
profile, auto-merging should take place as before, if needed. We can also
use binary: in the profile spec.

We auto-merge as with op_report; the lhs and rhs of the ":"
must each specify exactly one profile.
 
Two profile specs are incompatible iff

1. event differs or
2. count differs

NOTE: we consider all unit masks to be compatible, it is up
to the user to do this sensibly.
 
--threshold

	Slightly different. By default the threshold is applied
	absolutely, so --threshold 1 does not hide symbols with -232
	samples. Prefixing a - or + affects this behaviour, so
	--threshold +0 hides all negative entries, and --threshold -0
	hides all positive ones.
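
One possible reading of the signed threshold, as a Python sketch for sample
counts only (percentages are left out). The interpretation that "+N" shows
differences of at least N and "-N" shows differences of at most -N is an
assumption drawn from the +0 / -0 examples; diff_visible is a hypothetical
name.

```python
def diff_visible(threshold, diff):
    """True if a symbol with this sample difference should be shown."""
    text = threshold.strip()
    if text.startswith("+"):
        return diff >= int(text[1:])    # gains of at least N samples
    if text.startswith("-"):
        return diff <= -int(text[1:])   # losses of at least N samples
    return abs(diff) >= int(text)       # default: absolute comparison
```

Under this reading, --threshold 1 keeps a symbol with -232 samples, while
--threshold +0 drops it.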

--output <outputspec>

    What fields to output in what order, as follows :

	v       vma offset
	s       nr samples (difference)
	p       nr percent samples (difference)
	n       symbol name
	m	mangled symbol name
	l       source file name and line nr
	L       base name of source file and line nr
	i       image name (these two are useful for merged)
	I       base name of image name
 
	h       a leading header (not a field ...)
	d       detailed samples for each selected symbol
	x	reserved for XML output at some point
 
e.g. s is

343 (+343)

p is

0.2% (+0.1)
 
--sort <sortspec>

	What field to sort by, as follows :
 
	v       vma offset
	s	nr samples difference
	n       symbol name
	m	mangled symbol name
	l       source file name and line nr
	L       base name of source file and line nr
	i       image name (these two are useful for merged)
	I       base name of image name
 
	r	sort whatever in reverse

For example, to get the symbols that have lost the most samples at the bottom,
use "--sort sr"
 
10. auto merging
----------------

If a profile spec leaves e.g.

	/usr/bin/oprofiled .... cpu #0
	/usr/bin/oprofiled .... cpu #1

or some other "mergeable" set of profiles, and exactly 1 is required,
then we should automatically merge them into one profile for the purposes
required, along with a note to the user of what we've done.

Two profile specs are unmergeable iff 

1. event differs or
2. count differs or
3. image or lib-image differs
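
The three rules amount to grouping profiles by (event, count, image,
lib-image). A minimal sketch, assuming profiles are dicts with those keys;
auto_merge and require_single are hypothetical names. It implements only the
rules as listed, so it does not cover merging one library across several
applications.

```python
from collections import defaultdict

def merge_key(profile):
    """Profiles with different keys are unmergeable."""
    return (profile["event"], profile["count"],
            profile["image"], profile["lib_image"])

def auto_merge(profiles):
    """Group mergeable profiles; each group would become one merged profile."""
    groups = defaultdict(list)
    for profile in profiles:
        groups[merge_key(profile)].append(profile)
    return list(groups.values())

def require_single(profiles):
    """For tools needing exactly one profile: merge, or fail if impossible."""
    groups = auto_merge(profiles)
    if len(groups) != 1:
        raise ValueError("profile spec resolves to %d profiles" % len(groups))
    return groups[0]
```

The two /usr/bin/oprofiled profiles for cpu #0 and cpu #1 above share a key
and so fall into one group; profiles for two different events do not.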

Also consider :

	/lib/libc.so
		/usr/bin/oprofiled
		/bin/ls
		/usr/bin/apache

We should expect something like :

# op_report /lib/libc.so
op_report: Merging the following profiles into this report :
	/lib/libc.so (/usr/bin/oprofiled)
	/lib/libc.so (/bin/ls)
	/lib/libc.so (/usr/bin/apache)
....

So the message would ideally indicate exactly the differing parts
that got merged. (it is a matter of documentation + user education
to tell them that if they want just the apache profile for libc,
they do op_report /usr/bin/apache lib-image:/lib/libc.so )

Sometimes we might not want them to merge though, e.g. op_summary to
show the differing process IDs separately for a number of images. So that's
why we have --merge where appropriate.