Version 1.0

Please don't renumber the items in this file, as code comments refer to these numbers.
 
General FIXME: call graph stuff to improve? Update opstack, as it is actively
developed.


1. Basic principles
-------------------

There are five basic things a user needs to specify to the post-profile tools :

a) common options such as --help, --version
b) specifications of which sample files to operate upon
c) processing options (what gets included in the workings) 
d) output and formatting options (what gets output, and how) see 1.2
e) "special" options 

Of these, obviously a), and hopefully b) and some of c) can be universally
shared between the interfaces.

2. Common basic options
-----------------------

2.1 --help (-?)

	As generated by popt.

2.2 --usage

	As generated by popt.

2.3 --version (-v) 

	As currently, "opreport: oprofile 0.3cvs compiled on May 15 2002 16:43:40"

2.4 --verbose (-V)

	Maybe we can have more fine-grained --debug later, but for now this will do

2.5 --demangle=none,normal,smart

	normal is the default.

2.6, 2.7 intentionally empty to keep numbering

2.8 --image-path (-p) pathcommalist

	Extra path to search for missing binaries. This is used for finding
	2.5 modules "oprofile", and 2.4 initrd modules (e.g. /lib/module.o).
	Resolution is: if a name can be found directly, return that.
	Otherwise, the basename with ".ko" appended is searched for
	(including module name mangling).

	Multiple matches are not allowed (rationale: too easy to pick the wrong
	binary and get subtly misleading results)
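
The resolution rule above can be sketched as follows. This is only an illustration: find_image is a hypothetical name, the recursive walk of each search directory is an assumption, and module name mangling is left out.

```python
import os

def find_image(name, search_dirs):
    """Return the path of the binary called `name`, or raise LookupError."""
    if os.path.isfile(name):
        return name  # found directly
    wanted = os.path.basename(name) + ".ko"
    matches = []
    for top in search_dirs:
        # Assumption: each --image-path entry is searched recursively.
        for root, _dirs, files in os.walk(top):
            if wanted in files:
                matches.append(os.path.join(root, wanted))
    if len(matches) != 1:
        # Zero matches and multiple matches are both reported as errors.
        raise LookupError(f"{name}: {len(matches)} candidates found")
    return matches[0]
```

Refusing to choose among several candidates mirrors the rationale given above: a wrong binary gives subtly misleading results.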

2.9 --threshold <threshold>

	Minimum value an entry must reach before it is printed, expressed as a
	percentage, e.g. "--threshold=0.1" is 0.1%

	A trailing '%' is ignored, e.g. "--threshold=0.1%" is equivalent to
	"--threshold=0.1"

	Threshold has a different meaning for opdiff.
 
	FIXME: currently threshold does not affect dependent images list in
	opreport. Should it ?
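
A minimal sketch of the --threshold parsing described in 2.9. The function names, the 0..100 range check and the choice that an entry exactly at the threshold is still printed are assumptions made for the example.

```python
def parse_threshold(text):
    """Parse a --threshold value as a percentage; a trailing '%' is ignored."""
    value = text.strip()
    if value.endswith("%"):
        value = value[:-1]
    percent = float(value)  # raises ValueError on malformed input
    if not 0.0 <= percent <= 100.0:  # assumed sanity check
        raise ValueError(f"threshold out of range: {text!r}")
    return percent

def above_threshold(samples, total, percent):
    """True if an entry with `samples` out of `total` should be printed."""
    return total > 0 and 100.0 * samples / total >= percent
```

With this sketch, parse_threshold("0.1%") and parse_threshold("0.1") give the same value.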

2.12
	2.5, 2.6 and 2.7 are not used by opgprof

3. Profile specifications
-------------------------

Each tool needs at least one sample file + associated binary image to operate on,
and some tools (opdiff, opreport) can have more than one. It would
be very useful for the user to be able to conveniently specify a subset of profiles,
and we can use the same method for every tool.

3.1 Each profile is a tuple of :

3.3) the actual samples file to use 
3.4) the actual binary to bfd_open
3.5/3.6) Session name
3.7) Binary
3.9) App binary (for shared libraries, e.g. /bin/ls using /lib/libc.so)
3.11) Event name
3.12) Event count
3.13) unit mask
3.14) CPU nr.
3.15) Task Group ID (tgid)
3.16) Task ID (tid)

FIXME: review which parameters can be null
3.2 Some of these may be "null" parameters, namely 3.3), 3.4), 3.9), 3.13),
3.14), 3.16), (and possibly 3.12) depending on what we do about
multiplexing).

3.4) is a special case since it is not part of the profile as such. By default
it is derived from 3.7), but allowing its specification is needed for opdiff
of two differently compiled binary + profile pairs. The same goes for 3.3).

So if we provide a way to specify some set of profiles via the command line,
we need to support each of these. My first idea is something like this :

opreport session:run1 image:/usr/bin/oprofiled

A tag may be specified only once; any subsequent tag:value for the same tag is
a fatal error.

Here's a complete list of tags :

3.3	sample-file: <filename>
		An actual sample file. Requires binary: and no other tags
 
3.4	binary: <filename>
		The binary to use. May only be used with sample-file:

3.5	session: <sessionlist>
		A comma-separated list of session names to resolve in. Absence of this
		tag, unlike all others, means "the current session", equivalent to
		specifying "session: current"

		FIXME: we don't deal with comma session lists properly yet for profile classes.

3.6	session-exclude: <sessionlist>
		A comma list of sessions to exclude ...
	 
3.7	image: <imagelist>
		Comma list of image names to resolve. Each entry may be a relative
		path, a wordexp()-style name, or a full path, e.g.
	
		'image:/usr/bin/oprofiled,*op*,./oprofpp'

3.8	image-exclude: <imagelist> 
		Comma list of images to exclude from the final list; these are
		either library names or image names

3.9	lib-image: <imagelist>
		Comma list of library images to consider. Thus when looking for glibc samples
		due to oprofiled, opreport image:/usr/bin/oprofiled 'lib-image:*libc*' will show only
		those.

3.11	event: <eventlist>
		Comma list of events e.g. CPU_CLK_UNHALTED or the string all to specify any event

3.12	count: <countlist>
		Comma list of counts e.g. 30000 or the string all to specify any count

3.13	unit-mask: <masklist>
		Comma list of unit masks. Encoded as numerical value or the string all to specify any unit mask.

3.14	cpu: <cpulist>
		Comma-separated list of cpu numbers to consider

3.15	tgid: <tgidlist>
		Comma list of tgids to consider. Each defines a thread group,
		including all of its child threads (task->tgid)

3.16	tid: <tidlist>
		A specific thread (task->tid)

3.17 If a tag is not specified, then it will match all values (with the exception of
     session: as noted).
 
Note: we can use a common method for comma-separated lists :

list ::= listitem ',' list | listitem
listitem ::= listchar listchar*
listchar ::= '\,' | '*' | '?' | alphanum 

i.e. we allow escaping for entries that contain commas, and we allow
globbing.
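
The list grammar could be handled along these lines. This is a sketch, not the tools' actual parser: split_comma_list and included are illustrative names, and Python's fnmatch is used as a stand-in for the globbing (it also understands '[...]', which the grammar above does not list).

```python
import fnmatch

def split_comma_list(text):
    """Split on ',' while treating the two characters '\\,' as a literal comma."""
    items, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] == ",":
            current.append(",")  # escaped comma stays inside the item
            i += 2
        elif ch == ",":
            items.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    items.append("".join(current))
    if any(item == "" for item in items):
        raise ValueError("empty list item")  # listitem needs at least one listchar
    return items

def included(value, patterns):
    """True if value matches any of the glob patterns."""
    return any(fnmatch.fnmatchcase(value, p) for p in patterns)
```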
 
3.18	default tag
		Any argument that is neither an option nor a valid tag:value is
		taken as an image name, but it matches as with image: *or* with
		lib-image:. If an image name has the form "valid_tag:xxx", this
		shortcut doesn't work and the user must write image:valid_tag:xxx.
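
As an illustration of the tag rules (one occurrence per tag, session: defaulting to "current", and the default tag of 3.18), here is a small sketch. parse_profile_spec is a hypothetical name, values are kept as raw strings, and option arguments are assumed to have been removed already.

```python
KNOWN_TAGS = {
    "sample-file", "binary", "session", "session-exclude", "image",
    "image-exclude", "lib-image", "event", "count", "unit-mask",
    "cpu", "tgid", "tid",
}

def parse_profile_spec(args):
    """Return (tag -> value, list of default-tag image names)."""
    spec = {}
    default_images = []
    for arg in args:
        tag, sep, value = arg.partition(":")
        if sep and tag in KNOWN_TAGS:
            if tag in spec:
                raise ValueError(f"tag {tag!r} given more than once")
            spec[tag] = value
        else:
            # 3.18: matched later as with image: or with lib-image:
            default_images.append(arg)
    spec.setdefault("session", "current")
    return spec, default_images
```

Because only the text before the first ':' is tested against the tag names, an image literally named "cpu:xxx" has to be written as image:cpu:xxx, as 3.18 requires.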

Sample file mangling
--------------------

We need to encode a number of the above things in the filename.

3.19 The format is

$SAMPLES_DIR/session/{root}/path/to/binary/event.count.unitmask.tgid.tid.cpu
$SAMPLES_DIR/session/{root}/path/to/binary/{dep}/{root}/path/to/lib/event.count.unitmask.tgid.tid.cpu

The latter is to be used when using --separate=lib (or with kernel, perhaps with {kern})

(in decimal where relevant) 
For example,

$SAMPLES_DIR/current/{root}/bin/ls/{dep}/{root}/lib/libc.so/CPU_CLK_UNHALTED.30000.0.434.434.0

Callgraph files are encoded as above, plus a {cg} component:

/var/lib/oprofile/samples/current/{root}/bin/bash/{dep}/{root}/lib/libc-2.2.5.so/{cg}/{root}/lib/ld-2.2.5.so/CPU_CLK_UNHALTED.600000.0.all.all.all

Here the meaning of {dep} is different: for cg files it is the from (caller)
binary, and the {cg} part is the to (callee) binary. So the file above contains
arcs owned by the application /bin/bash, going from libc-2.2.5.so to
ld-2.2.5.so. This example comes from profiling with --separate=library


3.20 The paths must have all symbolic links fully resolved.

3.21 The parts tgid,tid,cpu may, instead of a number, have the
value "all", which matches any value

3.22 The current session is defined as "current", all other sessions are free-form

3.23 As well as "{root}" dir (representing the / fs), there is a "{kern}" dir,
which will contain the 2.5 modules and the vmlinux file when profiling
with --no-vmlinux,  e.g :

$SAMPLES_DIR/current/{kern}/oprofile/CPU_CLK_UNHALTED.30000.0.434.434.0
$SAMPLES_DIR/current/{root}/bin/ls/{dep}/{kern}/oprofile/CPU_CLK_UNHALTED.30000.0.434.434.0

  {kern} can be followed only by a single path component. The name following
{kern} is first searched for as is and, if not found, a second search is done
with a ".ko" suffix. FIXME: must we special-case vmlinux and always add a .ko
to other filenames?

3.24 The encoding scheme used should simplify the code, since we parse the
  string purely syntactically, i.e. we don't have to look for an existing path
  or file.

3.25 To clarify things, here are the four allowed forms of sample filenames:

{root}/path/to/bin/CPU_CLK_UNHALTED.100000.255.33.34.0
{kern}/name/CPU_CLK_UNHALTED.100000.255.33.34.0
{root}/path/to/bin/{dep}/{root}/path/to/lib/CPU_CLK_UNHALTED.100000.255.33.34.0
{root}/path/to/bin/{dep}/{kern}/name/CPU_CLK_UNHALTED.100000.255.33.34.0
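
A purely syntactic parser for these four forms might look like the sketch below. SampleName and parse_sample_name are hypothetical names, all fields are kept as strings, event names are assumed not to contain '.', and {cg} callgraph files are not handled.

```python
from typing import NamedTuple, Optional

class SampleName(NamedTuple):
    image: str
    lib_image: Optional[str]
    event: str
    count: str
    unit_mask: str
    tgid: str
    tid: str
    cpu: str

def _decode_path(parts):
    """['{root}', 'bin', 'ls'] -> '/bin/ls'; ['{kern}', 'name'] -> 'name'."""
    if len(parts) > 1 and parts[0] == "{root}":
        return "/" + "/".join(parts[1:])
    if len(parts) == 2 and parts[0] == "{kern}":
        return parts[1]  # {kern} takes a single path component
    raise ValueError(f"bad path part: {parts!r}")

def parse_sample_name(relative):
    """Parse a sample filename given relative to the session directory."""
    *dirs, leaf = relative.split("/")
    fields = leaf.split(".")
    if len(fields) != 6:
        raise ValueError(f"expected event.count.unitmask.tgid.tid.cpu: {leaf!r}")
    if "{dep}" in dirs:
        cut = dirs.index("{dep}")
        if dirs[0] != "{root}":
            raise ValueError("the owning image before {dep} must be under {root}")
        image, lib = _decode_path(dirs[:cut]), _decode_path(dirs[cut + 1:])
    else:
        image, lib = _decode_path(dirs), None
    return SampleName(image, lib, *fields)
```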

Resolution algorithm
--------------------

session=as given or default to "current"
all others set to "*" (match on anything) by default

3.24 for each (file in session(s)) {
	next if samplefile && samplefile!=file;
	next if image && !included(file.image, imagelist)
	next if image-exclude && included(file.image, image-exclude)
	next if image-exclude && included(file.lib-image, image-exclude)
	next if lib-image && !included(file.lib-image, lib-imagelist)
	next if event && file.event != event
	next if count && file.count != count
	next if unitmask && file.unit_mask != unitmask
	next if cpu && !included(file.cpu, cpulist)
	next if tgid && !included(file.tgid, tgidlist)
	next if tid && !included(file.tid, tidlist)
 
	add_to_profile_list();
}
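
The loop above can be written out as a runnable sketch. Assumptions made for the example: each sample file is a dict of strings (lib_image may be None), the spec maps each tag to an already-split list, image lists are glob matched while the other fields are compared exactly, and "all" on either side matches anything. Session and sample-file handling are left out.

```python
import fnmatch

def included(value, patterns):
    """Glob match, as for the image lists."""
    return value is not None and any(fnmatch.fnmatchcase(value, p) for p in patterns)

def field_matches(value, wanted):
    """Exact match; "all" on either side matches anything."""
    return value == "all" or "all" in wanted or value in wanted

FIELD_TAGS = [("event", "event"), ("count", "count"), ("unit-mask", "unit_mask"),
              ("cpu", "cpu"), ("tgid", "tgid"), ("tid", "tid")]

def select_profiles(files, spec):
    """Return the sample files that pass every tag present in spec."""
    selected = []
    for f in files:
        if "image" in spec and not included(f["image"], spec["image"]):
            continue
        excluded = spec.get("image-exclude", [])
        if included(f["image"], excluded) or included(f["lib_image"], excluded):
            continue
        if "lib-image" in spec and not included(f["lib_image"], spec["lib-image"]):
            continue
        if any(tag in spec and not field_matches(f[key], spec[tag])
               for tag, key in FIELD_TAGS):
            continue
        selected.append(f)
    return selected
```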

3.34 if (one_profile_allowed_only) { 
	// merge on tgid,tid,cpu
	// and also lib-image, iff the lib-image
	// is specified as imagename
	merge
	if (nr. profiles > 1)
		error
}

4. Empty to preserve section numbering (was op_alias)
-----------------------------------------------------

5. opreport
------------

née oprofpp.

5.0 overall

This tool is used for providing symbol-based summaries of a particular set
of images (possibly including all profiled images). After candidate sample
files are selected from the profile specification, the merge specification is
applied.

5.1 opreport <flags> <profile-spec>
	--symbols (-l)
	--debug-info (-g)
	--details (-d)
	--sort <sort spec> (-s)
	--include-symbols <symbollist> (-i)
	--exclude-symbols <symbollist> (-e)
	--exclude-dependent (-x)
	--merge <merge spec> (-m)
	--no-header (-n)
	--long-filenames (-f)
	--accumulated (-c)
	--reverse-sort (-r)
	--show-address (-w)
	--global-percent (N/A)
	--xml (FIXME)
	--output-file (-o)
	<profile spec>

5.2 --exclude-dependent

	Exclude all dependent images from the profile.

5.3 --symbols <symbol-list>

	This is the standard sorted summary produced by oprofpp, as specified
by -i, -e, -s, -o.

	5.3.1 The default for opreport will be "opreport" (i.e. no --symbols)

	5.3.2 This must include dependent images, unless --exclude-dependent is
	specified
 
	Symbols without samples are not shown.

5.7	--xml

	Reserved for now, and will be specified later.

5.8 	The default output is sample, percent, symbol. If more than one app is
	shown, the app image is added. If more than one image is shown, the image is added.
	The following options also modify the output format :

5.9	--debug-info

	Shows line number and file information

5.10	--details

	Shows per-address sample values, not only per-symbol samples, and
	includes the vma by default.
	The "symbol" field may show xmalloc+0x34, for example

	This option implies --symbols

5.14 --sort <sortspec>

	comma-separated list of the following fields:
 
	vma         vma offset
	sample      nr samples
	symbol      symbol name
	debug       source file name and line nr
	image       image name (this one is useful for merged)
	app-name    owning application

	Sorting is done in the field order specified by the user, then by the
	following order for the unspecified fields:
	sample,image,app-name,symbol,debug,vma

	FIXME: clarify that the order also depends on what we are showing
	(details, symbols line)
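
One way to picture the ordering rule: the user's fields come first, followed by the remaining defaults in their fixed order. In this sketch each row is a dict keyed by the field names above. Sorting "sample" in descending order is an assumption, and sort_order and sort_rows are illustrative names.

```python
DEFAULT_ORDER = ["sample", "image", "app-name", "symbol", "debug", "vma"]

def sort_order(user_spec):
    """Expand a --sort spec into the full list of sort fields."""
    fields = [f for f in user_spec.split(",") if f]
    for f in fields:
        if f not in DEFAULT_ORDER:
            raise ValueError(f"unknown sort field: {f}")
    return fields + [f for f in DEFAULT_ORDER if f not in fields]

def sort_rows(rows, user_spec="", reverse=False):
    """Sort output rows; reverse=True corresponds to --reverse-sort."""
    order = sort_order(user_spec)

    def key(row):
        # Assumption: larger sample counts come first.
        return tuple(-row[f] if f == "sample" else row[f] for f in order)

    return sorted(rows, key=key, reverse=reverse)
```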
 
5.15 --include-symbols <symbollist>

	comma-separated list of symbols to include. This defaults to '*' (i.e.
	all symbols)

	FIXME: document inclusion/exclusion rules with --exclude-symbols

	FIXME: do we support e.g. "*_idle" or just "*" ?

5.16 --exclude-symbols <symbollist>

	comma-separated list of symbols to exclude. Excluded symbols are neither
	included in calculations nor output

5.17 We accept mangled or unmangled names, but not a mix of the two. First the search
     tries to find verbatim symbol names. If exactly 0 matches are found, then
     the search is repeated, but each match is done against a demangled symbol name. Thus
     mixing mangled with unmangled names will not work, and this should be documented.
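
The two-pass lookup of 5.17 can be sketched like this. match_symbols is a hypothetical name, and the demangler is supplied by the caller because the actual demangling routine is outside the scope of this example.

```python
def match_symbols(wanted, symbols, demangle):
    """Return the symbols selected by `wanted`.

    Pass 1 compares verbatim names; only if that finds nothing does
    pass 2 compare against the demangled form of each symbol.
    """
    wanted = set(wanted)
    verbatim = [s for s in symbols if s in wanted]
    if verbatim:
        return verbatim
    return [s for s in symbols if demangle(s) in wanted]
```

Because the second pass runs only when the first finds nothing, a list such as "main,foo()" selects only main: the verbatim pass succeeds and the demangled name is never tried.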
 
5.20 FIXME: should we consider an "all-kernel" (kernel + modules) profile ?

5.21 --merge <merge spec>

Unlike with the single-profile tools, it can be useful for us not to auto-merge
profile specs that yield more than one profile. For example, we might want to
keep two profiles of the same thing on CPU #0 and CPU #1 separate in the summary.

So we don't do it by default for opreport, but allow the user to specify merges
as follows :

	cpu
		Merge all cpu-specific profiles.

	tgid

		Merge all process-id specific profiles (this merges all,
		regardless of tgid vs. tid). Merging tgid implies merging tid.

	tid
		Merge all threads belonging to the same tgid

	lib
		Merge all app-specific library profiles.
 
	unitmask
		Merge all compatible unit mask differences

	all
		all the above
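
To illustrate the idea, merging can be seen as grouping profiles by a key from which the merged fields are dropped. The sketch below covers only cpu, tgid and tid. The lib and unitmask merges (and therefore all) are left out because their exact semantics are not pinned down here. The dict layout and the names are assumptions.

```python
from collections import defaultdict

KEY_FIELDS = ("image", "lib_image", "event", "count", "unit_mask", "tgid", "tid", "cpu")
# Merging tgid implies merging tid.
MERGE_FIELDS = {"cpu": {"cpu"}, "tgid": {"tgid", "tid"}, "tid": {"tid"}}

def merged_fields(merge_spec):
    """Translate a --merge spec into the set of key fields to ignore."""
    fields = set()
    for name in filter(None, merge_spec.split(",")):
        if name not in MERGE_FIELDS:
            raise ValueError(f"merge type not covered by this sketch: {name}")
        fields |= MERGE_FIELDS[name]
    return fields

def merge_profiles(profiles, merge_spec):
    """Sum the sample counts of profiles that differ only in merged fields."""
    ignore = merged_fields(merge_spec)
    totals = defaultdict(int)
    for p in profiles:
        key = tuple("all" if f in ignore else p[f] for f in KEY_FIELDS)
        totals[key] += p["samples"]
    return dict(totals)
```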

5.22 --no-header
	remove all header lines from the output

5.23 --long-filenames
	use the full path instead of the basename for every filename in the
	output (basenames being the default)

5.24 --accumulated
	all sample counts shown in the output are accumulated sample counts

5.25 --reverse-sort
	reverse the sort order

5.26 --global-percent
	show sample count percentages relative to the whole sample count
	for dependent images. Meaningless with --symbols.

5.28 --output-file <filename>
	send all output to filename rather than to stdout; this is just an
	alternative to opreport .... > filename

5.29 --show-address

	Show VMA, off by default

6. opgprof
-----------

Split off the gprof dumping part of old oprofpp, since it didn't really
fit.

6.1 opgprof <file>
	--output-file (-o) filename
	--threshold (-t)
	--image-path (-p)
	<profile spec>

6.2 --output-file filename
	set the output filename; if not specified, the output filename is gmon.out

6.3 <file>
	acts as 3.18

6.4 opgprof does automatic merging on the set of selected sample files; it's an
error if more than one unmergeable profile exists. opgprof looks in dependent
files (shared lib/module etc.) only if necessary
 
7. opannotate
--------------
 
née op_to_source. Our basic source / asm annotation tool.

7.1 opannotate
	--output-dir <dir> (-o)
	--search-dirs <dirs> (-d)
	--base-dirs <dirs> (-b)
	--include-file <filelist> (N/A)
	--exclude-file <filelist> (N/A)
	--source (-s)
	--assembly (-a)
	--objdump-params (N/A)
	--include-symbols <symbols list> (-i)
	--exclude-symbols <symbols list> (-e)
	--exclude-dependent (-x)
	--threshold (-t)
	<profile spec>

7.2 <deleted>
	
7.3 --source

	specifies the source-based per-file annotation. --search-dirs is
	used to find the source when debug info is relative. When --base-dirs
	is specified, a matching prefix is stripped, and then the remaining
	path is looked for in the given --search-dirs. --search-dirs is required
	if --base-dirs is specified.
	The directory structure of the source is replicated under --output-dir,
	with annotated files of the same name as the source file, and with full paths.
	Specifying a --source-dir == --output-dir is an error.
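
The source lookup described in 7.3 might be sketched as follows. locate_source is a hypothetical name. Taking the first matching base dir and using an absolute debug path as is when no base dir matches are assumptions. The exists parameter is there only so the logic can be tried without touching the filesystem.

```python
import os

def locate_source(debug_path, search_dirs=(), base_dirs=(), exists=os.path.isfile):
    """Return the source file to annotate, or None if it cannot be found."""
    if base_dirs and not search_dirs:
        raise ValueError("--search-dirs is required if --base-dirs is specified")
    relative = None
    for base in base_dirs:
        prefix = base.rstrip("/") + "/"
        if debug_path.startswith(prefix):
            relative = debug_path[len(prefix):]  # strip the matching prefix
            break
    if relative is None:
        if os.path.isabs(debug_path):
            return debug_path if exists(debug_path) else None
        relative = debug_path  # relative debug info
    for directory in search_dirs:
        candidate = os.path.join(directory, relative)
        if exists(candidate):
            return candidate
    return None
```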

7.4 The source annotation works by prefixing all lines with a certain size comment, e.g.

 99 (3.2%)   :char some_function(void)
             :{ /* some_function total: 366220 19.22% */
             :       int i;
             :
 73 (2.3%)   :       for (i = 0; i < 4; i++) {
...

This is to keep the line numbers the same.
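
A small sketch of the fixed-width prefix shown in 7.4. The width of 13 columns is read off the example above. The function names and passing the total sample count explicitly are assumptions. A note longer than the width simply pushes the ':' to the right.

```python
def annotate_line(source_line, samples, total, width=13):
    """Prefix one source line with its sample count and percentage."""
    note = f" {samples} ({100.0 * samples / total:.1f}%)" if samples else ""
    return note.ljust(width) + ":" + source_line

def annotate_file(lines, counts):
    """Annotate every line; counts maps 1-based line numbers to samples.

    Exactly one output line is produced per input line, so line numbers
    are unchanged.
    """
    total = sum(counts.values())
    return [annotate_line(line, counts.get(number, 0), total)
            for number, line in enumerate(lines, start=1)]
```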

7.5. A postfix comment is also added at the end of the file, along the lines of :

/* Generated from /path/to/some_function.c
 * CPU_CLK_UNHALTED (cycles CPU is not halted) with a count of 30000 
 * blah blah
 */
 
7.6 --assembly

	specifies the objdump annotated asm output.  The output looks
	as above, but all results are written to stdout.

7.7 --objdump-params

	Additional (?) params to objdump, e.g. for Intel syntax asm

7.8 --source or --assembly or both must be specified

8. opstack
----------

8.1 overall

Callgraph viewer.

8.2 opstack <flags> <profile-spec>

8.1 -x, --exclude-dependent

	Exclude all dependent images from the profile. Not honored currently.

8.2 -m, --merge=cpu,lib,tid,tgid,unitmask,all

	see 5.21. Not honored currently

11. opcontrol
--------------

11.1 --image /foo/bar/a_binary[,/foo/bar/binary2,...]

	Allows filtering samples by image name. If a_binary is a primary binary,
	all dependents are also profiled (iff --separate=library, ditto for
	--separate=kernel). If a_binary is a shared library or module (FIXME:
	module naming), only this dependent is profiled.
	--separate=library|kernel works in the usual way. The above rules also
	apply when --separate=thread. FIXME: globbing is currently disallowed, as
	it's difficult to implement efficiently