Diff of /pp_interface [3a56f5] .. [f44fee]

--- a/pp_interface
+++ b/pp_interface
@@ -1,8 +1,10 @@
 
-Draft version 3
+Draft version 4
  
 Beta.
  
+General FIXME: call graphs - for a later spec ?
+
 1. Basic principles
 -------------------
 
@@ -20,125 +22,129 @@
 2. Common basic options
 -----------------------
 
---help (-?)
+2.1 --help (-?)
 
 	As generated by popt.
 
---usage
+2.2 --usage
 
 	As generated by popt.
 
---version (-v) 
-
-	As currently, "op_summary: oprofile 0.3cvs compiled on May 15 2002 16:43:40"
-
---verbose (-V)
+2.3 --version (-v) 
+
+	As currently, "opsummary: oprofile 0.3cvs compiled on May 15 2002 16:43:40"
+
+2.4 --verbose (-V)
 
 	Maybe we can have more fine-grained --debug later, but for now this will do
 
---no-verbose
+2.5 --no-verbose
 
 	Default, for symmetry only 
 
---no-demangle (n/a)
+2.6 --no-demangle (n/a)
 
 	We demangle by default. This is for avoiding demangler bugs and the like.
 
---demangle
+2.7 --demangle
 
 	For symmetry only 
 
+2.8 FIXME: --smart-demangle - must make a decision
 
 3. Profile specifications
 -------------------------
 
 Each tool needs at least one sample file + associated binary image to operate on,
-and some tools (op_diff, op_summary) can have more than one. It would
+and some tools (opdiff, opsummary) can have more than one. It would
 be very useful for the user to be able to conveniently specify a subset of profiles,
 and we can use the same method for every tool.
 
-Each profile is a tuple of :
+3.1 Each profile is a tuple of :
 
 a) Session name
 b) Binary
 c) App binary (for shared libraries, e.g. /bin/ls using /lib/libc.so)
 d) Event name
 e) Event count
-f) parent process ID
-g) Process ID
+f) Group ID (tgid)
+g) Process ID (pid)
 h) CPU nr.
 i) the actual binary to bfd_open
 j) the actual samples file to use 
 
-Some of these may be "null" parameters, namely c),f),g),h),i),j) (and possibly e) depending
+3.2 Some of these may be "null" parameters, namely c),f),g),h),i),j) (and possibly e) depending
 on what we do about multiplexing).
 
 i) is a special case since it is not part of the profile as such. By default
-it is derived from b), but allowing its specification is needed for op_diff
+it is derived from b), but allowing its specification is needed for opdiff
 of two differently compiled binary + profile pairs. Similar goes for j)
 
 So if we provide a way to specify some set of profiles via the command line,
 we need to support each of these. My first idea is something like this :
 
-op_report session:run1 image:/usr/bin/oprofiled
+opreport session:run1 image:/usr/bin/oprofiled
 
 Here's a complete list of tags :
 
-	sample-file: <filename>
+3.3	sample-file: <filename>
 		An actual sample file. If this is given, no other tags are allowed,
 		with the exception of binary:
  
-	binary: <filename>
+3.4	binary: <filename>
 		The binary to use. This is incompatible with the image: parameter,
 		and may only be specified as part of a single-profile specification.
 		It may be used ONLY with sample-file:
 
-	session: <sessionlist>
+3.5	session: <sessionlist>
 		A comma-separated list of session names to resolve in. Absence of this
-		tag, unlike all others, means "the current session"
-
-	session-exclude: <sessionlist>
+		tag, unlike all others, means "the current session", equivalent to
+		specifying "session: current"
+
+3.6	session-exclude: <sessionlist>
 		A comma list of sessions to exclude ...
 	 
-	image: <imagelist>
+3.7	image: <imagelist>
 		Comma list of image names to resolve. Each entry may be relative
 		path, or wordexp() style name, or full path, e.g.
 	
 		'image:/usr/bin/oprofiled,*op*,./oprofpp'
 
-		"image:" is default tag, allowing old-style "op_report /usr/bin/oprofiled"
-
-	image-exclude: <imagelist> 
+		"image:" is default tag, allowing old-style "opreport /usr/bin/oprofiled"
+
+3.8	image-exclude: <imagelist> 
 		Comma list of images to exclude from the final list
 
-	lib-image: <imagelist>
+3.9	lib-image: <imagelist>
 		Comma list of library images to consider. Thus when looking for glibc samples
-		due to oprofiled, op_report /usr/bin/oprofiled 'lib-image:*libc*' will show only
+		due to oprofiled, opreport /usr/bin/oprofiled 'lib-image:*libc*' will show only
 		those. If not specified, these samples are NOT included by default, EXCEPT
 		in the presence of automerging if image:*libc*, for example.
 
-	lib-image-exclude: <imagelist>
+3.10	lib-image-exclude: <imagelist>
 		Similar to image-exclude ...
  
-	event: event
+3.11	event: event
 		Specify a particular event type e.g. CPU_CLK_UNHALTED
 
-	count: count
+3.12	count: count
 		Specify the count for that event e.g. 30000
 
-	unit-mask: mask
+3.13	unit-mask: mask
 		Specify the unit mask for that event. Encoded as numerical value.
 
-	cpu: <cpulist>
+3.14	cpu: <cpulist>
 		Comma-separated list of cpu numbers to consider
 
-	pid: <pidspeclist>
-		Comma list of pid's to consider. A pidspec is either a literal
-		process id (454), or a process id plus a '+', to indicate all
-		child processes (recursive) as well as that pid (e.g. "454+")
-
-If a tag is not specified, then it will match all values (with the exception of
-session:).
+3.15	pid: <pidlist>
+		Comma list of pid's to consider. This defines a thread group
+		including all child threads (task->tgid)
+
+3.16	tid: <tidlist>
+		A specific thread (task->pid)
+
+3.17 If a tag is not specified, then it will match all values (with the exception of
+     session: as noted).
  
 Note: we can use a common method for comma-separated lists :
 
@@ -153,35 +159,39 @@
 to the daemon too. But we must allow multiple event,count pairs
 so the syntax must be different.
 
-
-3a. sample file mangling
-------------------------
+3.18 FIXME: how to find 2.5 modules ? Also, -p/-P equivalents
+
+Sample file mangling
+--------------------
 
 We need to encode a number of the above things in the filename.
-The proposed format is
-
-$SAMPLES_DIR/sessionname/event.count.unitmask.ppid.pid.cpu.mangledprimaryname}mangledlibname
+
+3.19 The proposed format is
+
+$SAMPLES_DIR/session/path/to/binary/event.count.unitmask.tgid.pid.cpu
+$SAMPLES_DIR/session/path/to/binary/dependent/path/to/lib/event.count.unitmask.tgid.pid.cpu
+
+The latter is to be used when using --separate=lib|kernel
 
 (in decimal where relevant) 
 For example,
 
-CPU_CLK_UNHALTED.30000.0.434.436.0.}bin}ls}}lib}libc.so
-
-mangledlib/primaryname will be absolute and fully resolved pathnames
-
-The parts ppid,pid,cpu may, instead of a number, have the
-value "all", for when profiling method is not splitting on
-(p)pid/cpu
-
-mangledlibname may be ""
- 
-3b. Resolution algorithm
-------------------------
-
-session=as given or default to currentsession 
+$SAMPLES_DIR/current/bin/ls/dependent/lib/libc.so/CPU_CLK_UNHALTED.30000.0.434.434.0
+
+3.20 The paths must be fully resolved of symbolic links.
+
+3.21 The parts tgid,pid,cpu may, instead of a number, have the
+value "all", for when profiling method is not splitting
+
+3.22 The current session is defined as "current", all other sessions are free-form
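
The path layout proposed in 3.19 to 3.22 can be decoded mechanically. The following is a minimal sketch, not part of the spec: the names `SampleFile` and `parse_sample_path` are hypothetical, and it assumes that no directory in the binary's own path is literally called "dependent" and that the input is the path below $SAMPLES_DIR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SampleFile:
    session: str
    image: str                 # path of the profiled binary
    lib_image: Optional[str]   # dependent library, or None
    event: str
    count: int
    unit_mask: int
    tgid: Optional[int]        # None stands for "all"
    pid: Optional[int]
    cpu: Optional[int]


def _num(field: str) -> Optional[int]:
    """Decode a decimal field, mapping the literal "all" to None."""
    return None if field == "all" else int(field)


def parse_sample_path(relpath: str) -> SampleFile:
    """Split a sample file path (relative to $SAMPLES_DIR) into its parts."""
    parts = relpath.strip("/").split("/")
    session, middle, leaf = parts[0], parts[1:-1], parts[-1]
    if "dependent" in middle:
        # session/path/to/binary/dependent/path/to/lib/<leaf>
        i = middle.index("dependent")
        image, lib = middle[:i], middle[i + 1:]
    else:
        image, lib = middle, []
    # The last five dot-separated fields are fixed; the rest is the event name.
    event, count, unit_mask, tgid, pid, cpu = leaf.rsplit(".", 5)
    return SampleFile(
        session=session,
        image="/" + "/".join(image),
        lib_image="/" + "/".join(lib) if lib else None,
        event=event,
        count=int(count),
        unit_mask=int(unit_mask),
        tgid=_num(tgid),
        pid=_num(pid),
        cpu=_num(cpu),
    )
```

Applied to the example path above, this would give session "current", image "/bin/ls" and lib_image "/lib/libc.so".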
+
+Resolution algorithm
+--------------------
+
+session=as given or default to "current"
 all others set to "*" (match on anything) by default
 
-for each (file in session(s)) {
+3.33 for each (file in session(s)) {
 	next if samplefile && samplefile!=file;
 	next if image && !included(file.image, imagelist)
 	next if image-exclude && included(file.image, image-exclude)
@@ -192,14 +202,13 @@
 	next if unitmask && file.unit_mask != unitmask
 	next if cpu && !included(file.cpu, cpulist)
 	next if pid && !included(file.pid, pidlist)
-	// if it's in the child tree of the give parent process id, accept it
-	next if parentpid && !transitive_include(file.ppid, parentpid-graph)
+	next if tid && !included(file.tid, tidlist)
  
 	add_to_profile_list();
 }
 
-if (one_profile_allowed_only) { 
-	// merge on ppid,pid,cpu
+3.23 if (one_profile_allowed_only) { 
+	// merge on tgid,pid,cpu
 	// and also lib-image, iff the lib-image
 	// is specified as imagename
 	merge
@@ -213,14 +222,14 @@
 The above is rather tedious syntax, and is likely to stay that way
 even with improvements. So we can have a command line tool / config file, e.g. :
 
-# cat .oprofile/aliases
+4.1 # cat .oprofile/aliases
 run1-oprofiled session:run1 image:/usr/bin/oprofiled
 
-# op_report run1-oprofiled
+4.2 # opreport run1-oprofiled
  
 for convenience. This is essentially a stored query facility ...
  
-5. op_report
+5. opreport
 ------------
 
 née oprofpp.
@@ -229,22 +238,32 @@
 The profile spec must resolve to exactly one profile, unless it is possible to
 auto-merge the resolved list from the profile spec (see below). 
 
-op_report <flags> <profile-spec>
+5.1 opreport <flags> <profile-spec>
 	--symbols (-l)
+	--debug (-b)
+	--details (-a)
 	--output <output spec> (-o) 
 	--threshold <threshold> (-t)
+	--include-dependent (-n)
 	--sort <sort spec> (-s)
 	--ignore-symbols <symbollist> (-i)
 	--exclude-symbols <symbollist> (-e)
 	<profile spec>
 
---symbols <symbol-list>
+5.2 --include-dependent
+
+	Include libs, kernel, modules, etc.
+
+5.3 --symbols <symbol-list>
 
 	This is the standard sorted summary produced by oprofpp, as specified
-by -i, -e, -s, -o. The default for op_report will be "op_report --symbols \*"
-or something similar.
- 
---output <outputspec>
+by -i, -e, -s, -o.
+
+	5.3.1 The default for opreport will be "opreport --symbols \*"
+
+	5.3.2 This must include --dependent symbols  when --include-dependent is set
+ 
+5.4 --output <outputspec>
 
     What fields to output in what order, as follows :
 
@@ -269,15 +288,34 @@
  
 0x8004434 433 34343 0.3% 4.5% blah(int) blah_Qv /home/steve.c:3 steve.c:3 /home/steve steve 
  
-The 'h' output gives a leading free-form title describing the profile, and
-column headers as appropriate.
- 
-For 'd', the columns are the same, except that the "symbol" fields may show
-xmalloc+0x34, for example
-
-'x' is reserved for now, and will be specified later.
- 
---sort <sortspec>
+5.5 The 'h' output gives a leading free-form title describing the profile, and
+    column headers as appropriate.
+ 
+5.6 For 'd', the columns are the same, except that the "symbol" fields may show
+    xmalloc+0x34, for example
+
+5.7 'x' is reserved for now, and will be specified later.
+
+5.8 The default format is hvspn. The following options modify this :
+
+5.9	--debug
+
+	Shows line number and file information (equivalent to 'hvspnL')
+
+5.10	--details
+
+	Shows line number and file information, and per-address values
+	(equivalent to 'hvspnLd')
+
+5.11	--include-dependent
+
+	Default format becomes 'hvspni'
+
+5.12 Specification of either --debug OR --details AND --output is a fatal error.
+
+5.13 Note: --help must output this information and document the default
+ 
+5.14 --sort <sortspec>
 
 	What field to sort by, as follows :
  
@@ -292,44 +330,44 @@
  
 	r	sort whatever in reverse
  
---ignore-symbols <symbollist>
+5.15 --ignore-symbols <symbollist>
 
 	comma-separated list of symbols to ignore. Ignored symbols are included
 	in percentage calculations etc, but not output
 
---exclude-symbols <symbollist>
+5.16 --exclude-symbols <symbollist>
 
 	comma-separated list of symbols to exclude. Excluded symbols are neither
 	included in calculations nor output
 
-We accept mangled or unmangled names, but not a mix of the two. First the search
-tries to find verbatim symbol names. If exactly 0 matches are found, then every
-the search is repeated, but each match is done against a demangled symbol name. Thus
-mixing mangled with unmangled names will not work, and should be documented.
- 
---threshold <threshold>
+5.17 We accept mangled or unmangled names, but not a mix of the two. First the search
+     tries to find verbatim symbol names. If exactly 0 matches are found, then
+     the search is repeated, but each match is done against a demangled symbol name. Thus
+     mixing mangled with unmangled names will not work, and should be documented.
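
The two-pass lookup described in 5.17 can be sketched as below. This is an illustration only: `match_symbols` is a hypothetical name, and the demangler is passed in as a plain function rather than tied to any real demangling library.

```python
from typing import Callable, Iterable, List


def match_symbols(requested: Iterable[str],
                  symbols: Iterable[str],
                  demangle: Callable[[str], str]) -> List[str]:
    """Return the entries of `symbols` selected by `requested`.

    Pass 1 compares the names verbatim. Only if that yields exactly
    zero matches is pass 2 run, comparing against demangled names.
    """
    wanted = set(requested)
    symbols = list(symbols)
    verbatim = [s for s in symbols if s in wanted]
    if verbatim:
        return verbatim
    return [s for s in symbols if demangle(s) in wanted]
```

Because the second pass only runs when the first finds nothing, a list that mixes mangled and unmangled names silently loses the unmangled entries, which is the behaviour 5.17 says should be documented.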
+ 
+5.18 --threshold <threshold>
 
 	Threshold of minimum values before an entry is printed. Either a sample count
 	or a percentage, e.g. "1", "0.1%"
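
A threshold argument in either form could be handled as in the sketch below. The names `parse_threshold` and `passes` are hypothetical, and treating an entry that exactly equals the threshold as printable is an assumption, since the text only says "minimum values".

```python
from fractions import Fraction
from typing import Tuple


def parse_threshold(text: str) -> Tuple[Fraction, bool]:
    """Return (value, is_percent) for arguments such as "1" or "0.1%"."""
    text = text.strip()
    if text.endswith("%"):
        return Fraction(text[:-1]), True
    return Fraction(text), False


def passes(count: int, total: int, threshold: str) -> bool:
    """True if an entry with `count` samples out of `total` is printed."""
    value, is_percent = parse_threshold(threshold)
    if is_percent:
        # count / total >= value / 100, kept in exact arithmetic
        return count * 100 >= value * total
    return count >= value
```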
  
-6. op_gprof
+6. opgprof
 -----------
 
 Split off the gprof dumping part of old oprofpp, since it didn't really
 fit.
 
-op_gprof
+6.1 opgprof
 	--gprof-file (-o) <file>
 	<profile spec>
 
-Default to gmon.out, making a back up if it exists already 
- 
-7. op_annotate
+6.2 Default to gmon.out, making a back up if it exists already 
+ 
+7. opannotate
 --------------
  
 née op_to_source. Our basic source / asm annotation tool.
 
-op_annotate
+7.1 opannotate
 	--source-dir <dir> (-d)
 	--output-dir <dir> (-o)
 	--base-dir <dir> (-b)
@@ -339,11 +377,12 @@
 	--source (-s)
 	--assembly (-a)
 	--mixed (-m)
+	--objdump-params (-p)
 	<profile spec>
 
-Profile spec must produce exactly 1 profile after auto-merge etc.
-
---source
+7.2 Profile spec must produce exactly 1 profile after auto-merge etc.
+
+7.3 --source
 
 	specifies the source-based per-file annotation. --source-dir is
 	used to find the source when debug info is relative. When --base-dir
@@ -354,7 +393,7 @@
 	Specifying a --source-dir == --output-dir is an error. Samples
 	in files outside of --source-dir are ignored with a message.
 
-The source annotation works by prefixing all lines with a certain size comment, e.g.
+7.4 The source annotation works by prefixing all lines with a certain size comment, e.g.
 
 /* 99 (3.2%)   */ char some_function(void)
 /*             */ {
@@ -363,30 +402,37 @@
 /* 73 (2.3%)   */      for (i = 0; i < 4; i++) {
 ...
  
-This is to keep the line numbers the same. A post fix comment is also
-added at the end of the file, along the lines of :
+7.5 This is to keep the line numbers the same. A post fix comment is also
+    added at the end of the file, along the lines of :
 
 /* Generated from /path/to/some_function.c
  * CPU_CLK_UNHALTED (cycles CPU is not halted) with a count of 30000 
  * blah blah
  */
  
---assembly
-
-	specifies the objdump annotated asm output
-
---mixed
+7.6 --assembly
+
+	specifies the objdump annotated asm output.  The output looks
+	as above.
+
+7.7 --mixed
 
 	specified the src/asm objdump annotated output
  
-8. op_summary
+7.8 --objdump-params
+
+	Additional (?) params to objdump, e.g. for Intel syntax asm
+
+8. opsummary
 -------------
 
 nee op_time. This is used for either global symbol- or image-based summaries
 inside of one session only.
 
-op_summary 
+8.1 opsummary 
 	--symbol-summary (-l)
+	--debug (-b)
+	--details (-a)
 	--output <output spec> (-o) 
 	--sort <sort spec> (-s)
 	--merge <merge spec>
@@ -397,17 +443,18 @@
 
 All as before.
 
---symbol-summary
+8.2 --symbol-summary
 
 Turns on symbol-based global summary. Make sure to have a 
-useful default output spec in these cases.
- 
---merge <merge spec>
+useful default output spec in these cases. --debug and --details
+are only available with this option.
+ 
+8.3 --merge <merge spec>
 
 Unlike the single-profile tools, it can be useful for us to not auto-merge
 profile specs giving more than one profile result. Two profiles of the same
 thing on CPU #0, and CPU #1, we might want to keep separate in the summary.  So
-we don't do it by default for op_summary, but allow the user to specify merges
+we don't do it by default for opsummary, but allow the user to specify merges
 as follows :
 
 	cpu
@@ -425,11 +472,13 @@
 	unitmask
 
 		Merge all compatible unit mask differences
- 
-9. op_diff
+
+8.4 FIXME: should we consider an "all-kernel" (kernel + modules) profile ?
+ 
+9. opdiff
 ----------
 
-op_diff is a tool for comparing exactly two profile-executable
+opdiff is a tool for comparing exactly two profile-executable
 pairs, on a symbol by symbol level.
 
 Comparing global summaries is left for a future
@@ -437,7 +486,7 @@
  
 Of the general form :
 
-op_diff
+9.1 opdiff
 	--ignore-symbols <symbollist> (-i)
 	--exclude-symbols <symbollist> (-e)
 	--output <outputspec> (-o)
@@ -449,18 +498,18 @@
 auto-merging should take place as before, if needed. We also
 can use binary: in the profile spec.
 
-We auto-merge as with op_report, the lhs,rhs or the ":"
+9.2 We auto-merge as with opreport, the lhs,rhs or the ":"
 must each specify exactly one profile
  
-Two profile specs are incompatible iff
+9.3 Two profile specs are incompatible iff
 
 1. event differs or
 2. count differs
 
-NOTE: we consider all unit masks to be compatible, it is up
+9.4 NOTE: we consider all unit masks to be compatible, it is up
 to the user to do this sensibly.
  
---threshold
+9.5 --threshold
 
 	Slightly different. By default the threshold is applied
 	absolutely, so --threshold 1 does not hide symbols with -232
@@ -468,7 +517,7 @@
 	--threshold +0 hides all negative entries, and --threshold -0
 	hides all positive ones.
 
---output <outputspec>
+9.6 --output <outputspec>
 
     What fields to output in what order, as follows :
 
@@ -494,7 +543,7 @@
 
 0.2% (+0.1)
  
---sort <sortspec>
+9.7 --sort <sortspec>
 
 	What field to sort by, as follows :
  
@@ -524,13 +573,13 @@
 then we should automatically merge them into one profile for the purposes
 required, along with a note to the user of what we've done.
 
-Two profile specs are unmergable iff 
+10.1 Two profile specs are unmergable iff 
 
 1. event differs or
 2. count differs or
 3. image or lib-image differs
 
-Also consider :
+10.2 Also consider :
 
 	/lib/libc.so
 		/usr/bin/oprofiled
@@ -539,8 +588,8 @@
 
 We should expect something like :
 
-# op_report /lib/libc.so
-op_report: Merging the following profiles into this report :
+# opreport /lib/libc.so
+opreport: Merging the following profiles into this report :
 	/lib/libc.so (/usr/bin/oprofiled)
 	/lib/libc.so (/bin/ls)
 	/lib/libc.so (/usr/bin/apache)
@@ -549,8 +598,8 @@
 So the message would ideally indicate exactly the differing parts
 that got merged. (it is a matter of documentation + user education
 to tell them that if they want just the apache profile for libc,
-they do op_report /usr/bin/apache lib-image:/lib/libc.so )
-
-Sometimes we might not want to them to merge though, e.g. op_summary to
+they do opreport /usr/bin/apache lib-image:/lib/libc.so )
+
+Sometimes we might not want them to merge though, e.g. opsummary to
 show the differing process IDs separately for a number of images. So that's
 why we have --merge where appropriate.
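
As a rough sketch of the rule in 10.1, profiles can be grouped on exactly the fields that make them unmergable. The `Profile` tuple and the function names are hypothetical, and `app` is only an illustrative name for the app binary of section 3.

```python
from collections import defaultdict
from typing import List, NamedTuple, Optional


class Profile(NamedTuple):
    event: str
    count: int
    image: str
    lib_image: Optional[str]
    app: Optional[str]   # app binary; may differ between merged profiles
    pid: Optional[int]
    cpu: Optional[int]


def merge_key(p: Profile):
    """The fields that must all agree for two profiles to be mergeable."""
    return (p.event, p.count, p.image, p.lib_image)


def mergeable(a: Profile, b: Profile) -> bool:
    return merge_key(a) == merge_key(b)


def merge_groups(profiles: List[Profile]) -> List[List[Profile]]:
    """Partition profiles into lists that could each be merged into one."""
    groups = defaultdict(list)
    for p in profiles:
        groups[merge_key(p)].append(p)
    return list(groups.values())
```

A tool that requires exactly one profile would accept the spec only if `merge_groups` returns a single group, and could report the differing app, pid and cpu values of that group in its "Merging the following profiles" message.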