This is an (incomplete) list of some of the stuff we want to look at doing.

If you're interested in hacking on any of these, please contact the list first
for some pointers and/or read HACKING and doc/CodingStyle.

0.8.2 release

 o amd64 32 bit build needs a sys32_lookup_dcookie() translator in the
 o op_bfd.cpp and separate debuginfo file: get_symbols_from_file() symbols
  outside .text section get a wrong offset
 o op_bfd.cpp:get_linenr() see FIXME about the use of ibfd/dbfd, needs test.
 o need --callgraph in oprof_start
 o with opstack I can get "warning: /no-vmlinux could not be found.".
   Should be smarter ?
 o opstack gives weird output for an image with no symbols:

  690      100.000  27697    100.000      (no symbols)
690       0.9446  27697    37.9188      (no symbols)
  4830     100.000  27697    100.000      (no symbols)
  0              0  0              0      memcpy
  0              0  0              0 __i686.get_pc_thunk.bx
  0              0  0              0 __errno_location
  0              0  0              0      (no symbols)
  0              0  0              0      (no symbols)

 o stress test opstack: compile a Big Application w/o frame pointer and see
   how the driver and opstack react.

0.9 release

 o opstack output needs to be reworked to be more readable (see
   "callgraph support comments" on mailing list with Eric Sharkey)
 o opdiff

Before 1.0 little stuff

 o if ev67 and 2.4 x86-64 are not fixed, back them out
 o TRACE_END is no longer sent by the driver, keep it a bit for compatibility
 o callgraph patch: better way to skip ignored backtrace ?
 o child sample count for recursive function is not a bug but a feature, it
  needs documentation. (What does this refer to exactly?)
 o callgraph_container.cpp: the inner loop in add() is very fragile!
 o opcontrol --reset should avoid reloading the module if it's unloaded
 o oprofiled.log now contains various statistics about lost samples etc. from
  the driver. Post-profile tools must parse them and warn if necessary; the
  warning must include a proposed workaround. Users need this: if nothing seems
  wrong, people are unlikely to look at oprofiled.log (I ran oprofile on 2.6.1
  for 2 weeks before noticing that at 30000 I lost a lot of samples; the profile
  seemed ok due to the randomization of lost samples). As developers we need
  that too: currently we have no clear idea of the behavior on different
  archs, NUMA etc. Not perfect, because if the profiler is running,
  oprofiled.log will show those warnings only after the first alarm signal. I
  think we must dump the statistics after each opcontrol --dump to avoid that.
 o libop/op_string.c is dumb (skip_ws, skip_nonws): we can't use them
  to parse a file and report correct line numbers in errors, because skip_ws
  skips \n (so op_events.c reports the correct line number only if there is no
  empty line in the events files)
 o document OPROFILE_EVENT_FILES_DIR (isn't it for test purposes only ? does it
   need documentation ?)
 o odb_insert() can fail on ftruncate or mremap() in db_manage.c but we don't
  try to recover gracefully.
 o output column shortname headers for opreport -l
 o is relative_to_absolute_path guaranteeing a trailing '/' documented ?
 o move oprofiled.log to OP_SAMPLE_DIR/current ?
 o --buffer-size is useless on 2.5 without tuning of watershed
 o pp tools must handle samples count overflow (marked as (unsigned)-1)
 o the way we show kernel modules in 2.5 is not very obvious - "/oprofile"
 o oparchive would be more useful with a --root= option to allow profiling
  on a small box, nfs mounting / on another box, and transferring the sample
  files and binaries to a bigger box for analysis. There is also a problem in
  oparchive: you can use session: to get the right path to the sample files,
  but the oprofiled.log and abi file paths are hardcoded to /var/lib/oprofile.
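
For the op_string.c item above (skip_ws swallowing \n), one possible direction
is a skipper that counts the newlines it consumes, so the events file parser
can keep an accurate line number. This is only a sketch: skip_ws_counting is a
hypothetical name and signature, not an existing libop function.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical helper, not part of libop: skip leading whitespace the way
// skip_ws() does, but count every '\n' consumed so the caller can report
// the right line number in parse errors.
inline char const * skip_ws_counting(char const * c, std::size_t & line)
{
	assert(c);
	while (*c != '\0' && std::isspace(static_cast<unsigned char>(*c))) {
		if (*c == '\n')
			++line;
		++c;
	}
	return c;
}
```

The caller owns the line counter, so empty lines in an events file no longer
throw the reported line number off.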


 o the docs should mention the default event for each arch somewhere
 o more discussion of problematic code needs to go in the "interpreting" section. 
 o document gcc 2.95 and linenr info problems especially for inline functions
 o audit oprof_start for security + then document sudo
 o finish the internals manual

General checks to make
 o rgrep FIXME
 o valgrind (--show-reachable=yes --leak-check=yes)
 o audit to track unnecessary include <>
 o gcc 3.0/3.x compile
 o Qt2/3 check, no Qt check
 o verify builds (modversions, kernel versions, athlon etc.). I have the
  necessary stuff to check kernel versions/configurations on PIII core (Phil)
 o use nm and a little script to track unused functions
 o test it to hell and back
 o compile all C++ programs with STL_port and test them (gcc 3.4 contains a
   debug mode too but std::string iterators are not checked)
 o There are probably places in the post-profile tools where looking at errno
   would give better error messages.
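
For the errno item above, a minimal sketch of what "looking at errno" could
mean in a post-profile tool. Both function names are hypothetical and do not
exist in the tree; the only real calls are POSIX open()/close() and
std::strerror().

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: combine a context string with the text for an errno
// value, e.g. "cannot open foo: No such file or directory".
inline std::string errno_message(std::string const & what, int err)
{
	assert(!what.empty());
	return what + ": " + std::strerror(err);
}

// Hypothetical example: returns an empty string if the file can be opened
// for reading, otherwise a message that says why it could not.
inline std::string describe_open_failure(std::string const & path)
{
	int const fd = ::open(path.c_str(), O_RDONLY);
	if (fd >= 0) {
		::close(fd);
		return std::string();
	}
	int const err = errno;	// save it before another call can clobber it
	return errno_message("cannot open " + path, err);
}
```

The point is to copy errno right after the failing call and carry it into the
message, instead of printing a bare "cannot open file".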

 o zwane's problem with a wrong text offset showed an interesting problem: if
  op_bfd.cpp gets any symbol below the text offset for vmlinux or a module then
  profile_t::sample_range() returns a pair of iterators with first > second
  and we throw
 o lib-image: and image: behavior depends on --separate=. With
  --separate=library, opreport "lib-image:*libc*" --merge=lib works but
  opreport "image:*libc*" --merge=lib does not, whilst the behavior is reversed
  with --separate=none. Do we need to handle this ?
 o dependencies between profile_container.h, symbol_container.h and
  sample_container.h are becoming more and more ugly; I needed to include them
  in a specific order in some sources.
 o add event aliases for common things like icache misses. We must start to
  think about metrics, including simple ones like an event alias mapped to two
  or more events and interpreted specially by user space tools, e.g. using the
  ratio of samples; more tricky will be to select an event used as call count
  (no cg on it) and used to emulate the call count field in gprof. I think this
  is an after-1.0 thing but event aliases must be specified in a way allowing such
 o oparchive could do with an --exclude-dependent option
 o do we need an opreport like opstack (showing caller/callee at binary
  boundary not symbols) ?
 o we should notice an opcontrol config change (--separate etc.) and
   auto-restart the daemon if necessary (Run)
 o we can add lots more unit tests yet
 o remove 2.2 / gcc 2.91 support ?
 o Itanium event constraints are not implemented
 o side-by-side opreport output (--compare - needs UI spec) ???
 o can we log samples going to anonymous mappings by using
  - 2.6: one fake sample file for all anonymous samples, cookie == 0 means use
  this special sample file. (cookie == -1 or -2 as magic value ?)
  - 2.4: we must pass the daemon a note about exec anon mappings and create one
  fake sample file per anon mapping, specially named like 
 o GUI still has a physical-counter interface, should have a general one
   like opcontrol --event
 o I think we should have the ability to have *fixed* width headers, e.g. :

vma      samples  cum. samples  %           cum. %     symbol name             image name              app name
0804c350 64582    64582         35.0757     35.0757    odb_insert              /usr/ /usr/local/oprofile-pp/bin/oprofiled

  Note the ellipsis
 o should we make the sighup handler re-read counter config and re-start profiling too ?
 o improve --smart-demangle
	o allow users to add their own patterns in user.pat, document it.
	o hard code the ${typename} regular definition to remove all current limitations (difficult, perhaps after 1.0 ?).
 o oprof_start dialog size is too small initially
 o oprof_start key movement through events doesn't change help text
 o i18n. We need a good formatter, and also remember format_percent()
 o opannotate --source --output-dir=~moz/op/ /usr/bin/oprofiled
   will fail because the ~ is not expanded (no space around it) (popt bug I say)
 o cpu names instead of numbers in 2.4 module/ ?
 o remove 1 and 2 magic numbers for oprof_ready
 o adapt Anton's patch for handling non-symbolled libraries ? (nowadays C++
  anon namespace symbols are static, 3.4 iirc, so with recent distros we are
  more likely to get problems with a "fallback to dynamic symbols" approach)
 o use standard C integer types from <stdint.h>: int32_t, int16_t etc.
 o event multiplexing for real
 o XML output
 o profile the NMI handler code
 o merge sample files into one big report (like vtune can do repeated runs)
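
For the fixed-width headers item above, a rough sketch of how one column
could be fitted to a fixed width. fit_column is a hypothetical helper, not
something in the pp tools, and where the ellipsis goes is a guess: this
version keeps the tail of the string, on the assumption that the end of a
path is the informative part.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: render `s` in exactly `width` characters.
// Short strings are padded with spaces on the right; long strings keep
// their tail and get a leading "...".
inline std::string fit_column(std::string const & s, std::size_t width)
{
	static std::string const dots("...");
	assert(width > dots.size());
	if (s.size() <= width)
		return s + std::string(width - s.size(), ' ');
	return dots + s.substr(s.size() - (width - dots.size()));
}
```

With a width of 16, "/usr/local/oprofile-pp/bin/oprofiled" would come out
as "...bin/oprofiled", so the columns to the right stay aligned.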