File Date Author Commit
 example_output 2011-08-12 Romain Romain [207758] Initial commit
 LICENSE 2011-08-12 Romain Romain [207758] Initial commit
 Makefile_example 2011-08-12 Romain Romain [207758] Initial commit
 README 2011-08-13 Romain Romain [8b867a] Updated email address
 analyzeProfile.py 2011-08-12 Romain Romain [207758] Initial commit
 generateProfilerFile.py 2011-08-12 Romain Romain [207758] Initial commit
 profiler.cpp 2011-08-12 Romain Romain [207758] Initial commit
 profiler.h 2011-08-12 Romain Romain [207758] Initial commit

Read Me

##
# License
##

This software is released under the BSD license. See LICENSE for more
information.


##
# Supporting the profiler
##

To support the profiler, each thread must have a '_profilerData' object
associated with it. This is done through a pthread thread-specific
key, '_profilerThreadData'.

You therefore need to:
	- Define this key in your main program and initialize it. This
		is accomplished with the following code
%%%%%
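/* Declare this in main.cpp (or elsewhere in your program) */
#ifdef RUNTIME_PROFILER
#include <pthread.h>
#include "profiler.h"

namespace rec { namespace profiler {
	/* Key destructor: pthreads calls it at thread exit with the thread's object */
	void _destroyProfilerData(void *d) {
		delete static_cast<rec::profiler::_profilerData*>(d);
	}
	pthread_key_t _profilerThreadData;
} } /* closes both namespaces */
#endif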

/* In the main function (or during initialization) */
#ifdef RUNTIME_PROFILER
pthread_key_create(&rec::profiler::_profilerThreadData, rec::profiler::_destroyProfilerData);
#endif
%%%%%
		Note that '_destroyProfilerData' is essential: the output file is
		written in the destructor of '_profilerData'. If you do not see
		anything in the output files, it is likely that this step was
		missed.

	- Associate a '_profilerData' object to each thread. This is
		accomplished by adding the following to your thread's initialization
		function
%%%%%
#ifdef RUNTIME_PROFILER
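	rec::profiler::_profilerData *d = new rec::profiler::_profilerData();

	/* Room for "profiler_", up to 10 digits of an unsigned int, and the terminator */
	char buffer[32];
	snprintf(buffer, sizeof(buffer), "profiler_%u", <THREAD_ID>);
	d->output = fopen(buffer, "w");
	assert(d->output != NULL);

	pthread_setspecific(rec::profiler::_profilerThreadData, (void*)d);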
#endif
%%%%%
		Note that '<THREAD_ID>' should be replaced by a thread identifier. The analyzer
		relies on the fact that the name of the profiler files will be '<something>_<id>'.
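
The mechanism behind both steps can be tried in isolation. The sketch below is
not part of the profiler: 'DemoData', 'demoKey', 'destroyDemoData', 'threadMain'
and 'runDemo' are hypothetical names, and 'DemoData' merely stands in for
'_profilerData'. It shows that the destructor registered with
'pthread_key_create' runs when a thread exits, which is the point at which the
profiler writes its output file.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for rec::profiler::_profilerData.
struct DemoData {
	int *flushed;                   // set to 1 by the destructor so the "flush" is observable
	~DemoData() { *flushed = 1; }   // the real object writes its output file here
};

static pthread_key_t demoKey;

// Key destructor: pthreads calls it at thread exit with the thread's non-NULL value.
extern "C" void destroyDemoData(void *d) {
	delete static_cast<DemoData*>(d);
}

// Thread initialization: attach one DemoData object to the calling thread.
extern "C" void *threadMain(void *arg) {
	DemoData *d = new DemoData();
	d->flushed = static_cast<int*>(arg);
	int rc = pthread_setspecific(demoKey, d);
	assert(rc == 0);
	(void)rc;
	return NULL;
}

// Creates the key, runs one thread, and reports whether its data was destroyed.
int runDemo() {
	int flushed = 0;
	pthread_t thread;
	int rc = pthread_key_create(&demoKey, destroyDemoData);
	assert(rc == 0);
	rc = pthread_create(&thread, NULL, threadMain, &flushed);
	assert(rc == 0);
	rc = pthread_join(thread, NULL);
	assert(rc == 0);
	(void)rc;
	pthread_key_delete(demoKey);
	return flushed;
}
```

Calling 'runDemo()' returns 1 once the worker thread has been joined. Key
destructors are not run when the process leaves through 'exit' or by returning
from 'main', so an object attached to the main thread is presumably written out
only if that thread ends with 'pthread_exit' or the object is deleted explicitly.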


##
# Instrumenting your code
##

The profiler provides the following macros:
	- START_PROFILE(name): causes 'name' to be associated
		with the block of code defined as the inner-most scope 
		encompassing the macro. Note that 'name' should not contain
		quotes. For example, use 'START_PROFILE(main)' to associate
		the name 'main' with the code being profiled
	- PAUSE_PROFILE: pauses timing measurements
	- RESUME_PROFILE: resumes timing measurements paused by 'PAUSE_PROFILE'.
		Every 'PAUSE_PROFILE' should be matched by a 'RESUME_PROFILE'
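
A minimal sketch of how the three macros might be placed is given below. The
function 'sumSquares' and the block names are made up for illustration, and the
empty fallback macros are an assumption of this example (they are not taken
from 'profiler.h'); they only let the file build when 'RUNTIME_PROFILER' is not
defined. The trailing semicolons are harmless whether or not the real macros
need them.

```cpp
#include <cassert>
#include <cstdio>

#ifdef RUNTIME_PROFILER
#include "profiler.h"
#else
/* Assumed fallbacks, not taken from the profiler: compile the macros away. */
#define START_PROFILE(name)
#define PAUSE_PROFILE
#define RESUME_PROFILE
#endif

long sumSquares(int n) {
	START_PROFILE(sumSquares);               // names the function body
	assert(n >= 0);
	long total = 0;
	for (int i = 1; i <= n; ++i) {
		START_PROFILE(sumSquares_iteration); // names the loop body (inner-most scope)
		total += static_cast<long>(i) * i;
	}
	PAUSE_PROFILE;                           // keep the printing out of the measurement
	std::printf("sum of squares up to %d: %ld\n", n, total);
	RESUME_PROFILE;
	return total;
}
```

With this placement, 'sumSquares_iteration' would be expected to show up as a
child of 'sumSquares' in the call-graph profile.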


##
# Compiling your code
##

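Refer to 'Makefile_example'. Since the snippets above are guarded by
'#ifdef RUNTIME_PROFILER', that symbol must be defined at compile time for
them to take effect.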


##
# Analyzing the profile
##

Run the script 'analyzeProfile.py'. This script takes two parameters:
	- the base name of the output files ('profiler_' by default)
	- the number of threads to look at. One file is produced per thread and the
		analyzer will analyze each thread separately and also generate a summary
		over all threads. 
Run the script with the '-h' option for more information.

The analyzer will output its analysis for each thread as well as for all
threads combined. Each analysis is very similar to the output produced by
gprof. 

It will first output a flat profile of all the blocks of code
profiled. For each block of code it will output (in this order):
	- the percentage of total overhead time spent in this block of code (not
		including its children). The flat profile is ordered by this
		percentage.
	- the total time spent in this block of code and children
	- the total time spent in this block of code (not including its children).
		This is the time used to compute the percentage in the first column.
	- the number of calls to this block of code
	- the average time spent per call (including children)
	- the standard deviation on this average
	- the name of the block of code (set by 'START_PROFILE')
	- an index in [] used in the call-graph profile
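
How the time columns relate to each other can be illustrated with a small
worked computation. This is only a sketch: 'FlatRow' and 'summarize' are
hypothetical names, the per-call timings are invented inputs, and the use of
the population standard deviation is an assumption, since the analyzer's exact
formula is not described here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical row type mirroring the time and call columns of the flat profile.
struct FlatRow {
	double totalTime;   // time in the block and its children
	double selfTime;    // time in the block, children excluded
	std::size_t calls;  // number of calls
	double average;     // mean time per call, children included
	double stddev;      // population standard deviation of the per-call times (assumed)
};

// inclusive[i]: duration of call i including children.
// children[i]:  part of call i that was spent in children.
FlatRow summarize(const std::vector<double> &inclusive,
                  const std::vector<double> &children) {
	assert(!inclusive.empty() && inclusive.size() == children.size());
	FlatRow row = {0.0, 0.0, inclusive.size(), 0.0, 0.0};
	double childTotal = 0.0;
	for (std::size_t i = 0; i < inclusive.size(); ++i) {
		row.totalTime += inclusive[i];
		childTotal += children[i];
	}
	row.selfTime = row.totalTime - childTotal;
	row.average = row.totalTime / row.calls;
	double squares = 0.0;
	for (std::size_t i = 0; i < inclusive.size(); ++i) {
		double diff = inclusive[i] - row.average;
		squares += diff * diff;
	}
	row.stddev = std::sqrt(squares / row.calls);
	return row;
}
```

For two calls lasting 2.0 and 4.0 (children included), of which 0.5 and 1.5
were spent in children, this gives a total time of 6.0, a self time of 4.0, an
average of 3.0 per call and a standard deviation of 1.0.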

It will also output a call-graph profile which gives more detail about
how each block of code is called. This profile is ordered by the 
percentage of total overhead time spent in the block of code (INCLUDING
its children).

To clarify the notation, we will use the following example output:
                      62.128203        0.001531      32870326/54696727      shared_Crefwiththread [1]
                      41.737103        0.003633      21826400/54696727      shared_refwiththread [2]
[  3]     13.63      103.865350        0.005532        54696727             th_getthreadandway [3]
                       0.005532        0.000000            31/31            th_getthreadandwayPRIV [25]

This is the third entry as indicated by the '[3]' in the first column.
There are three parts:
	- the lines *above* the '[3]' line describe the parents
	- the '[3]' line describes the current entry
	- the lines *below* the '[3]' line describe the descendants

For the lines describing the parents:
	- the first column is the amount of time spent in '[3]' (excluding
		its children) that comes from '[3]' being called by this parent
	- the second column is the amount of time spent in the children
		of '[3]' that comes from '[3]' being called by this parent
	- the third column is the number of times '[3]' was called from
		this parent
	- the fourth column is the name of the parent and its index

For the line describing the entry:
	- the first column is its index
	- the second column is the percentage of total overhead time spent in
		this entry (including its children)
	- the third column is the total time spent inside this entry (excluding
		its children)
	- the fourth column is the total time spent inside its
		children
	- the fifth column is the total number of calls
	- the sixth column is the entry's name and index

For the lines describing the children:
	- the first column is the total time spent in this child when
		it is called from '[3]'
	- the second column is the total time spent in this child's children
		when it is called from '[3]'
	- the third column is the number of times this child is called
		from '[3]'
	- the fourth column is the name of the child and its index

An example of profiler files and an analysis of them is
provided in 'example_output'.
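
For post-processing, the entry line of a call-graph block (the '[  3]' line in
the example output above) can be split into its six columns mechanically. The
sketch below is not part of the analyzer: 'GraphEntry' and 'parseEntryLine' are
hypothetical names, and the whitespace-separated layout is inferred from that
single example line only.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical record holding the six columns of an entry line.
struct GraphEntry {
	int index;          // index in the first column
	double percent;     // percentage of time, children included
	double selfTime;    // time inside the entry, children excluded
	double childTime;   // time inside the entry's children
	long long calls;    // total number of calls
	std::string name;   // name of the block of code
};

// Returns true and fills 'out' if 'line' looks like an entry line;
// parent and child lines do not start with '[' and are rejected.
bool parseEntryLine(const char *line, GraphEntry *out) {
	assert(line != NULL && out != NULL);
	char name[128];
	int trailingIndex = -1;
	int fields = std::sscanf(line, " [ %d ] %lf %lf %lf %lld %127s [ %d ]",
	                         &out->index, &out->percent, &out->selfTime,
	                         &out->childTime, &out->calls, name, &trailingIndex);
	if (fields != 7 || trailingIndex != out->index) {
		return false;
	}
	out->name = name;
	return true;
}
```

Applied to the '[  3]' line above, this yields index 3, 54696727 calls and the
name 'th_getthreadandway'; the two parent lines and the child line are rejected.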


##
# Questions/Comments
##

Please email me at 'dinko05@users.sourceforge.net' if you have any questions or comments.
Put '[SF-Programs]' in the subject line.
