Overhead Profiler Code
Brought to you by: dinko05
File | Date | Commit |
---|---|---|
example_output | 2011-08-12 | [207758] Initial commit |
LICENSE | 2011-08-12 | [207758] Initial commit |
Makefile_example | 2011-08-12 | [207758] Initial commit |
README | 2011-08-13 | [8b867a] Updated email address |
analyzeProfile.py | 2011-08-12 | [207758] Initial commit |
generateProfilerFile.py | 2011-08-12 | [207758] Initial commit |
profiler.cpp | 2011-08-12 | [207758] Initial commit |
profiler.h | 2011-08-12 | [207758] Initial commit |
## # License ##
This software is released under the BSD license. See LICENSE for more information.

## # Supporting the profiler ##
To support the profiler, each thread must have a '_profilerData' object associated with it. This is done through the use of a pthread-specific key, '_profilerThreadData'. You therefore need to:

- Define this key in your main program and initialize it. This is accomplished with the following code:

%%%%%
/* Declare this in main.cpp (or elsewhere in your program); needs <pthread.h> */
#ifdef RUNTIME_PROFILER
namespace rec {
namespace profiler {
  /* Called when a thread exits; the destructor of _profilerData writes the output file */
  void _destroyProfilerData(void *d) { delete (rec::profiler::_profilerData*)d; }
  pthread_key_t _profilerThreadData;
}
}
#endif

/* In the main function (or during initialization) */
#ifdef RUNTIME_PROFILER
pthread_key_create(&rec::profiler::_profilerThreadData, rec::profiler::_destroyProfilerData);
#endif
%%%%%

  Note that '_destroyProfilerData' is very important, because the output file is written in the destructor of '_profilerData'. If you do not see anything in the output files, it is likely that this step was missed.

- Associate a '_profilerData' object with each thread. This is accomplished by adding the following to your thread's initialization function:

%%%%%
/* Needs <pthread.h>, <cstdio> and <cassert> */
#ifdef RUNTIME_PROFILER
rec::profiler::_profilerData *d = new rec::profiler::_profilerData();
char buffer[32]; /* room for "profiler_" plus any 32-bit thread id */
snprintf(buffer, sizeof(buffer), "profiler_%u", <THREAD_ID>);
d->output = fopen(buffer, "w");
assert(d->output != NULL);
pthread_setspecific(rec::profiler::_profilerThreadData, (void*)d);
#endif
%%%%%

  Note that '<THREAD_ID>' should be replaced by a thread identifier. The analyzer relies on the profiler files being named '<something>_<id>'.

## # Instrumenting your code ##
The profiler provides the following macros:

- START_PROFILE(name): causes 'name' to be associated with the block of code defined as the inner-most scope encompassing the macro. Note that 'name' should not contain quotes.
  For example, use 'START_PROFILE(main)' to associate the name 'main' with the code being profiled.
- PAUSE_PROFILE: causes timing measurements to be paused.
- RESUME_PROFILE: resumes timing measurements paused by 'PAUSE_PROFILE'. It should always be paired with a 'PAUSE_PROFILE'.

## # Compiling your code ##
Refer to 'Makefile_example'.

## # Analyzing the profile ##
Run the script 'analyzeProfile.py'. This script takes two parameters:

- the basename of the output files (set to 'profiler_' by default)
- the number of threads to look at. One file is produced per thread; the analyzer will analyze each thread separately and also generate a summary over all threads.

Run the script with the '-h' option for more information.

The analyzer will output its analysis for each thread as well as for all threads combined. Each analysis is very similar to the output produced by gprof.

It will first output a flat profile of all the blocks of code profiled. For each block of code it will output (in this order):

- The percentage of total overhead time spent in this block of code (not including its children). The flat profile is ordered by this percentage.
- The total time spent in this block of code and its children
- The total time spent in this block of code (not including its children). This is the time used to compute the percentage in the first column.
- The number of calls to this block of code
- The average time spent per call (including children)
- The standard deviation around this average
- The name of the block of code (set by 'START_PROFILE')
- An index in [] used in the call-graph profile

It will also output a call-graph profile, which gives more detail about how each block of code is called. This profile is ordered by the percentage of total overhead time spent in the block of code (INCLUDING its children).
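As a rough illustration of how the flat-profile columns described above relate to one another, here is a minimal Python sketch. It is not taken from 'analyzeProfile.py': the function name 'flat_profile', the in-memory input format, the use of the sum of self times as the "total overhead time", and the choice of a population standard deviation are all assumptions made for the example.

```python
import math


def flat_profile(blocks):
    """Compute flat-profile style columns from hypothetical timing data.

    `blocks` maps a block name to a pair:
      (list of per-call times including children, total time spent in children)
    This input layout is an assumption, not the profiler's real file format.
    """
    rows = []
    for name, (call_times, children_time) in blocks.items():
        calls = len(call_times)
        total = sum(call_times)              # time in the block and its children
        self_time = total - children_time    # time in the block only
        mean = total / calls if calls else 0.0
        # Population standard deviation of the per-call times (assumed definition).
        variance = (
            sum((t - mean) ** 2 for t in call_times) / calls if calls else 0.0
        )
        rows.append({
            "name": name,
            "total": total,
            "self": self_time,
            "calls": calls,
            "mean": mean,
            "stddev": math.sqrt(variance),
        })

    # Assumption: the total overhead time is the sum of all self times.
    overall = sum(row["self"] for row in rows)
    for row in rows:
        row["percent"] = 100.0 * row["self"] / overall if overall else 0.0

    # The flat profile is ordered by the self-time percentage, largest first.
    rows.sort(key=lambda row: row["percent"], reverse=True)
    return rows


# Made-up numbers: 'main' is called once and spends 8.0 of its 10.0 in 'work'.
example = {
    "main": ([10.0], 8.0),
    "work": ([3.0, 5.0], 0.0),
}

for row in flat_profile(example):
    print("%6.2f %10.6f %10.6f %6d %10.6f %10.6f  %s" % (
        row["percent"], row["total"], row["self"], row["calls"],
        row["mean"], row["stddev"], row["name"]))
```

The percentage is computed from the self time only, which is why a block such as 'main' in this made-up data can have the largest total time and still appear below its child in the ordering.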
To clarify the notation, we will use the following example output:

                    62.128203    0.001531   32870326/54696727   shared_Crefwiththread [1]
                    41.737103    0.003633   21826400/54696727   shared_refwiththread [2]
    [ 3]   13.63   103.865350    0.005532   54696727            th_getthreadandway [3]
                     0.005532    0.000000   31/31               th_getthreadandwayPRIV [25]

This is the third entry, as indicated by the '[3]' in the first column. There are three parts:

- the lines *above* the '[3]' line describe the parents
- the '[3]' line describes the current entry
- the lines *below* the '[3]' line describe the descendants

For the lines describing the parents:

- the first column is the amount of time spent in '[3]' (excluding its children) that comes from '[3]' being called by this parent
- the second column is the amount of time spent in the children of '[3]' that comes from '[3]' being called by this parent
- the third column is the number of times '[3]' was called from this parent, shown over the total number of calls to '[3]'
- the fourth column is the name of the parent and its index

For the line describing the entry:

- the first column is its index
- the second column is the percentage of total overhead time spent in this entry (including its children)
- the third column is the total time spent inside this entry (excluding its children)
- the fourth column is the total time spent inside its children
- the fifth column is the total number of calls
- the sixth column is the entry's name and index

For the lines describing the children:

- the first column is the total time spent in this child when it is called from '[3]'
- the second column is the total time spent in this child's children when it is called from '[3]'
- the third column is the number of times this child is called from '[3]'
- the fourth column is the name of the child and its index

An example of profiler files and an analysis of them is provided in 'example_output'.

## # Questions/Comments ##
Please email me at 'dinko05@users.sourceforge.net' if you have any questions or comments. Put '[SF-Programs]' in the subject line.