The idea behing profiling is to identify so called hot-spots in your program - ie those parts where the program spends most of its time.
When we know where they are, we can rework the code to make it run faster.
There's no point doubling the speed of code that runs 1% of the time because it won't be noticed. On the other hand, doubling the speed of code that runs, say, 20% of the time should save 10% on your program's execution time.
Here, we'll be looking to make p5c run faster. I'm running on linux,and will be using the profile module of valgrind to collect the profiling data, and kcachegrind to examine it. (These are 2 amazing programs.) You can adapt the techniques here to your own setup. Users of other systems might find alternative tools here
This puts the line numbers and symbol names in the compiled program and enables the results to be displayed in terms of program line numbers and function names (or procedure names).
Our prgram under test is p5c so to remove some confusion, we'll create a new version of p5c and call it p5c.t.pas
Replace p5c.t with your program name in what follows.
Put these 2 lines at the top of p5c.t.pas:
{$n+,d- -- force line numbers, omit debug code }
{@@# line 1 "p5c.pas" @@}
This ensures that the intermediate c file knows the line numbers of p5c.pas, and will allow gcc to put these into the generated binary image.
Now build the image:
$P5CDIR/pc -g p5c.t.pas
The -g option tells gcc to put line numbers and other symbolic information into the generated image, p5c.t
You may need to remove old profile data first:
rm -f callgrind.out.*
Now use valgrind to run your program.
It needs a command line of the form valgring options your-command command-line-args
In this example, the options specify the callgrind module.
The command is p5c.t and the arguments are respectively the pascal source file that we will compile, and the gcc intermediate file:
valgrind --tool=callgrind p5c.t iso7185-tests/iso7185pat.pas iso7185-tests/iso7185pat.c
With the bash shell, this can be abbreviated:
valgrind --tool=callgrind p5c.t iso7185-tests/iso7185pat.{pas,c}
Since valgring simulates p5c.t cycle-by-cycle, this could take a while. The payback is we get 100% accurate profiling.
We could run again, compiling different source files and have several sets of results if we wanted to.
Use kcachegrind for this. The results of the previous test or tests are in files called callgrind.out.<some-id></some-id>
kcachegrind callgrind.out.*
This shows the results in a page like
To be continued ...