Hi,
What are the technical details of the different optimization steps for compilation? Is it possible that code runs slower on other x86 CPUs (CPUs of other families/brands)?
Does "perform a number of minor optimizations" also cause problems with other CPUs?
What are:
- dependency generation?
- code profiling (for analysis)?
What does "link an Objective-C program" do when it's enabled? And "use heuristics to compile faster"? And "turn off all access checking"?
Sorry if this is a bit too much, but I'm really curious about these options, because I know they might speed up my program (or make it crash).
Basically, viewed from a command-line perspective, optimization ranges from
-O0 = no optimization
to
-O3 = full optimization
Here is the pertinent portion of the gcc man page:
-O1  Optimize. Optimizing compilation takes somewhat more time, and a lot more memory for a large function.

     Without -O, the compiler's goal is to reduce the cost of compilation and to make debugging produce the expected results. Statements are independent: if you stop the program with a breakpoint between statements, you can then assign a new value to any variable or change the program counter to any other statement in the function and get exactly the results you would expect from the source code.

     Without -O, the compiler only allocates variables declared "register" in registers. The resulting compiled code is a little worse than produced by PCC without -O.

     With -O, the compiler tries to reduce code size and execution time. When you specify -O, the compiler turns on -fthread-jumps and -fdefer-pop on all machines. The compiler turns on -fdelayed-branch on machines that have delay slots, and -fomit-frame-pointer on machines that can support debugging even without a frame pointer. On some machines the compiler also turns on other flags.

-O2  Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. The compiler does not perform loop unrolling or function inlining when you specify -O2. As compared to -O, this option increases both compilation time and the performance of the generated code.

     -O2 turns on all optional optimizations except for loop unrolling, function inlining, and register renaming. It also turns on the -fforce-mem option on all machines and frame pointer elimination on machines where doing so does not interfere with debugging.

     Please note the warning under -fgcse about invoking -O2 on programs that use computed gotos.

-O3  Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions and -frename-registers options.

-O0  Do not optimize.
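To make the levels concrete, here is a minimal sketch of how they might be invoked from the command line (the program name is made up for illustration):

    gcc -O0 -g -o prog prog.c    # no optimization; easiest to debug
    gcc -O2 -o prog prog.c       # typical build: most optimizations without big size/speed tradeoffs
    gcc -O3 -o prog prog.c       # adds -finline-functions and -frename-registers on top of -O2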
Dependency generation is the means by which the compiler figures out which source files depend on which headers and other components, and writes that information out in a form that make can use.
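For example (a minimal sketch, with made-up file names), gcc can emit that dependency information itself so make knows to rebuild a file when one of its headers changes:

    gcc -MMD -MP -c foo.c -o foo.o

This also writes foo.d, a makefile fragment listing the headers foo.c includes, which a Makefile can pull in with a line like "-include foo.d".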
In my experience, there are occasions in which going to -g3 hurts performance. Also note that getting the most out of certain processors, like the Pentium 4, requires looking into other options like
-mcpu=pentium4
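This also touches on the question above about other x86 CPUs. With the gcc versions of that era, a rough sketch of the distinction is (program name made up):

    gcc -O2 -mcpu=pentium4 -o prog prog.c     # tunes scheduling for the P4; the binary still runs on other x86 CPUs, possibly slower
    gcc -O2 -march=pentium4 -o prog prog.c    # may use P4-only instructions (e.g. SSE2), so it can crash on older CPUs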
Wayne
You can probably get some better answers than mine by checking out the gcc-help page:
http://gcc.gnu.org/ml/gcc-help/
Wayne
Oops, I meant -O3, not -g3, in my comment...
Wayne
Also, gcc manuals can be found here:
http://gcc.gnu.org/onlinedocs/
Wayne
Profiling gives you information on how often the various functions get called during execution, so that you can see which ones are worth optimizing.
Tools used for profiling are prof or gprof, but there are others (commercial ones), like one from Rational Rose (I don't remember the name).
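As a minimal sketch of the usual gprof workflow (file names made up):

    gcc -pg -O2 -o prog prog.c         # build with profiling instrumentation
    ./prog                             # running the program writes gmon.out
    gprof prog gmon.out > profile.txt  # flat profile plus call graph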
As far as I can see, "optimisation" is more or less a generic term. The compiler will try to optimize; whether it actually succeeds is another matter. The best optimisation for a specific platform can usually be achieved by using platform-specific compilers (like the Intel one).
By the way: Don't forget -Os (optimize for size) :)
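For example (same made-up program name as above):

    gcc -Os -o prog prog.c    # roughly -O2, minus optimizations that tend to grow code size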
upcase