[Compbench-web-devel] compbenchmarks-web/tmp/doc compbenchmarks-bin-compbenchmarks-core-B.raw, NONE

Update of /cvsroot/compbench/compbenchmarks-web/tmp/doc
In directory sc8-pr-cvs4.sourceforge.net:/tmp/cvs-serv10458

Added Files:
	compbenchmarks-bin-compbenchmarks-core-B.raw 
	compbenchmarks-bin-compbenchmarks-core-h.raw 
	compbenchmarks-bin-compbenchmarks-core-mI.raw 
	compbenchmarks-bin-compbenchmarks-core-mU.raw 
	compbenchmarks-bin-compbenchmarks-core-qacA.raw 
	compbenchmarks-bin-compbenchmarks-core-qcgcc.raw 
	compbenchmarks-bin-compbenchmarks-core-qH.raw 
	compbenchmarks-bin-compbenchmarks-core-qip.raw 
	compbenchmarks-bin-compbenchmarks-plan-tut.raw 
	compbenchmarks-bin-compbenchmarks-plan-tut-run.raw 
	compbenchmarks-maint-compilers-summary.html 
	compbenchmarks-maint-compilers-zoom-g++-2.95.x.html 
	compbenchmarks-maint-compilers-zoom-g++-3.0.x.html 
	compbenchmarks-maint-compilers-zoom-g++-3.1.x.html 
	compbenchmarks-maint-compilers-zoom-gcc-2.95.x.html 
	compbenchmarks-maint-compilers-zoom-gcc-3.0.x.html 
	compbenchmarks-maint-compilers-zoom-gcc-3.1.x.html 
	compbenchmarks-maint-compilers-zoom-tcc-0.9.x.html 
Log Message:
For SF's TID #141819 : improving documentation.

--- NEW FILE: compbenchmarks-bin-compbenchmarks-core-B.raw ---
# <b>cd compbenchmarks-core && ./compbenchmarks-core -B linpackc-dp-roll gcc '-O3'</b><br>Benchmark&nbsp;:&nbsp;<br>
&nbsp;Configuring&nbsp;linpackc&nbsp;with&nbsp;compiler&nbsp;gcc&nbsp;and&nbsp;options&nbsp;-O3&nbsp;:&nbsp;OK<br>
&nbsp;Building&nbsp;linpackc&nbsp;:&nbsp;<br>
&nbsp;&nbsp;Build&nbsp;time&nbsp;:&nbsp;5.059946<br>
&nbsp;&nbsp;OK<br>
&nbsp;Benchmarking&nbsp;result(s)&nbsp;:&nbsp;<br>
&nbsp;&nbsp;Value=858333<br>
&nbsp;&nbsp;buildTime=5.059946<br>
&nbsp;&nbsp;execTime=0.027581<br>
&nbsp;&nbsp;Testsuite=0<br>
&nbsp;&nbsp;Tested=1<br>
&nbsp;&nbsp;OK<br>
&nbsp;OK<br>
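
The `Key=Value` lines in the transcript above can be pulled out of a `.raw` file with a few lines of Python. This is only a sketch: the function name `parse_results` is hypothetical, it is not part of compbenchmarks, and it assumes the results are plain non-negative numbers laid out as in the sample.

```python
import html
import re


def parse_results(raw: str) -> dict:
    """Extract Key=Value result lines from a .raw transcript (hypothetical helper)."""
    # Turn <br> into newlines, decode entities such as &nbsp;, then
    # normalise non-breaking spaces to plain spaces.
    text = html.unescape(raw.replace("<br>", "\n")).replace("\xa0", " ")
    results = {}
    for line in text.splitlines():
        # Only lines of the form "  Name=123.45" are treated as results.
        match = re.fullmatch(r"\s*(\w+)=([0-9]+(?:\.[0-9]+)?)\s*", line)
        if match:
            results[match.group(1)] = float(match.group(2))
    return results
```

Lines such as `Build time : 5.059946` are ignored on purpose, since the same figure is repeated as `buildTime=5.059946`.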

--- NEW FILE: compbenchmarks-maint-compilers-zoom-gcc-2.95.x.html ---
<p><a href='/cgi-bin/doc.cgi?tab=compilers'>Back to compiler list</a>.</p>
<h2>gcc 2.95.x</h2>
<table summary='gcc-2.95.x options' class='supcomp'>
<tr class='l1'><td colspan='2' class='supcomp_head'>Globally disable compiler optimization</td></tr><tr class='l1'><td class='supcomp_opt'>-O0</td><td class='supcomp_detail'>Do not optimize. This is the default</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Globally optimize for size</td></tr><tr class='l2'><td class='supcomp_opt'>-Os</td><td class='supcomp_detail'>Optimize for size.  -Os enables all -O2 optimizations that do not typically
   increase code size.  It also performs further optimizations designed to
   reduce code size.

   -Os disables the following optimization flags: -falign-functions
   -falign-jumps  -falign-loops -falign-labels  -freorder-blocks
   -fprefetch-loop-arrays

   If you use multiple -O options, with or without level numbers, the last
   such option is the one that is effective.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Global optimization, level 1</td></tr><tr class='l1'><td class='supcomp_opt'>-O1</td><td class='supcomp_detail'>Optimize.  Optimizing compilation takes somewhat more time,  and
   a lot more memory for a large function.

   Without  `-O', the compiler's goal is to reduce the cost of
   compilation and to make debugging  produce  the  expected  results.
   Statements  are  independent:  if  you  stop  the program with a
   breakpoint between statements, you can then assign a  new  value
   to  any  variable  or  change  the  program counter to any other
   statement in the function and get exactly the results you  would
   expect from the source code.

   Without  `-O', only variables declared register are allocated in
   registers.  The resulting compiled code is a little  worse  than
   produced by PCC without `-O'.

   With  `-O', the compiler tries to reduce code size and execution
   time.

   When you specify `-O',  the  two  options  `-fthread-jumps'  and
   `-fdefer-pop' are turned on.  On machines that have delay slots,
   the `-fdelayed-branch' option is turned on.  For those  machines
   that  can  support  debugging  even without a frame pointer, the
   `-fomit-frame-pointer' option is turned on.   On  some  machines
   other flags may also be turned on.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Global optimization, level 2</td></tr><tr class='l2'><td class='supcomp_opt'>-O2</td><td class='supcomp_detail'>Optimize even more than -O1. Nearly all supported optimizations that
   do not involve a space-speed tradeoff are performed.  Loop unrolling
   and function inlining are not done, for example. As compared to -O, this
   option increases both compilation time and the performance of the
   generated code.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Global optimization, level 3</td></tr><tr class='l1'><td class='supcomp_opt'>-O3</td><td class='supcomp_detail'>Optimize  yet  more.  This turns on everything -O2 does, along with also
   turning on -finline-functions.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Always scan through jump instructions in common subexpression elimination</td></tr><tr class='l2'><td class='supcomp_opt'>-fcse-follow-jumps</td><td class='supcomp_detail'>In common subexpression elimination, scan through jump  instructions
   when  the  target of the jump is not reached by any other
   path.  For example, when CSE encounters an if statement with  an else
   clause, CSE will follow the jump when the condition tested
   is false.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Always scan through conditional jump instructions in common subexpression elimination</td></tr><tr class='l1'><td class='supcomp_opt'>-fcse-skip-blocks</td><td class='supcomp_detail'>This is similar to `-fcse-follow-jumps', but causes CSE to  follow
   jumps  which  conditionally skip over blocks.  When CSE encounters
   a  simple  if   statement   with   no   else   clause,
   `-fcse-skip-blocks'  causes  CSE  to  follow the jump around the
   body of the if.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Re-run CSE after loop optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-frerun-cse-after-loop</td><td class='supcomp_detail'>Re-run common subexpression elimination after loop optimizations
   have been performed.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Integrate all simple functions into their callers</td></tr><tr class='l1'><td class='supcomp_opt'>-finline-functions</td><td class='supcomp_detail'>Integrate all simple functions into their callers.  The compiler
   heuristically decides which functions are simple  enough  to  be
   worth integrating in this way.

   If  all  calls to a given function are integrated, and the function
   is declared static, then GCC normally does not  output  the
   function as assembler code in its own right.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Loop optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-fstrength-reduce</td><td class='supcomp_detail'>Perform  the optimizations of loop strength reduction and
   elimination of iteration variables.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Jump shortcuts' detection</td></tr><tr class='l1'><td class='supcomp_opt'>-fthread-jumps</td><td class='supcomp_detail'>Perform optimizations where we check to see if a  jump  branches
   to  a location where another comparison subsumed by the first is
   found.  If so, the first branch is redirected to either the
   destination  of  the second branch or a point immediately following
   it, depending on whether the condition is known to  be  true  or
   false.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Global loop unrolling optimization</td></tr><tr class='l2'><td class='supcomp_opt'>-funroll-all-loops</td><td class='supcomp_detail'>Perform  the  optimization  of loop unrolling.  This is done for
   all loops.  This usually makes programs run more slowly.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Loop unrolling optimizations</td></tr><tr class='l1'><td class='supcomp_opt'>-funroll-loops</td><td class='supcomp_detail'>Perform  the  optimization of loop unrolling.  This is only done
   for loops whose number of iterations can be determined  at  compile
   time or run time.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Try allocating registers that will be clobbered by function calls</td></tr><tr class='l2'><td class='supcomp_opt'>-fcaller-saves</td><td class='supcomp_detail'>Enable  values  to  be allocated in registers that will be
   clobbered by function calls, by emitting extra instructions to  save
   and restore the registers around such calls.  Such allocation is
   done only when it seems to result in better code than would otherwise
   be produced.

   This  option  is enabled by default on certain machines, usually
   those which have no call-preserved registers to use instead.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Attempt to reorder instructions to increase slot utilisation</td></tr><tr class='l1'><td class='supcomp_opt'>-fdelayed-branch</td><td class='supcomp_detail'>If supported for the target machine, attempt to reorder instructions
   to  exploit  instruction  slots  available  after  delayed
   branch instructions.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Perform a number of minor optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-fexpensive-optimizations</td><td class='supcomp_detail'>Perform a number of minor optimizations that are relatively expensive.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Violate ANSI or IEEE rules/specifications for optimizing code</td></tr><tr class='l1'><td class='supcomp_opt'>-ffast-math</td><td class='supcomp_detail'>This option allows GCC to violate some ANSI or IEEE rules/specifications
   in the interest of optimizing code for speed.  For example, it allows
   the compiler to assume arguments  to  the  sqrt
   function are non-negative numbers.

   This  option  should never be turned on by any `-O' option since
   it can result in incorrect output for programs which  depend  on
   an exact implementation of IEEE or ANSI rules/specifications for
   math functions.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Do not store floating point variables in registers</td></tr><tr class='l2'><td class='supcomp_opt'>-ffloat-store</td><td class='supcomp_detail'>Do  not  store floating point variables in registers.  This prevents
    undesirable excess precision on machines such as the 68000
    where  the floating registers (of the 68881) keep more precision
    than a double is supposed to have.

    For most programs, the excess precision does only  good,  but  a
    few  programs  rely  on  the precise definition of IEEE floating
    point.  Use `-ffloat-store' for such programs.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Force memory address constants to be copied in registers</td></tr><tr class='l1'><td class='supcomp_opt'>-fforce-addr</td><td class='supcomp_detail'>Force memory address constants to be copied into  registers  before
   doing  arithmetic  on  them.  This may produce better code
   just as `-fforce-mem' may.  I am interested in hearing about the
   difference this makes.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Force memory operands to be copied into registers</td></tr><tr class='l2'><td class='supcomp_opt'>-fforce-mem</td><td class='supcomp_detail'>Force  memory  operands to be copied into registers before doing
   arithmetic on them.  This may produce better code by making  all
   memory  references  potential  common subexpressions.  When they
   are not common subexpressions,  instruction  combination  should
   eliminate  the separate register-load.  I am interested in hearing
   about the difference this makes.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Remove frame pointer from register when useless</td></tr><tr class='l1'><td class='supcomp_opt'>-fomit-frame-pointer</td><td class='supcomp_detail'>Don't  keep  the  frame pointer in a register for functions that
   don't need one.  This avoids the instructions to  save,  set  up
   and  restore  frame  pointers;  it  also makes an extra register
   available in many functions.  It also makes debugging impossible
   on most machines.

   On  some machines, such as the Vax, this flag has no effect, because
   the standard calling sequence  automatically  handles  the
   frame  pointer and nothing is saved by pretending it doesn't exist.
   The machine-description macro FRAME_POINTER_REQUIRED  controls
   whether a target machine supports this flag.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Attempt to reorder instructions to eliminate stalls</td></tr><tr class='l2'><td class='supcomp_opt'>-fschedule-insns</td><td class='supcomp_detail'>If supported for the target machine, attempt to reorder instructions
   to eliminate execution stalls due to required  data  being
   unavailable.   This helps machines that have slow floating point
   or memory load instructions by allowing other instructions to be
   issued  until  the result of the load or floating point instruction
   is required.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Attempt to reorder instructions to eliminate stalls (also watch for registers)</td></tr><tr class='l1'><td class='supcomp_opt'>-fschedule-insns2</td><td class='supcomp_detail'>Similar to `-fschedule-insns', but requests an  additional  pass
   of  instruction  scheduling  after  register allocation has been
   done.  This is especially useful on machines with  a  relatively
   small  number  of  registers  and where memory load instructions
   take more than one cycle.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Generate i486 optimized code</td></tr><tr class='l2'><td class='supcomp_opt'>-m486</td><td class='supcomp_detail'>Control whether or not code is optimized for a 486 instead of an
   386.  Code generated for a 486 will run on a 386 and vice versa.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Do not use FPU to return values</td></tr><tr class='l1'><td class='supcomp_opt'>-mno-fp-ret-in-387</td><td class='supcomp_detail'>Do not use the FPU registers for return values of functions.

   The  usual  calling  convention  has  functions return values of
   types float and double in an FPU register, even if there  is  no
   FPU.   The  idea  is that the operating system should emulate an
   FPU.

   The option `-mno-fp-ret-in-387' causes such  values  to  be
   returned in ordinary CPU registers instead.</td></tr>
</table>
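
The `-Os` entry in the table above states that when several `-O` options are given, the last one is the one that takes effect, and the `-O0` entry says that no optimization is the default. A minimal Python sketch of that rule follows; the function name `effective_o_option` is hypothetical and the code only models the rule as described, it does not query any compiler.

```python
import re


def effective_o_option(flags):
    """Return the -O option that wins on a command line (hypothetical helper)."""
    # Keep only global optimization options: -O, -O0 .. -O3 and -Os.
    o_options = [f for f in flags if re.fullmatch(r"-O[0-3s]?", f)]
    # The last -O option is the effective one; with none given,
    # the table lists -O0 (no optimization) as the default.
    return o_options[-1] if o_options else "-O0"
```

For example, `effective_o_option(["-O3", "-funroll-loops", "-Os"])` picks `-Os`, because it appears after `-O3`.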
--- NEW FILE: compbenchmarks-bin-compbenchmarks-core-qip.raw ---
# <b>cd compbenchmarks-core && ./compbenchmarks-core -qip</b><br>gzip&nbsp;1.2.4<br>
bzip2&nbsp;1.0.3<br>
nbench&nbsp;2.2.2<br>
scimark2&nbsp;2.0<br>
linpackc&nbsp;0.1.1<br>
benchpplinux&nbsp;1.1v5<br>

--- NEW FILE: compbenchmarks-bin-compbenchmarks-plan-tut-run.raw ---
<b>./compbenchmarks-plan --plan-use tut --run</b><br>
Restoring plan  :<br>
 Registering benchmark linpackc-dp-roll : OK<br>
 Registering option set  : OK<br>
 Registering option set opset1 : OK<br>
 Registering compiler gcc (GCC) 4.1.3 20070812 (prerelease) (Debian 4.1.2-15) : OK<br>
 OK<br>
Running benchmark plan  : 0.00%<br>
 Benchmark :<br>
  Cleaning linpackc : OK<br>
  Configuring linpackc with compiler gcc and options -O2 : OK<br>
  Building linpackc :<br>
   Build time : 1.962626<br>
   OK<br>
  Benchmarking result(s) :<br>
   Value=858226<br>
   buildTime=1.962626<br>
   execTime=0.028892<br>
   Testsuite=0<br>
   Tested=1<br>
   OK<br>
  OK<br>
 Registering plan  : OK<br>
 Registering plan  : 100.00%  - OK<br>
Results and settings (*.xml) in directory TUT-benchmark-results .<br>
You can now launch compbenchmarks-plan with --store <filename.tar.gz> to keep results.<br>
--- NEW FILE: compbenchmarks-bin-compbenchmarks-plan-tut.raw ---
<b>cd compbenchmarks-plan</b><br>
./compbenchmarks-plan --plan-register tut --output-directory TUT-benchmark-results --run-number 3<br>
./compbenchmarks-plan --plan-use tut --batch-register b1<br>
./compbenchmarks-plan --plan-use tut --batch-use b1 --compiler-register gcc<br>
./compbenchmarks-plan --plan-use tut --batch-use b1 --benchmark-register linpackc-dp-roll<br>
./compbenchmarks-plan --plan-use tut --batch-use b1 --optionset-register opset1<br>
./compbenchmarks-plan --plan-use tut --batch-use b1 --optionset-use opset1 --options-register oplist1<br>
./compbenchmarks-plan --plan-use tut --batch-use b1 --optionset-use opset1 --options-use oplist1 --option-register '-O2'<br>
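
Each step of the tutorial above repeats the selectors of the previous step (`--plan-use`, `--batch-use`, `--optionset-use`, `--options-use`) and registers one more item. The Python sketch below rebuilds the same seven command lines from their parts to make that nesting visible. The function name `tutorial_commands` and its parameters are hypothetical; only the flags and values shown in the transcript are used, and nothing is executed.

```python
def tutorial_commands(plan="tut", batch="b1", compiler="gcc",
                      benchmark="linpackc-dp-roll", optionset="opset1",
                      options="oplist1", flag="-O2",
                      outdir="TUT-benchmark-results", runs=3):
    """Build the tutorial's argument lists (hypothetical helper, runs nothing)."""
    exe = "./compbenchmarks-plan"
    # Each selector prefix extends the previous one by one level.
    use_plan = [exe, "--plan-use", plan]
    use_batch = use_plan + ["--batch-use", batch]
    use_optionset = use_batch + ["--optionset-use", optionset]
    use_options = use_optionset + ["--options-use", options]
    return [
        [exe, "--plan-register", plan,
         "--output-directory", outdir, "--run-number", str(runs)],
        use_plan + ["--batch-register", batch],
        use_batch + ["--compiler-register", compiler],
        use_batch + ["--benchmark-register", benchmark],
        use_batch + ["--optionset-register", optionset],
        use_optionset + ["--options-register", options],
        use_options + ["--option-register", flag],
    ]
```

Joining each list with spaces gives back the lines of the tutorial, apart from the shell quoting around `-O2`.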

--- NEW FILE: compbenchmarks-bin-compbenchmarks-core-qacA.raw ---
# <b>cd compbenchmarks-core && ./compbenchmarks-core -qa -c gcc -A '-O3 -finline-functions'</b><br>Compiler&nbsp;name&nbsp;:&nbsp;gcc&nbsp;(GCC)&nbsp;4.1.3&nbsp;20070812&nbsp;(prerelease)&nbsp;(Debian&nbsp;4.1.2-15)<br>
Version&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;4.1.3&nbsp;20070812&nbsp;(prerelease)<br>
Language&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;C<br>
Binary&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;gcc<br>
<br>
Descriptions&nbsp;from&nbsp;gcc,&nbsp;branch&nbsp;3.1.x<br>
Option(s)&nbsp;analyzed&nbsp;:&nbsp;-O3&nbsp;-finline-functions<br>
<br>
Option&nbsp;short&nbsp;descriptions&nbsp;:<br>
-O3&nbsp;:&nbsp;Global&nbsp;optimization,&nbsp;level&nbsp;3<br>
-finline-functions&nbsp;:&nbsp;Integrate&nbsp;all&nbsp;simple&nbsp;functions&nbsp;into&nbsp;their&nbsp;callers<br>
<br>
Analyze&nbsp;:&nbsp;<br>
&nbsp;*&nbsp;Option&nbsp;-finline-functions&nbsp;is&nbsp;implied&nbsp;by&nbsp;-O3<br>
<br>
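
The analysis above reports that `-finline-functions` is redundant next to `-O3`. A check of that kind can be sketched in Python as a lookup in an implication table. Both the table `IMPLIED_BY` and the function `redundant_options` are hypothetical names, and the table holds only the single implication stated in this documentation; how compbenchmarks-core stores its rules is not shown here.

```python
# Implications taken from the option descriptions: -O3 turns on
# -finline-functions. The table is illustrative, not exhaustive.
IMPLIED_BY = {
    "-O3": {"-finline-functions"},
}


def redundant_options(options: str):
    """List (option, implying_option) pairs found in an option string."""
    opts = options.split()
    found = []
    for opt in opts:
        for other in opts:
            # An option is redundant if another given option implies it.
            if opt in IMPLIED_BY.get(other, set()):
                found.append((opt, other))
    return found
```

Called with `"-O3 -finline-functions"`, the function returns one pair, matching the single line of the "Analyze" section.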

--- NEW FILE: compbenchmarks-maint-compilers-zoom-g++-3.0.x.html ---
<p><a href='/cgi-bin/doc.cgi?tab=compilers'>Back to compiler list</a>.</p>
<h2>g++ 3.0.x</h2>
<table summary='g++-3.0.x options' class='supcomp'>
<tr class='l1'><td colspan='2' class='supcomp_head'>Globally disable compiler optimization</td></tr><tr class='l1'><td class='supcomp_opt'>-O0</td><td class='supcomp_detail'>Do not optimize. This is the default</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Globally optimize for size</td></tr><tr class='l2'><td class='supcomp_opt'>-Os</td><td class='supcomp_detail'>Optimize for size.  -Os enables all -O2 optimizations that do not typically
   increase code size.  It also performs further optimizations designed to
   reduce code size.

   -Os disables the following optimization flags: -falign-functions
   -falign-jumps  -falign-loops -falign-labels  -freorder-blocks
   -fprefetch-loop-arrays

   If you use multiple -O options, with or without level numbers, the last
   such option is the one that is effective.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Global optimization, level 1</td></tr><tr class='l1'><td class='supcomp_opt'>-O1</td><td class='supcomp_detail'>Optimize.  Optimizing compilation takes somewhat more time,  and
   a lot more memory for a large function.

   Without  `-O', the compiler's goal is to reduce the cost of
   compilation and to make debugging  produce  the  expected  results.
   Statements  are  independent:  if  you  stop  the program with a
   breakpoint between statements, you can then assign a  new  value
   to  any  variable  or  change  the  program counter to any other
   statement in the function and get exactly the results you  would
   expect from the source code.

   Without  `-O', only variables declared register are allocated in
   registers.  The resulting compiled code is a little  worse  than
   produced by PCC without `-O'.

   With  `-O', the compiler tries to reduce code size and execution
   time.

   When you specify `-O',  the  two  options  `-fthread-jumps'  and
   `-fdefer-pop' are turned on.  On machines that have delay slots,
   the `-fdelayed-branch' option is turned on.  For those  machines
   that  can  support  debugging  even without a frame pointer, the
   `-fomit-frame-pointer' option is turned on.   On  some  machines
   other flags may also be turned on.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Global optimization, level 2</td></tr><tr class='l2'><td class='supcomp_opt'>-O2</td><td class='supcomp_detail'>Optimize even more than -O1. Nearly all supported optimizations that
   do not involve a space-speed tradeoff are performed.  Loop unrolling
   and function inlining are not done, for example. As compared to -O, this
   option increases both compilation time and the performance of the
   generated code.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Global optimization, level 3</td></tr><tr class='l1'><td class='supcomp_opt'>-O3</td><td class='supcomp_detail'>Optimize  yet  more.  This turns on everything -O2 does, along with also
   turning on -finline-functions.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Always scan through jump instructions in common subexpression elimination</td></tr><tr class='l2'><td class='supcomp_opt'>-fcse-follow-jumps</td><td class='supcomp_detail'>In common subexpression elimination, scan through jump  instructions
   when  the  target of the jump is not reached by any other
   path.  For example, when CSE encounters an if statement with  an else
   clause, CSE will follow the jump when the condition tested
   is false.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Always scan through conditional jump instructions in common subexpression elimination</td></tr><tr class='l1'><td class='supcomp_opt'>-fcse-skip-blocks</td><td class='supcomp_detail'>This is similar to `-fcse-follow-jumps', but causes CSE to  follow
   jumps  which  conditionally skip over blocks.  When CSE encounters
   a  simple  if   statement   with   no   else   clause,
   `-fcse-skip-blocks'  causes  CSE  to  follow the jump around the
   body of the if.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Re-run CSE after loop optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-frerun-cse-after-loop</td><td class='supcomp_detail'>Re-run common subexpression elimination after loop optimizations
   have been performed.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Integrate all simple functions into their callers</td></tr><tr class='l1'><td class='supcomp_opt'>-finline-functions</td><td class='supcomp_detail'>Integrate all simple functions into their callers.  The compiler
   heuristically decides which functions are simple  enough  to  be
   worth integrating in this way.

   If  all  calls to a given function are integrated, and the function
   is declared static, then GCC normally does not  output  the
   function as assembler code in its own right.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Loop optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-fstrength-reduce</td><td class='supcomp_detail'>Perform  the optimizations of loop strength reduction and
   elimination of iteration variables.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Jump shortcuts' detection</td></tr><tr class='l1'><td class='supcomp_opt'>-fthread-jumps</td><td class='supcomp_detail'>Perform optimizations where we check to see if a  jump  branches
   to  a location where another comparison subsumed by the first is
   found.  If so, the first branch is redirected to either the
   destination  of  the second branch or a point immediately following
   it, depending on whether the condition is known to  be  true  or
   false.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Global loop unrolling optimization</td></tr><tr class='l2'><td class='supcomp_opt'>-funroll-all-loops</td><td class='supcomp_detail'>Perform  the  optimization  of loop unrolling.  This is done for
   all loops.  This usually makes programs run more slowly.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Loop unrolling optimizations</td></tr><tr class='l1'><td class='supcomp_opt'>-funroll-loops</td><td class='supcomp_detail'>Perform  the  optimization of loop unrolling.  This is only done
   for loops whose number of iterations can be determined  at  compile
   time or run time.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Try allocating registers that will be clobbered by function calls</td></tr><tr class='l2'><td class='supcomp_opt'>-fcaller-saves</td><td class='supcomp_detail'>Enable  values  to  be allocated in registers that will be
   clobbered by function calls, by emitting extra instructions to  save
   and restore the registers around such calls.  Such allocation is
   done only when it seems to result in better code than would otherwise
   be produced.

   This  option  is enabled by default on certain machines, usually
   those which have no call-preserved registers to use instead.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Attempt to reorder instructions to increase slot utilisation</td></tr><tr class='l1'><td class='supcomp_opt'>-fdelayed-branch</td><td class='supcomp_detail'>If supported for the target machine, attempt to reorder instructions
   to  exploit  instruction  slots  available  after  delayed
   branch instructions.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Perform a number of minor optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-fexpensive-optimizations</td><td class='supcomp_detail'>Perform a number of minor optimizations that are relatively expensive.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Violate ANSI or IEEE rules/specifications for optimizing code</td></tr><tr class='l1'><td class='supcomp_opt'>-ffast-math</td><td class='supcomp_detail'>This option allows GCC to violate some ANSI or IEEE rules/specifications
   in the interest of optimizing code for speed.  For example, it allows
   the compiler to assume arguments  to  the  sqrt
   function are non-negative numbers.

   This  option  should never be turned on by any `-O' option since
   it can result in incorrect output for programs which  depend  on
   an exact implementation of IEEE or ANSI rules/specifications for
   math functions.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Do not store floating point variables in registers</td></tr><tr class='l2'><td class='supcomp_opt'>-ffloat-store</td><td class='supcomp_detail'>Do  not  store floating point variables in registers.  This prevents
    undesirable excess precision on machines such as the 68000
    where  the floating registers (of the 68881) keep more precision
    than a double is supposed to have.

    For most programs, the excess precision does only  good,  but  a
    few  programs  rely  on  the precise definition of IEEE floating
    point.  Use `-ffloat-store' for such programs.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Force memory address constants to be copied in registers</td></tr><tr class='l1'><td class='supcomp_opt'>-fforce-addr</td><td class='supcomp_detail'>Force memory address constants to be copied into  registers  before
   doing  arithmetic  on  them.  This may produce better code
   just as `-fforce-mem' may.  I am interested in hearing about the
   difference this makes.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Force memory operands to be copied into registers</td></tr><tr class='l2'><td class='supcomp_opt'>-fforce-mem</td><td class='supcomp_detail'>Force  memory  operands to be copied into registers before doing
   arithmetic on them.  This may produce better code by making  all
   memory  references  potential  common subexpressions.  When they
   are not common subexpressions,  instruction  combination  should
   eliminate  the separate register-load.  I am interested in hearing
   about the difference this makes.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Remove frame pointer from register when useless</td></tr><tr class='l1'><td class='supcomp_opt'>-fomit-frame-pointer</td><td class='supcomp_detail'>Don't  keep  the  frame pointer in a register for functions that
   don't need one.  This avoids the instructions to  save,  set  up
   and  restore  frame  pointers;  it  also makes an extra register
   available in many functions.  It also makes debugging impossible
   on most machines.

   On  some machines, such as the Vax, this flag has no effect, because
   the standard calling sequence  automatically  handles  the
   frame  pointer and nothing is saved by pretending it doesn't exist.
   The machine-description macro FRAME_POINTER_REQUIRED  controls
   whether a target machine supports this flag.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Attempt to reorder instructions to eliminate stalls</td></tr><tr class='l2'><td class='supcomp_opt'>-fschedule-insns</td><td class='supcomp_detail'>If supported for the target machine, attempt to reorder instructions
   to eliminate execution stalls due to required  data  being
   unavailable.   This helps machines that have slow floating point
   or memory load instructions by allowing other instructions to be
   issued  until  the result of the load or floating point instruction
   is required.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Attempt to reorder instructions to eliminate stalls (also watch for registers)</td></tr><tr class='l1'><td class='supcomp_opt'>-fschedule-insns2</td><td class='supcomp_detail'>Similar to `-fschedule-insns', but requests an  additional  pass
   of  instruction  scheduling  after  register allocation has been
   done.  This is especially useful on machines with  a  relatively
   small  number  of  registers  and where memory load instructions
   take more than one cycle.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>486 Optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-mcpu=i486</td><td class='supcomp_detail'>Assume the defaults for the machine type cpu-type when scheduling
   instructions.  The choices for cpu-type are i386, i486, i586, i686,
   pentium, pentiumpro, k6, and athlon

   While picking a specific cpu-type will schedule things appropriately
   for that particular chip, the compiler will not generate any
   code that does not run on the i386 without the -march=cpu-type
   option being used.  i586 is equivalent to pentium and i686 is
   equivalent to pentiumpro.  k6 is the AMD chip as opposed to the
   Intel ones.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Pentium optimizations</td></tr><tr class='l1'><td class='supcomp_opt'>-mcpu=pentium</td><td class='supcomp_detail'>Assume the defaults for the machine type cpu-type when scheduling
   instructions.  The choices for cpu-type are i386, i486, i586, i686,
   pentium, pentiumpro, k6, and athlon

   While picking a specific cpu-type will schedule things appropriately
   for that particular chip, the compiler will not generate any
   code that does not run on the i386 without the -march=cpu-type
   option being used.  i586 is equivalent to pentium and i686 is
   equivalent to pentiumpro.  k6 is the AMD chip as opposed to the
   Intel ones.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Pentium Pro optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-mcpu=pentiumpro</td><td class='supcomp_detail'>Assume the defaults for the machine type cpu-type when scheduling
   instructions.  The choices for cpu-type are i386, i486, i586, i686,
   pentium, pentiumpro, k6, and athlon

   While picking a specific cpu-type will schedule things appropriately
   for that particular chip, the compiler will not generate any
   code that does not run on the i386 without the -march=cpu-type
   option being used.  i586 is equivalent to pentium and i686 is
   equivalent to pentiumpro.  k6 is the AMD chip as opposed to the
   Intel ones.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>AMD K6 optimizations</td></tr><tr class='l1'><td class='supcomp_opt'>-mcpu=k6</td><td class='supcomp_detail'>Assume the defaults for the machine type cpu-type when scheduling
   instructions.  The choices for cpu-type are i386, i486, i586, i686,
   pentium, pentiumpro, k6, and athlon.

   While picking a specific cpu-type will schedule things appropriately
   for that particular chip, the compiler will not generate any
   code that does not run on the i386 without the -march=cpu-type
   option being used.  i586 is equivalent to pentium and i686 is
   equivalent to pentiumpro.  k6 is the AMD chip as opposed to the
   Intel ones.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Perform optimizations in static single assignment form</td></tr><tr class='l2'><td class='supcomp_opt'>-fssa</td><td class='supcomp_detail'>Perform optimizations in static single assignment form.  Each function's
   flow graph is translated into SSA form, optimizations are
   performed, and the flow graph is translated back from SSA form.
   Users should not specify this option, since it is not yet ready for
   production use.
   According to the -fdce description, this is probably an experimental feature.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Dead-code elimination in SSA form</td></tr><tr class='l1'><td class='supcomp_opt'>-fdce</td><td class='supcomp_detail'>Perform dead-code elimination in SSA form.  Requires -fssa.  Like
	-fssa, this is an experimental feature.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Disable start of functions offset alignment</td></tr><tr class='l2'><td class='supcomp_opt'>-falign-functions=1</td><td class='supcomp_detail'>Align the start of functions to the next power-of-two greater than
   n, skipping up to n bytes.  For instance, -falign-functions=32
   aligns functions to the next 32-byte boundary,
   but -falign-functions=24 would align to the next 32-byte boundary
   only if this can be done by skipping 23 bytes or less.

   -fno-align-functions and -falign-functions=1 are equivalent and
   mean that functions will not be aligned.

   Some assemblers only support this flag when n is a power of two; in
   that case, it is rounded up.

   If n is not specified, use a machine-dependent default.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Force start of functions alignment to machine default</td></tr><tr class='l1'><td class='supcomp_opt'>-falign-functions</td><td class='supcomp_detail'>Align the start of functions to the next power-of-two greater than
   n, skipping up to n bytes.  For instance, -falign-functions=32
   aligns functions to the next 32-byte boundary,
   but -falign-functions=24 would align to the next 32-byte boundary
   only if this can be done by skipping 23 bytes or less.

   -fno-align-functions and -falign-functions=1 are equivalent and
   mean that functions will not be aligned.

   Some assemblers only support this flag when n is a power of two; in
   that case, it is rounded up.

   If n is not specified, use a machine-dependent default.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Disable branch targets alignment</td></tr><tr class='l2'><td class='supcomp_opt'>-falign-labels=1</td><td class='supcomp_detail'>Align all branch targets to a power-of-two boundary, skipping up to
   n bytes like -falign-functions.  This option can easily make code
   slower, because it must insert dummy operations for when the branch
   target is reached in the usual flow of the code.

   If -falign-loops or -falign-jumps are applicable and are greater
   than this value, then their values are used instead.

   If n is not specified, use a machine-dependent default which is
   very likely to be 1, meaning no alignment.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Align branch targets on machine-default boundaries</td></tr><tr class='l1'><td class='supcomp_opt'>-falign-labels</td><td class='supcomp_detail'>Align all branch targets to a power-of-two boundary, skipping up to
   n bytes like -falign-functions.  This option can easily make code
   slower, because it must insert dummy operations for when the branch
   target is reached in the usual flow of the code.

   If -falign-loops or -falign-jumps are applicable and are greater
   than this value, then their values are used instead.

   If n is not specified, use a machine-dependent default which is
   very likely to be 1, meaning no alignment.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Disable loops alignment</td></tr><tr class='l2'><td class='supcomp_opt'>-falign-loops=1</td><td class='supcomp_detail'>Align loops to a power-of-two boundary, skipping up to n bytes like
   -falign-functions.  The hope is that the loop will be executed many
   times, which will make up for any execution of the dummy
   operations.

   If n is not specified, use a machine-dependent default.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Set loops alignment to machine-default</td></tr><tr class='l1'><td class='supcomp_opt'>-falign-loops</td><td class='supcomp_detail'>Align loops to a power-of-two boundary, skipping up to n bytes like
   -falign-functions.  The hope is that the loop will be executed many
   times, which will make up for any execution of the dummy
   operations.

   If n is not specified, use a machine-dependent default.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Disable jump target alignment</td></tr><tr class='l2'><td class='supcomp_opt'>-falign-jumps=1</td><td class='supcomp_detail'>Align branch targets to a power-of-two boundary, for branch targets
   where the targets can only be reached by jumping, skipping up to n
   bytes like -falign-functions.  In this case, no dummy operations
   need be executed.

   If n is not specified, use a machine-dependent default.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Align jump targets on machine-default boundaries</td></tr><tr class='l1'><td class='supcomp_opt'>-falign-jumps</td><td class='supcomp_detail'>Align branch targets to a power-of-two boundary, for branch targets
   where the targets can only be reached by jumping, skipping up to n
   bytes like -falign-functions.  In this case, no dummy operations
   need be executed.

   If n is not specified, use a machine-dependent default.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Forces all general-induction variables in loops to be strength-reduced.</td></tr><tr class='l2'><td class='supcomp_opt'>-freduce-all-givs</td><td class='supcomp_detail'>Forces all general-induction variables in loops to be
   strength-reduced.

   Note: When compiling programs written in Fortran, -fmove-all-movables
   and -freduce-all-givs are enabled by default when you use the
   optimizer.

   These options may generate better or worse code; results are highly
   dependent on the structure of loops within the source code.

   These two options are intended to be removed someday, once they
   have helped determine the efficacy of various approaches to
   improving loop optimizations.

   Please let us (gc...@gc... and fo...@gn...) know how
   use of these options affects the performance of your production
   code.  We're very interested in code that runs slower when these
   options are enabled.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Assume the strictest aliasing rules to be applicable</td></tr><tr class='l1'><td class='supcomp_opt'>-fstrict-aliasing</td><td class='supcomp_detail'>Allows the compiler to assume the strictest aliasing rules
   applicable to the language being compiled.  For C (and C++), this
   activates optimizations based on the type of expressions.  In
   particular, an object of one type is assumed never to reside at the same
   address as an object of a different type, unless the types are
   almost the same.  For example, an "unsigned int" can alias an
   "int", but not a "void*" or a "double".  A character type may alias
   any other type.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Global common subexpression elimination pass</td></tr><tr class='l2'><td class='supcomp_opt'>-fgcse</td><td class='supcomp_detail'>Perform a global common subexpression elimination pass.  This pass
    also performs global constant and copy propagation.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Attempt to avoid false dependencies in scheduled code</td></tr><tr class='l1'><td class='supcomp_opt'>-frename-registers</td><td class='supcomp_detail'>Attempt to avoid false dependencies in scheduled code by making use
   of registers left over after register allocation.  This optimization
   will most benefit processors with lots of registers.  It can,
   however, make debugging impossible, since variables will no longer
   stay in a "home register".</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Instrument arcs during compilation</td></tr><tr class='l2'><td class='supcomp_opt'>-fprofile-arcs</td><td class='supcomp_detail'>This is the GCC 3.1.x help text (no help text is given for 3.0.x):

   Instrument arcs during compilation to generate coverage data or for
   profile-directed block ordering.  During execution the program
   records how many times each branch is executed and how many times
   it is taken.  When the compiled program exits it saves this data to
   a file called sourcename.da for each source file.

   For profile-directed block ordering, compile the program with
   -fprofile-arcs plus optimization and code generation options, generate
   the arc profile information by running the program on a
   selected workload, and then compile the program again with the same
   optimization and code generation options plus -fbranch-probabilities.

   The other use of -fprofile-arcs is for use with "gcov", when it is
   used with the -ftest-coverage option.  GCC supports two methods of
   determining code coverage: the options that support "gcov", and
   options -a and -ax, which write information to text files.  The
   options that support "gcov" do not need to instrument every arc in
   the program, so a program compiled with them runs faster than a
   program compiled with -a, which adds instrumentation code to every
   basic block in the program.  The tradeoff: since "gcov" does not
   have execution counts for all branches, it must start with the execution
   counts for the instrumented branches, and then iterate over
   the program flow graph until the entire graph has been solved.
   Hence, "gcov" runs a little more slowly than a program which uses
   information from -a and -ax.

   With -fprofile-arcs, for each function of your program GCC creates
   a program flow graph, then finds a spanning tree for the graph.
   Only arcs that are not on the spanning tree have to be instrumented:
   the compiler adds code to count the number of times that
   these arcs are executed.  When an arc is the only exit or only
   entrance to a block, the instrumentation code can be added to the
   block; otherwise, a new basic block must be created to hold the
   instrumentation code.

   This option makes it possible to estimate branch probabilities and
   to calculate basic block execution counts.  In general, basic block
   execution counts as provided by -a do not give enough information
   to estimate all branch probabilities.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Optimizations based on path guessing</td></tr><tr class='l1'><td class='supcomp_opt'>-fbranch-probabilities</td><td class='supcomp_detail'>After running a program compiled with -fprofile-arcs, you can compile it a
   second time using -fbranch-probabilities, to improve
   optimizations based on guessing the path a branch might take.</td></tr>
</table>
--- NEW FILE: compbenchmarks-maint-compilers-zoom-gcc-3.1.x.html ---
<p><a href='/cgi-bin/doc.cgi?tab=compilers'>Back to compiler list</a>.</p>
<h2>gcc 3.1.x</h2>
<table summary='gcc-3.1.x options' class='supcomp'>
<tr class='l1'><td colspan='2' class='supcomp_head'>Globally disable compiler optimization</td></tr><tr class='l1'><td class='supcomp_opt'>-O0</td><td class='supcomp_detail'>Do not optimize. This is the default.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Globally optimize for size</td></tr><tr class='l2'><td class='supcomp_opt'>-Os</td><td class='supcomp_detail'>Optimize for size.  -Os enables all -O2 optimizations that do not typically
   increase code size.  It also performs further optimizations designed to
   reduce code size.

   -Os disables the following optimization flags: -falign-functions
   -falign-jumps  -falign-loops -falign-labels  -freorder-blocks
   -fprefetch-loop-arrays

   If you use multiple -O options, with or without level numbers, the last
   such option is the one that is effective.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Global optimization, level 1</td></tr><tr class='l1'><td class='supcomp_opt'>-O1</td><td class='supcomp_detail'>Optimize.  Optimizing compilation takes somewhat more time, and
   a lot more memory for a large function.

   Without `-O', the compiler's goal is to reduce the cost of
   compilation and to make debugging produce the expected results.
   Statements are independent: if you stop the program with a
   breakpoint between statements, you can then assign a new value
   to any variable or change the program counter to any other
   statement in the function and get exactly the results you would
   expect from the source code.

   Without `-O', only variables declared register are allocated in
   registers.  The resulting compiled code is a little worse than
   produced by PCC without `-O'.

   With `-O', the compiler tries to reduce code size and execution
   time.

   When you specify `-O', the two options `-fthread-jumps' and
   `-fdefer-pop' are turned on.  On machines that have delay slots,
   the `-fdelayed-branch' option is turned on.  For those machines
   that can support debugging even without a frame pointer, the
   `-fomit-frame-pointer' option is turned on.  On some machines
   other flags may also be turned on.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Global optimization, level 2</td></tr><tr class='l2'><td class='supcomp_opt'>-O2</td><td class='supcomp_detail'>Optimize even more than -O1. Nearly all supported optimizations that
   do not involve a space-speed tradeoff are performed.  Loop unrolling
   and function inlining are not done, for example. As compared to -O, this
   option increases both compilation time and the performance of the
   generated code.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Global optimization, level 3</td></tr><tr class='l1'><td class='supcomp_opt'>-O3</td><td class='supcomp_detail'>Optimize yet more.  This turns on everything -O2 does and also
   turns on -finline-functions.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Always scan through jump instructions in common subexpression elimination</td></tr><tr class='l2'><td class='supcomp_opt'>-fcse-follow-jumps</td><td class='supcomp_detail'>In common subexpression elimination, scan through jump instructions
   when the target of the jump is not reached by any other
   path.  For example, when CSE encounters an if statement with an else
   clause, CSE will follow the jump when the condition tested
   is false.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Always scan through conditional jump instructions in common subexpression elimination</td></tr><tr class='l1'><td class='supcomp_opt'>-fcse-skip-blocks</td><td class='supcomp_detail'>This is similar to `-fcse-follow-jumps', but causes CSE to follow
   jumps which conditionally skip over blocks.  When CSE encounters
   a simple if statement with no else clause,
   `-fcse-skip-blocks' causes CSE to follow the jump around the
   body of the if.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Re-run CSE after loop optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-frerun-cse-after-loop</td><td class='supcomp_detail'>Re-run common subexpression elimination after loop optimizations
   have been performed.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Integrate all simple functions into their callers</td></tr><tr class='l1'><td class='supcomp_opt'>-finline-functions</td><td class='supcomp_detail'>Integrate all simple functions into their callers.  The compiler
   heuristically decides which functions are simple enough to be
   worth integrating in this way.

   If all calls to a given function are integrated, and the function
   is declared static, then GCC normally does not output the
   function as assembler code in its own right.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Loop optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-fstrength-reduce</td><td class='supcomp_detail'>Perform the optimizations of loop strength reduction and
   elimination of iteration variables.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Jump shortcuts' detection</td></tr><tr class='l1'><td class='supcomp_opt'>-fthread-jumps</td><td class='supcomp_detail'>Perform optimizations where we check to see if a jump branches
   to a location where another comparison subsumed by the first is
   found.  If so, the first branch is redirected to either the
   destination of the second branch or a point immediately following
   it, depending on whether the condition is known to be true or
   false.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Global loop unrolling optimization</td></tr><tr class='l2'><td class='supcomp_opt'>-funroll-all-loops</td><td class='supcomp_detail'>Perform the optimization of loop unrolling.  This is done for
   all loops.  This usually makes programs run more slowly.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Loop unrolling optimizations</td></tr><tr class='l1'><td class='supcomp_opt'>-funroll-loops</td><td class='supcomp_detail'>Perform the optimization of loop unrolling.  This is only done
   for loops whose number of iterations can be determined at
   compile time or run time.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Try allocating registers that will be clobbered by function calls</td></tr><tr class='l2'><td class='supcomp_opt'>-fcaller-saves</td><td class='supcomp_detail'>Enable values to be allocated in registers that will be
   clobbered by function calls, by emitting extra instructions to save
   and restore the registers around such calls.  Such allocation is
   done only when it seems to result in better code than would otherwise
   be produced.

   This option is enabled by default on certain machines, usually
   those which have no call-preserved registers to use instead.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Attempt to reorder instructions to increase slot utilisation</td></tr><tr class='l1'><td class='supcomp_opt'>-fdelayed-branch</td><td class='supcomp_detail'>If supported for the target machine, attempt to reorder instructions
   to exploit instruction slots available after delayed
   branch instructions.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Perform a number of minor optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-fexpensive-optimizations</td><td class='supcomp_detail'>Perform a number of minor optimizations that are relatively expensive.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Violate ANSI or IEEE rules/specifications for optimizing code</td></tr><tr class='l1'><td class='supcomp_opt'>-ffast-math</td><td class='supcomp_detail'>This option allows GCC to violate some ANSI or IEEE rules/specifications
   in the interest of optimizing code for speed.  For example, it allows
   the compiler to assume arguments to the sqrt
   function are non-negative numbers.

   This option should never be turned on by any `-O' option since
   it can result in incorrect output for programs which depend on
   an exact implementation of IEEE or ANSI rules/specifications for
   math functions.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Do not store floating point variables in registers</td></tr><tr class='l2'><td class='supcomp_opt'>-ffloat-store</td><td class='supcomp_detail'>Do not store floating point variables in registers.  This prevents
    undesirable excess precision on machines such as the 68000
    where the floating registers (of the 68881) keep more precision
    than a double is supposed to have.

    For most programs, the excess precision does only good, but a
    few programs rely on the precise definition of IEEE floating
    point.  Use `-ffloat-store' for such programs.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Force memory address constants to be copied in registers</td></tr><tr class='l1'><td class='supcomp_opt'>-fforce-addr</td><td class='supcomp_detail'>Force memory address constants to be copied into registers before
   doing arithmetic on them.  This may produce better code
   just as `-fforce-mem' may.  I am interested in hearing about the
   difference this makes.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Force memory operands to be copied into registers</td></tr><tr class='l2'><td class='supcomp_opt'>-fforce-mem</td><td class='supcomp_detail'>Force memory operands to be copied into registers before doing
   arithmetic on them.  This may produce better code by making all
   memory references potential common subexpressions.  When they
   are not common subexpressions, instruction combination should
   eliminate the separate register-load.  I am interested in hearing
   about the difference this makes.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Remove frame pointer from register when useless</td></tr><tr class='l1'><td class='supcomp_opt'>-fomit-frame-pointer</td><td class='supcomp_detail'>Don't keep the frame pointer in a register for functions that
   don't need one.  This avoids the instructions to save, set up
   and restore frame pointers; it also makes an extra register
   available in many functions.  It also makes debugging impossible
   on most machines.

   On some machines, such as the Vax, this flag has no effect, because
   the standard calling sequence automatically handles the
   frame pointer and nothing is saved by pretending it doesn't exist.
   The machine-description macro FRAME_POINTER_REQUIRED controls
   whether a target machine supports this flag.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Attempt to reorder instructions to eliminate stalls</td></tr><tr class='l2'><td class='supcomp_opt'>-fschedule-insns</td><td class='supcomp_detail'>If supported for the target machine, attempt to reorder instructions
   to eliminate execution stalls due to required data being
   unavailable.  This helps machines that have slow floating point
   or memory load instructions by allowing other instructions to be
   issued until the result of the load or floating point instruction
   is required.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Attempt to reorder instructions to eliminate stalls (also watch for registers)</td></tr><tr class='l1'><td class='supcomp_opt'>-fschedule-insns2</td><td class='supcomp_detail'>Similar to `-fschedule-insns', but requests an additional pass
   of instruction scheduling after register allocation has been
   done.  This is especially useful on machines with a relatively
   small number of registers and where memory load instructions
   take more than one cycle.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>486 Optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-mcpu=i486</td><td class='supcomp_detail'>Assume the defaults for the machine type cpu-type when scheduling
   instructions.  The choices for cpu-type are i386, i486, i586, i686,
   pentium, pentiumpro, k6, and athlon.

   While picking a specific cpu-type will schedule things appropriately
   for that particular chip, the compiler will not generate any
   code that does not run on the i386 without the -march=cpu-type
   option being used.  i586 is equivalent to pentium and i686 is
   equivalent to pentiumpro.  k6 is the AMD chip as opposed to the
   Intel ones.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Pentium optimizations</td></tr><tr class='l1'><td class='supcomp_opt'>-mcpu=pentium</td><td class='supcomp_detail'>Assume the defaults for the machine type cpu-type when scheduling
   instructions.  The choices for cpu-type are i386, i486, i586, i686,
   pentium, pentiumpro, k6, and athlon.

   While picking a specific cpu-type will schedule things appropriately
   for that particular chip, the compiler will not generate any
   code that does not run on the i386 without the -march=cpu-type
   option being used.  i586 is equivalent to pentium and i686 is
   equivalent to pentiumpro.  k6 is the AMD chip as opposed to the
   Intel ones.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Pentium Pro optimizations</td></tr><tr class='l2'><td class='supcomp_opt'>-mcpu=pentiumpro</td><td class='supcomp_detail'>Assume the defaults for the machine type cpu-type when scheduling
   instructions.  The choices for cpu-type are i386, i486, i586, i686,
   pentium, pentiumpro, k6, and athlon.

   While picking a specific cpu-type will schedule things appropriately
   for that particular chip, the compiler will not generate any
   code that does not run on the i386 without the -march=cpu-type
   option being used.  i586 is equivalent to pentium and i686 is
   equivalent to pentiumpro.  k6 is the AMD chip as opposed to the
   Intel ones.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>AMD K6 optimizations</td></tr><tr class='l1'><td class='supcomp_opt'>-mcpu=k6</td><td class='supcomp_detail'>Assume the defaults for the machine type cpu-type when scheduling
   instructions.  The choices for cpu-type are i386, i486, i586, i686,
   pentium, pentiumpro, k6, and athlon.

   While picking a specific cpu-type will schedule things appropriately
   for that particular chip, the compiler will not generate any
   code that does not run on the i386 without the -march=cpu-type
   option being used.  i586 is equivalent to pentium and i686 is
   equivalent to pentiumpro.  k6 is the AMD chip as opposed to the
   Intel ones.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Perform optimizations in static single assignment form</td></tr><tr class='l2'><td class='supcomp_opt'>-fssa</td><td class='supcomp_detail'>Perform optimizations in static single assignment form.  Each function's
   flow graph is translated into SSA form, optimizations are
   performed, and the flow graph is translated back from SSA form.
   Users should not specify this option, since it is not yet ready for
   production use.
   According to the -fdce description, this is probably an experimental feature.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Dead-code elimination in SSA form</td></tr><tr class='l1'><td class='supcomp_opt'>-fdce</td><td class='supcomp_detail'>Perform dead-code elimination in SSA form.  Requires -fssa.  Like
	-fssa, this is an experimental feature.</td></tr>
<tr class='l2'><td colspan='2' class='supcomp_head'>Disable start of functions offset alignment</td></tr><tr class='l2'><td class='supcomp_opt'>-falign-functions=1</td><td class='supcomp_detail'>Align the start of functions to the next power-of-two greater than
   n, skipping up to n bytes.  For instance, -falign-functions=32
   aligns functions to the next 32-byte boundary,
   but -falign-functions=24 would align to the next 32-byte boundary
   only if this can be done by skipping 23 bytes or less.

   -fno-align-functions and -falign-functions=1 are equivalent and
   mean that functions will not be aligned.

   Some assemblers only support this flag when n is a power of two; in
   that case, it is rounded up.

   If n is not specified, use a machine-dependent default.</td></tr>
<tr class='l1'><td colspan='2' class='supcomp_head'>Force start of functions alignment to machine default</td></tr><tr class='l1'><td class='supcomp_opt'>-falign-functions</td><td class='supcomp_detail'>Align the start of functions to the next power-of-two greater than
   n, skipping up to n bytes.  For instance, -falign-functions=32
   aligns functions to the next 32-byte boundary,
   but -falign-functions=24 would align to the next 32-byte boundary
   only if this ca...

[truncated message content]