On Sun, 23 May 2010 11:06:24 +0200 Benjamin Zores <ben@...> said:
> On Sun, May 23, 2010 at 6:22 AM, Carsten Haitzler <raster@...>
> > On Sat, 22 May 2010 23:46:56 +0200 Benjamin Zores <ben@...> said:
> > any reason you use just -O not -O2?
> > fyi - no attached file - sf.net filtered it out. :) though generally this
> > kind of a problem is a result of compiler issues - eg no neon support in
> > gcc (or poor/older support) etc. generally anyway.
> It used to be -O4 actually. Then I just switched to -O to ensure it
> was not triggered by any compiler optimization bug.
> Apparently this wasn't the case.
aaah gotcha. tho in my experience -O2 is about as good as it gets without using
-mtune/cpu etc to generate instructions for a specific architecture level or
tune for it. i have a suggestion: don't go above -O2
unless you can really justify it - that means benchmarks show real solid
speedups (consistently more than 5%). in the past at least -O3 and above have
also been wonderful sources of compiler bugs, that produce incorrect code - and
you may end up suffering from bugs that don't actually exist in the code - but
lie in the compiler, so... just beware of high -O levels. test to be sure it
actually is worth it. do real benchmarking. remember it's a risk tradeoff - you
gain N% more speed for a higher chance of bugs (that can't actually be fixed in
the src, only in the compiler, which makes it harder). that's why i say more than
5% - it is, of course, a fuzzy number, but it means "you need a real
significant and noticeable speedup".
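
as a rough sketch of that rule (this is not part of evas or expedite; the
function names are hypothetical, the 5% default is just the fuzzy number from
above, and it assumes a higher benchmark score means faster), a check like
this only says yes when even the worst run at the higher -O level beats the
best -O2 run by more than the threshold:

```python
from statistics import median


def speedup(baseline, candidate):
    """Relative gain of candidate over baseline, based on median scores.

    Assumes higher scores mean faster code (an assumption of this sketch).
    """
    return median(candidate) / median(baseline) - 1.0


def worth_raising_opt_level(baseline, candidate, threshold=0.05):
    """Return True only if the gain is consistent across runs.

    "Consistent" is interpreted conservatively here: the worst candidate
    run must beat the best baseline run by more than `threshold`.
    """
    return min(candidate) > max(baseline) * (1.0 + threshold)


if __name__ == "__main__":
    # Made-up scores purely for illustration, not real measurements.
    o2_runs = [100.0, 101.0, 99.0]
    o3_runs = [103.0, 104.0, 102.0]
    print("median speedup:", speedup(o2_runs, o3_runs))
    print("worth it:", worth_raising_opt_level(o2_runs, o3_runs))
```

comparing worst against best is deliberately strict: a gain that only shows
up in some runs is treated as noise rather than a reason to take on the extra
compiler risk.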
for evas at any rate, use expedite and do benchmarks. better results come from
higher -c counts (loop count - the more loops it does, the less you'll be hit
by entropy. default is 128, so use -c 256 if you are patient and want very
accurate results). as an example - results for x86 (32bit) evas speed
(all of them used -march=nocona as well in CFLAGS - on a core2 duo laptop).
notice performance actually peaks at -O2 :) (also -O0 is horrible - but just to
note - i write a lot of my code to assume a compiler will have a decent
optimiser, so it will pick up the pieces with what may seem "stupid code", but
that "stupid code" is meant to be more readable/maintainable.)
(need i get into -funroll-loops ?) :)
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler) raster@...