Re: [mpg123-devel] fixed point decoders
From: Taihei M. <tm...@ma...> - 2009-06-01 13:26:17
On 2009/06/01, at 21:37, Thomas Orgis wrote:

> Ah, now I can make sense out of this statement! I didn't look at a
> non-sse build back then.
> Indeed, the situation on the Athlon XP is funny.
> Using -msse -mfpmath=sse in CFLAGS makes the truncation code
> faster... but it's again faster when dropping the -mfpmath=sse (I
> did use blank CFLAGS before, so basically i386 code).
> Hm, so far the fastest build on the AthlonXP is achieved by using
> -msse, but _not_ -mfpmath=sse, together with --enable-int-quality.
> Fastest & good rounding... well, that's the generic code... the SSE
> code for non-quality rounding is significantly faster than anything
> else.

When you use -msse without -mfpmath=sse, the basic maths are done on
x87 and only the fp-to-int truncations are done on SSE. And on Athlon,
in most cases the x87 code is faster than the SSE code. So yeah, your
findings are more or less what I would expect.

> Yes... the core2 should run on SSE. What really buggers me a bit is
> that the generic code is about 2.5 times faster on the 1.466 GHz
> AthlonXP compared to my mobile 1.2 GHz Core2duo. How can the latter
> be that lame?
> I don't see the specs favouring the Athlon that much. Even the one
> remaining memory channel (instead of two) of the Core2 should still
> be faster than what the Athlon has.
> Well, with your x86-64 SSE code, the Core2 catches up a bit... the
> AthlonXP's SSE being just about 2.1 times faster. It seems like
> Intel did some interesting cuts with this Core2 to make it power
> efficient -- makes one wonder how well a die-shrinked Athlon XP
> would fare against it:-/

Hmm... Generally a 1.2 GHz Core2 should run faster than a 1.4 GHz
Athlon. Is your Core2 perhaps running in down-clocked mode? I once had
the experience that a Pentium M did not reach full speed when running
a very short benchmark.

Thanks,
Taihei Monma
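
P.S. For readers following the truncation vs. rounding discussion, here is a minimal sketch in C of the two float-to-16-bit conversions being compared. This is not the mpg123 code; the function names `sample_truncate` and `sample_round` are made up for illustration, and the rounding variant is just one simple round-to-nearest scheme, assumed here for the example. The cast in the first function is the kind of fp-to-int truncation that the compiler can hand to SSE when built with -msse, while the rest of the arithmetic stays on x87 unless -mfpmath=sse is given.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a float sample to the signed 16-bit range
 * before conversion, so the cast below never sees an out-of-range value. */
static float clamp16(float s)
{
    if (s > 32767.0f)  return 32767.0f;
    if (s < -32768.0f) return -32768.0f;
    return s;
}

/* Truncating conversion: a C cast from float to integer always
 * rounds toward zero. */
int16_t sample_truncate(float s)
{
    return (int16_t)clamp16(s);
}

/* Rounding conversion (one possible scheme): shift by half a step away
 * from zero, clamp, then truncate, which yields round-to-nearest. */
int16_t sample_round(float s)
{
    float shifted = (s >= 0.0f) ? s + 0.5f : s - 0.5f;
    return (int16_t)clamp16(shifted);
}
```

To see which unit the compiler actually picks for the cast, comparing the assembly from `gcc -O2 -S` with and without -msse on a 32-bit target is probably the quickest check.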